# CURRENT ADVANCES IN THE RESEARCH OF RNA REGULATORY ENZYMES

EDITED BY: Akio Kanai and Tohru Yoshihisa

PUBLISHED IN: Frontiers in Genetics

#### Frontiers eBook Copyright Statement

The copyright in the text of individual articles in this eBook is the property of their respective authors or their respective institutions or funders. The copyright in graphics and images within each article may be subject to copyright of other parties. In both cases this is subject to a license granted to Frontiers. The compilation of articles constituting this eBook is the property of Frontiers.

Each article within this eBook, and the eBook itself, are published under the most recent version of the Creative Commons CC-BY licence. The version current at the date of publication of this eBook is CC-BY 4.0. If the CC-BY licence is updated, the licence granted by Frontiers is automatically updated to the new version.

When exercising any right under the CC-BY licence, Frontiers must be attributed as the original publisher of the article or eBook, as applicable.

Authors have the responsibility of ensuring that any graphics or other materials which are the property of others may be included in the CC-BY licence, but this should be checked before relying on the CC-BY licence to reproduce those materials. Any copyright notices relating to those materials must be complied with.

Copyright and source acknowledgement notices may not be removed and must be displayed in any copy, derivative work or partial copy which includes the elements in question.

All copyright, and all rights therein, are protected by national and international copyright laws. The above represents a summary only. For further information please read Frontiers' Conditions for Website Use and Copyright Statement, and the applicable CC-BY licence.

ISSN 1664-8714 ISBN 978-2-88963-215-2 DOI 10.3389/978-2-88963-215-2

Topic Editors: Akio Kanai, Keio University, Japan; Tohru Yoshihisa, University of Hyogo, Japan

Sequence- and shape-dependent recognition by a specific RNA-binding protein. The image was created by Mr. Masahiro C. Miura (Keio Univ.), Prof. Tohru Yoshihisa (Univ. of Hyogo), and Prof. Akio Kanai (Keio Univ.).

Citation: Kanai, A., Yoshihisa, T., eds. (2019). Current Advances in the Research of RNA Regulatory Enzymes. Lausanne: Frontiers Media SA. doi: 10.3389/978-2-88963-215-2

# Table of Contents

*Editorial: Current Advances in the Research of RNA Regulatory Enzymes*

Akio Kanai and Tohru Yoshihisa

### I. ENZYMATIC CATALYSIS REACTIONS IN WHICH RNAs ARE THE SUBSTRATES

*Human BCDIN3D Is a Cytoplasmic tRNAHis-Specific 5′-Monophosphate Methyltransferase*

Kozo Tomita and Yining Liu

Hiroyuki Hori

Akira Hirata

*tRNA Processing and Subcellular Trafficking Proteins Multitask in Pathways for Other RNAs*

Anita K. Hopper and Regina T. Nostramo

### II. RNA-BINDING PROTEINS THAT CONTROL THE FATES OF THEIR TARGET RNAs

*RNA Polymerase II-Dependent Transcription Initiated by Selectivity Factor 1: A Central Mechanism Used by MLL Fusion Proteins in Leukemic Transformation*

Akihiko Yokoyama

*Emerging Evidence of Translational Control by AU-Rich Element-Binding Proteins*

Hiroshi Otsuka, Akira Fukao, Yoshinori Funakami, Kent E. Duncan and Toshinobu Fujiwara

*Recent Progress on the Molecular Mechanism of Quality Controls Induced by Ribosome Stalling*

Ken Ikeuchi, Toshiaki Izawa and Toshifumi Inada

# Editorial: Current Advances in the Research of RNA Regulatory Enzymes

*Akio Kanai<sup>1,2</sup>\* and Tohru Yoshihisa<sup>3</sup>\**

*<sup>1</sup> Institute for Advanced Biosciences, Keio University, Tsuruoka, Japan, <sup>2</sup> Faculty of Environment and Information Studies, Keio University, Fujisawa, Japan, <sup>3</sup> Graduate School of Life Science, University of Hyogo, Ako-gun, Japan*

Keywords: RNA regulatory enzyme, RNA-Binding Protein, RNA modification, transcription, translation

**Editorial on the Research Topic**

#### **Current Advances in the Research of RNA Regulatory Enzymes**

Almost 30 years ago, when the editors were graduate students in the life sciences in Tokyo, Japan, biochemistry carried more weight than it does today, and one had to understand topics such as protein purification and enzyme kinetics just to read a molecular biology paper. We remember a book that was very useful back then, entitled *The Biochemistry of the Nucleic Acids, 10th Edition* (Roger et al., 1986), which summarized a variety of enzymes related to the regulation of nucleic acids and their biological functions.

#### *Edited by:*

*William Cho, Queen Elizabeth Hospital (QEH), Hong Kong*

#### *Reviewed by:*

*Thomas Preiss, Australian National University, Australia*

#### *\*Correspondence:*

*Akio Kanai, akio@sfc.keio.ac.jp; Tohru Yoshihisa, tyoshihi@sci.u-hyogo.ac.jp*

#### *Specialty section:*

*This article was submitted to RNA, a section of the journal Frontiers in Genetics*

*Received: 09 August 2019 Accepted: 12 September 2019 Published: 09 October 2019*

#### *Citation:*

*Kanai A and Yoshihisa T (2019) Editorial: Current Advances in the Research of RNA Regulatory Enzymes. Front. Genet. 10:973. doi: 10.3389/fgene.2019.00973*

Because we have walked the paths of the life sciences with this experience of biochemistry, it is no exaggeration to say that we feel compelled to act as editors of the research topic 'Current Advances in the Research of RNA Regulatory Enzymes' in the RNA section of *Frontiers in Genetics*. Since our student days, enormous attention has been drawn to the numerous noncoding RNAs, most of which were first identified at the beginning of this century. However, as readers will already have noticed, only a few RNAs, such as antisense RNAs and ribozymes, exhibit regulatory functions by themselves *in vivo*, whereas enzymes and protein factors are required in most scenarios involving almost all other noncoding RNAs. For example, for microRNAs (miRNAs) to function, a protein partner complex called the 'RNA-induced silencing complex' (RISC), in which Argonaute is the core protein, is essential. Many enzymes and protein partners are also required to produce a mature and functional miRNA after the miRNA precursor is expressed in the nucleus.

Although many RNA researchers are already aware of the importance of enzymes and other proteins in RNA biology, the research efforts in this field are still heavily weighted toward the RNA side rather than toward the enzyme and protein side. Therefore, we set the editorial direction of our research topic toward the enzyme/protein side. Although we cannot list these enzymes as exhaustively as the biochemistry book mentioned above, we have collected 11 excellent review papers on this research topic. Many of them are from the field of tRNA research because we both work in this particular field. This is a pertinent direction because several important subjects, including RNA modification, RNA editing, intracellular RNA trafficking, and the regulation of RNA processing, can be exemplified in one type of RNA molecule, tRNA. Therefore, this e-book focuses on important concepts that hold true even if tRNA is replaced with another RNA molecule. The first half of the book considers enzymatic catalysis reactions in which RNAs are the substrates, and the second half considers RNA-binding proteins that control the fates of their target RNAs.

We begin with enzymes that modify the RNA termini. Tomita and Liu introduce a unique enzyme that forms a methyl cap on the 5′-monophosphate of tRNAHis, and Yashiro and Tomita review the structure–function relationships of the uridyltransferases that act on the 3′-termini of noncoding RNAs, such as pre-miRNAs. Interestingly, the TUT4/7 uridyltransferase alternates between a protective and a degradative role for the *let-7* miRNA precursor according to the absence or presence of its partner protein Lin28, respectively. Shigematsu et al. describe how the 2′-3′-cyclic-phosphate-forming RNases create a novel 'hidden' layer of the transcriptome. We then move on to enzymes that modify internal nucleotides, especially in tRNAs. Hori summarizes various tRNA-modification enzymes in thermophilic bacteria and their regulation. He explains the importance of the network of modified nucleotides to the stability of tRNA at high temperatures. Dixit et al. focus on the multimerization of the tRNA methyltransferase subunits to accommodate various tRNA substrates, and the interdependent actions of methyltransferase and deaminase. These modifications at or near the anticodon greatly affect decoding of a near-cognate codon corresponding to the same amino acid by a particular tRNA. Hou et al. explain the effects of m<sup>1</sup>G37, next to the anticodon, on codon-specific translation and its utilization in Mg<sup>2+</sup> homeostasis in bacteria. Some tRNAs are encoded by intron-containing genes, and their splicing is essential for their function. Hirata reviews the structural characteristics and evolutionary relationships of various types of archaeal tRNA-splicing endonucleases. Hopper and Nostramo describe the processing steps for eukaryotic small noncoding RNAs, especially tRNAs, ranging from terminal processing to tRNA-type splicing.

In the second half of the e-book, the proteins that interact with RNAs and their involvement in gene expression are examined. Important aspects here are the dynamics of intracellular movement and localization of eukaryotic RNAs in addition to their functional regulation. Hopper and Nostramo summarize how tRNAs move across the nuclear pore during their biogenesis in yeast, emphasizing the involvement of parallel pathways that mainly transport other classes of RNAs. We then address issues related to the transcription and translation of mRNAs, mainstream topics of RNA biology. First, Yokoyama reports that SL1, a subunit of the RNA polymerase I preinitiation complex for rRNA transcription, plays another role as a component of the super elongator complex required for transcription by RNA polymerase II, which is also related to the *MLL* oncogene. Beyond transcriptional control, mRNA translation itself provides another platform for regulation. Otsuka et al. describe recent progress in our understanding of translational regulation by RNA-binding proteins that recognize the AU-rich elements (AREs) occurring in many mRNAs. Some ARE-binding proteins regulate mRNA stability, whereas others control the initiation of translation. Organisms face various process malfunctions, and so they are equipped with sophisticated quality control systems to cope with such failures. These quality control mechanisms also function during translation. Ikeuchi et al. discuss the quality control mechanism induced by ribosomes stalled on mRNAs, and provide mechanistic insights into the ribosome-associated quality control that rescues stalled ribosomes on aberrant mRNA, preventing the production of unwanted proteins.

We discuss some RNA-related enzymes and RNA-interacting proteins in this e-book, retaining a focus on biochemical aspects. The proteins discussed above act on a limited but diverse set of target RNAs in their biogenesis, movement, function, and regulation, but we still do not have a full substrate list for each enzyme or protein. We hope that the issues discussed in this e-book have some bearing on the scientific interests of individual readers. We will be greatly pleased if this collection of review articles provides a useful guide to novel research directions.

### AUTHOR CONTRIBUTIONS

AK and TY wrote the manuscript.

### FUNDING

This work was supported in part by a JSPS KAKENHI Grant-in-Aid for Scientific Research (C) (grant number 17K07517) and by a JSPS KAKENHI Grant-in-Aid for Scientific Research on Innovative Areas "Hadean Bioscience" (grant number 26106003) (to A.K.), as well as by JSPS KAKENHI Grants-in-Aid for Scientific Research (C) (grant numbers 17KT0113 and 17K07289) and by a JSPS KAKENHI Grant-in-Aid for Scientific Research on Innovative Areas "Nascent Chain Biology" (grant number 17H05672) (to T.Y.).

### REFERENCE

Roger, L., Adams, P., Knowler, J. T., and Leader, D. P. (1986). *The Biochemistry of the Nucleic Acids*. 10th Edition. London, New York: Chapman and Hall.

**Conflict of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Kanai and Yoshihisa. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Human BCDIN3D Is a Cytoplasmic tRNAHis-Specific 5′-Monophosphate Methyltransferase

Kozo Tomita\* and Yining Liu

Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Japan

Bicoid interacting 3 domain containing RNA methyltransferase (BCDIN3D) is a member of the Bin3 methyltransferase family and is evolutionarily conserved from worm to human. BCDIN3D is overexpressed in breast cancer, and this overexpression is associated with poor prognosis. However, the biological functions and properties of BCDIN3D have been enigmatic. Recent studies have revealed that human BCDIN3D monomethylates the 5′-monophosphate of cytoplasmic tRNAHis in vivo and in vitro. BCDIN3D recognizes the unique and exceptional structural features of cytoplasmic tRNAHis and discriminates tRNAHis from other cytoplasmic tRNA species. Thus, BCDIN3D is a tRNAHis-specific 5′-monophosphate methyltransferase. Methylation of the 5′-phosphate group of tRNAHis does not significantly affect tRNAHis aminoacylation by histidyl-tRNA synthetase in vitro, nor the steady-state level or stability of tRNAHis in vivo. Hence, methylation of the 5′-phosphate group of tRNAHis by BCDIN3D, or tRNAHis itself, may be involved in certain unknown biological processes beyond protein synthesis. This review discusses recent reports on BCDIN3D and the possible association between 5′-phosphate monomethylation of tRNAHis and the tumorigenic phenotype of breast cancer.

Keywords: Bicoid interacting 3 domain containing RNA methyltransferase, methylation, tRNA, breast cancer, protein synthesis

### INTRODUCTION

Bicoid interacting 3 domain containing RNA methyltransferase (BCDIN3D) contains an S-(5′-adenosyl)-L-methionine (AdoMet) binding motif, and is homologous to a conserved family of eukaryotic protein methyltransferases acting on RNA-binding proteins (Zhu and Hanes, 2000). BCDIN3D is evolutionarily conserved and has been identified in various animals from worms to humans (Xhemalce et al., 2012); however, its biological properties and functions are unclear. BCDIN3D mRNA overexpression has been reported in human breast cancer cells and is associated with cellular invasion and poor prognosis in triple-negative breast cancer (Liu et al., 2007; Yao et al., 2016). The molecular basis of the involvement of BCDIN3D in the tumorigenic phenotype of breast cancer has remained elusive. This review discusses recent studies on human BCDIN3D. We describe how a specific tRNA for histidine (tRNAHis) has now been identified as a primary target of BCDIN3D, and we discuss the association between the tumorigenic phenotype of breast cancer and the methylation of tRNAHis by BCDIN3D.

#### Edited by:

Akio Kanai, Keio University, Japan

#### Reviewed by:

Naoki Shigi, National Institute of Advanced Industrial Science and Technology (AIST), Japan Yohei Kirino, Thomas Jefferson University, United States

#### \*Correspondence:

Kozo Tomita kozo\_tomita@cbms.k.u-tokyo.ac.jp; kozo-tomita@edu.k.u-tokyo.ac.jp

#### Specialty section:

This article was submitted to RNA, a section of the journal Frontiers in Genetics

Received: 21 June 2018 Accepted: 18 July 2018 Published: 03 August 2018

#### Citation:

Tomita K and Liu Y (2018) Human BCDIN3D Is a Cytoplasmic tRNAHis-Specific 5′-Monophosphate Methyltransferase. Front. Genet. 9:305. doi: 10.3389/fgene.2018.00305


### HOW DOES HUMAN BCDIN3D RECOGNIZE SPECIFIC RNA?

Xhemalce et al. (2012) reported that BCDIN3D catalyzes dimethylation of the 5′-monophosphate of specific precursor microRNAs (pre-miRNAs), such as those of the tumor suppressors miR145 and miR23b (He et al., 2007; Shi et al., 2007; Sachdeva et al., 2009; Spizzo et al., 2010), using AdoMet as a methyl-group donor. Dimethylation of the 5′-monophosphate of a pre-miRNA nullifies the negative charge at its 5′-terminus. Since Dicer recognizes the negative charge at the 5′-terminus of pre-miRNAs for efficient and accurate cleavage (Park et al., 2011), dimethylation of the 5′-phosphate of a pre-miRNA inhibits subsequent processing. Consequently, mature miRNAs are downregulated. They also reported that the depletion of BCDIN3D mRNA by specific shRNAs suppressed the tumorigenic phenotype of MDA-MB-231 breast cancer cells (Xhemalce et al., 2012). Therefore, it was proposed that BCDIN3D promotes the cellular invasion of breast cancer cells by downregulating tumor suppressor miRNAs through dimethylation of the 5′-phosphate group of the corresponding pre-miRNAs. However, the pre-miRNAs corresponding to the miRNAs downregulated in breast cancer cells share no apparent common features, either in primary or in secondary structure. Thus, the mechanisms by which BCDIN3D recognizes only a specific group of pre-miRNAs and downregulates mature miRNAs in breast cancer cells are unclear.
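The charge argument in this section can be made concrete with a toy calculation. The following Python sketch is not taken from the cited studies; it assumes that every non-esterified oxygen of the terminal phosphate is fully deprotonated (a simplification of the real ionization behavior), and the function name is hypothetical.

```python
# Toy model (assumption: all non-esterified P-O groups are fully deprotonated).
# A 5'-monophosphate is already esterified once, to the 5'-oxygen of the RNA,
# so at most two further oxygens are available for methylation.

def five_prime_phosphate_charge(n_methyl: int) -> int:
    """Formal charge of a 5'-terminal phosphate carrying n_methyl methyl esters."""
    if n_methyl not in (0, 1, 2):
        raise ValueError("only 0, 1 or 2 methyl groups fit on a 5'-monophosphate")
    free_oxygens = 2 - n_methyl  # each remaining free oxygen contributes one negative charge
    return -free_oxygens


for n, label in enumerate(["unmethylated", "monomethylated", "dimethylated"]):
    print(f"{label:>15}: formal charge {five_prime_phosphate_charge(n)}")
```

Under this simplification, monomethylation leaves one negative charge at the 5′-terminus, whereas only dimethylation removes the charge entirely, which is why the distinction between mono- and dimethylation matters for the proposed effect on Dicer.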

### CYTOPLASMIC tRNAHis IS CO-PURIFIED WITH BCDIN3D AND CONTAINS A 5′-MONOMETHYLMONOPHOSPHATE GROUP

To identify other potential RNA substrates of BCDIN3D in vivo and to elucidate the mechanism underlying the recognition and regulation of specific RNAs by BCDIN3D, BCDIN3D-binding RNAs in human HEK293T cells were recently analyzed (Martinez et al., 2017). When BCDIN3D, expressed in HEK293T cells, was purified from the cell extracts, a distinct 70–80-nucleotide-long RNA molecule was co-purified with the BCDIN3D protein.

It was assumed that this co-purified RNA might be cytoplasmic tRNAHis (**Figure 1A**), since cytoplasmic tRNAHis from human and fruit fly reportedly contains a 5′-monomethylphosphate group (Cooley et al., 1982; Rosa et al., 1983). Analysis of the RNA co-purified with BCDIN3D via RT-PCR and sequencing confirmed that cytoplasmic tRNAHis, but not other tRNAs such as tRNAPhe, is co-purified with BCDIN3D from the cell extracts. Subsequent direct analysis of the RNA via liquid chromatography and mass spectrometry (LC-MS) revealed that this RNA is cytoplasmic tRNAHis. Moreover, the 5′-monophosphate of cytoplasmic tRNAHis was fully monomethylated, but not dimethylated at all. Furthermore, the 5′-monophosphate of tRNAHis is reportedly fully monomethylated even under normal physiological conditions in HEK293T cells, as observed previously in cytoplasmic tRNAHis from HeLa cells (Rosa et al., 1983).

### CYTOPLASMIC tRNAHis IS METHYLATED BY BCDIN3D IN VITRO

The enzymatic activity of recombinant human BCDIN3D expressed in E. coli was examined in vitro using a human cytoplasmic tRNAHis transcript as a substrate and S-(5′-adenosyl)-L-methionine (SAM) as a methyl-group donor (Martinez et al., 2017). The cytoplasmic tRNAHis transcript was efficiently methylated by BCDIN3D in vitro; however, unexpectedly, human pre-miR-145, which was previously reported to be dimethylated by BCDIN3D (Xhemalce et al., 2012), was hardly methylated under the same conditions. The reaction products were further analyzed via LC-MS, which confirmed that BCDIN3D monomethylates the 5′-monophosphate of cytoplasmic tRNAHis. Almost 100% of the tRNAHis reaction product comprised 5′-monomethylphosphate. Moreover, BCDIN3D does not dimethylate the 5′-monophosphate of tRNAHis or pre-miR145 in vitro. Steady-state kinetics of methylation of these RNA substrates revealed that cytoplasmic tRNAHis is a better substrate than pre-miR145 by more than 2–3 orders of magnitude.
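As an illustration of how such a substrate comparison is typically expressed, the sketch below compares two substrates by their specificity constants (kcat/Km) and reports the difference in orders of magnitude. The kinetic values are hypothetical placeholders chosen only for illustration, not the parameters measured by Martinez et al. (2017), and the function names are our own.

```python
import math


def catalytic_efficiency(kcat_per_min: float, km_um: float) -> float:
    """Specificity constant kcat/Km, in min^-1 uM^-1."""
    if kcat_per_min < 0 or km_um <= 0:
        raise ValueError("kcat must be non-negative and Km must be positive")
    return kcat_per_min / km_um


def orders_of_magnitude(better: float, worse: float) -> float:
    """log10 of the ratio between two specificity constants."""
    return math.log10(better / worse)


# HYPOTHETICAL values, for illustration only (not measured data).
trna_his = catalytic_efficiency(kcat_per_min=1.0, km_um=0.5)
pre_mir145 = catalytic_efficiency(kcat_per_min=0.01, km_um=5.0)

difference = orders_of_magnitude(trna_his, pre_mir145)
print(f"tRNAHis is preferred by about {difference:.1f} orders of magnitude")
```

Comparing kcat/Km rather than kcat or Km alone reflects how an enzyme discriminates between substrates that compete for it at low concentrations.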

BCDIN3D reportedly dimethylates pre-miR145 (Xhemalce et al., 2012). However, only a small fraction (less than 1%) of the pre-miR145 substrate was methylated at the reaction end points (Xhemalce et al., 2012). This low level of pre-miR145 methylation by BCDIN3D is consistent with the recent study (Martinez et al., 2017). Furthermore, BCDIN3D reportedly transfers two methyl groups from SAM to the 5′-monophosphate of pre-miR145 (Xhemalce et al., 2012). This is inconsistent with the recent findings of Martinez et al. (2017), wherein neither tRNAHis nor pre-miR145 is dimethylated by BCDIN3D in vitro. Perhaps the efficiency of dimethylation of the 5′-phosphate of pre-miR145 by BCDIN3D is much lower than that of its monomethylation in vitro. The previously observed methylation of pre-miR145 (Xhemalce et al., 2012) is at baseline levels compared with that of cytoplasmic tRNAHis, and is not significant.

### CYTOPLASMIC tRNAHis IS METHYLATED BY BCDIN3D IN VIVO

BCDIN3D-knockout HEK293T cells, established via CRISPR/Cas9 editing, are viable, although they exhibit a slightly lower growth rate than the parental cells (Martinez et al., 2017). Cytoplasmic tRNAHis isolated from BCDIN3D-knockout cells completely lacked the methyl moiety at the 5′-monophosphate group, as evident from LC-MS. Exogenous expression of BCDIN3D in the BCDIN3D-knockout cells restored the 5′-monomethylphosphate modification of cytoplasmic tRNAHis. Thus, BCDIN3D is responsible for the monomethylation of the 5′-monophosphate of cytoplasmic tRNAHis in HEK293T cells under normal physiological conditions. Among the RNAs from BCDIN3D-knockout cells, none except cytoplasmic tRNAHis is significantly methylated by recombinant BCDIN3D in vitro (Martinez et al., 2017).

A recent study using HEK293T cells reported that neither BCDIN3D knockout nor BCDIN3D overexpression alters mature miR145 expression levels (Martinez et al., 2017), consistent with the recent in vitro results showing that BCDIN3D does not dimethylate the 5′-monophosphate group of either tRNAHis or pre-miR145. Only pre-miRNA with a 5′-dimethylated phosphate, but not pre-miRNA with a 5′-monomethylated phosphate, is processed at lower levels by Dicer (Xhemalce et al., 2012). These observations also suggest that BCDIN3D does not dimethylate the 5′-phosphate group of pre-miR145.

Taken together with the recent results of in vitro methylation assays using recombinant BCDIN3D and the tRNAHis and pre-miR145 transcripts (Martinez et al., 2017), these findings indicate that the primary target of BCDIN3D is cytoplasmic tRNAHis rather than pre-miRNAs. BCDIN3D displays monomethylation activity on the 5′-phosphate group of RNA. Considering the significantly lower activity of BCDIN3D toward pre-miR145 and the fact that miR145 expression is not regulated by BCDIN3D in vivo, methylation of pre-miR145 probably does not occur in HEK293T cells. In certain biological processes or under specific conditions in breast cancer cells, BCDIN3D might recognize specific pre-miRNAs, such as pre-miR145, through regulatory factors that assist BCDIN3D in recognizing specific RNA species. Elucidation of the regulatory mechanism of specific pre-miRNA (di)methylation by BCDIN3D in breast cancer cells awaits further study.

### tRNAHis RECOGNITION BY BCDIN3D

Human cytoplasmic tRNAHis is matured through unique processes and has unique structural features among cytoplasmic tRNA species (Jackman et al., 2012; Betat et al., 2014; **Figure 1B**). After transcription by RNA polymerase III, the 5′-leader and 3′-trailer sequences of precursor tRNAHis are cleaved. Thereafter, a single guanosine residue (G) is attached to the 5′-end (at position −1) in the 3′–5′ direction by tRNAHis-specific guanylyltransferase (Thg1) (Gu et al., 2003; Jackman and Phizicky, 2006; Jackman et al., 2012) and the CCA is added at the 3′-end (positions 74–76) (Tomita and Yamashita, 2014). Consequently, the mature form of cytoplasmic tRNAHis has an 8-nucleotide-long acceptor helix with G<sub>−1</sub>:A<sub>73</sub> mispairing at the top of the helix, while other cytoplasmic tRNAs have 7-nucleotide-long acceptor helices.

In vitro steady-state kinetics of methylation of mutant cytoplasmic tRNAHis transcripts by recombinant BCDIN3D revealed that BCDIN3D recognizes G<sub>−1</sub>, the G<sub>−1</sub>:A<sub>73</sub> mispairing at the top of the acceptor stem, and the 8-nucleotide-long extended acceptor helix. The minihelix of tRNAHis is also methylated efficiently by BCDIN3D. Thus, BCDIN3D recognizes the unique structural features of cytoplasmic tRNAHis, especially in the top-half region of tRNAHis, and discriminates cytoplasmic tRNAHis from other tRNA species (Martinez et al., 2017).
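The three recognition elements listed above can be summarized as a simple rule set. The following Python sketch is only a qualitative illustration under our own assumptions: the class, field, and function names are hypothetical, and the all-or-none test deliberately ignores the graded kinetic effects seen with individual tRNAHis mutants.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AcceptorStem:
    """Minimal, hypothetical description of the top of a tRNA acceptor stem."""
    helix_length: int          # length of the acceptor helix (7 for most tRNAs)
    nt_minus1: Optional[str]   # nucleotide at position -1, or None if absent
    nt_73: str                 # discriminator nucleotide at position 73


def has_trna_his_features(stem: AcceptorStem) -> bool:
    """True if the stem shows all three features described in the text:
    an 8-nucleotide-long acceptor helix, G at position -1, and A at position 73
    (giving the G-1:A73 mispair at the top of the helix)."""
    return (
        stem.helix_length == 8
        and stem.nt_minus1 == "G"
        and stem.nt_73 == "A"
    )


# Illustrative inputs: a tRNAHis-like stem and a generic 7-nucleotide-long stem.
trna_his_like = AcceptorStem(helix_length=8, nt_minus1="G", nt_73="A")
generic_trna = AcceptorStem(helix_length=7, nt_minus1=None, nt_73="A")

print(has_trna_his_features(trna_his_like))
print(has_trna_his_features(generic_trna))
```

Because only acceptor-stem properties enter the rule, a minihelix carrying the same features would also pass, in line with the efficient methylation of the tRNAHis minihelix noted above.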

The structure of human BCDIN3D is still unclear. The amino acid sequence of BCDIN3D is homologous to that of the catalytic domain of methylphosphate capping enzyme (MePCE), which uses SAM to transfer a methyl group onto the γ-phosphate of the 5′-guanosine of 7SK RNA (Jeronimo et al., 2007; Shuman, 2007). Structural modeling of human BCDIN3D based on the catalytic domain of MePCE, together with possible tRNA-binding models, suggests that BCDIN3D recognizes the acceptor stem of tRNAHis and measures the length of its acceptor helix (**Figure 2A**). Only the 5′-end of a tRNA with an 8-nucleotide-long acceptor helix and G<sub>−1</sub>:A<sub>73</sub> mispairing at the top of the acceptor helix could enter the catalytic pocket of BCDIN3D, where the 5′-phosphate would be monomethylated. The mechanism underlying the recognition of tRNA by BCDIN3D would differ from that of tRNA recognition by the CCA-adding enzymes (Tomita et al., 2004; Yamashita et al., 2014, 2015; Yamashita and Tomita, 2016), which recognize the TΨC loop of tRNA.

The kinetics of methylation of tRNAHis mutants and the structural model of the BCDIN3D–tRNAHis complex also suggest that human BCDIN3D is a tRNAHis-specific 5′-monophosphate methyltransferase, and that cytoplasmic tRNAHis is a primary target of human BCDIN3D.

### THE BIOLOGICAL ROLE OF 5′-METHYLATION OF CYTOPLASMIC tRNAHis

The biological role of the 5′-monomethylphosphate of cytoplasmic tRNAHis remains unclear. While the 5′-monomethylphosphate of cytoplasmic tRNAHis decreases the affinity of tRNAHis toward histidyl-tRNA synthetase (Martinez et al., 2017), as expected from the complex structure of bacterial histidyl-tRNA synthetase with tRNAHis (Tian et al., 2015) and biochemical evaluation (Fromant et al., 2000), the overall aminoacylation efficiency is not affected by the modification. The steady-state levels of cytoplasmic tRNAHis in BCDIN3D-knockout cells and in the parental HEK293T cells are not significantly different. Furthermore, the stabilities of cytoplasmic tRNAHis from HEK293T cells and BCDIN3D-knockout cells after treatment with actinomycin D do not show significant differences (Martinez et al., 2017). However, the 5′-monomethylmonophosphate protects cytoplasmic tRNAHis from degradation in vitro in cytoplasmic cell extracts. Thus, methylation of the 5′-monophosphate of cytoplasmic tRNAHis might contribute to its stability under specific conditions or in certain biological processes.

### PERSPECTIVE

The correlation between methylation of the 5′-monophosphate group of cytoplasmic tRNAHis and the tumorigenic phenotype of breast cancer remains unknown (**Figure 2B**). tRNAs are involved in various biological processes in cells (Sobala and Hutvagner, 2011; Raina and Ibba, 2014; Megel et al., 2015; Kumar et al., 2016; Park and Kim, 2018; Schimmel, 2018) beyond their established functions as adaptors in protein synthesis.

In breast cancer cells, initiator tRNAMet is reportedly upregulated (Pavon-Eternod et al., 2013). Furthermore, in highly metastatic breast cancer cells, the upregulation of specific tRNAs, such as tRNAGluUUC and tRNAArgCCG, stabilizes mRNAs containing the corresponding codons and enhances translation (Goodarzi et al., 2016). However, knockout of BCDIN3D in HEK293T cells does not affect the steady-state level of cytoplasmic tRNAHis (Martinez et al., 2017). Thus, 5′-monophosphate methylation of cytoplasmic tRNAHis would not enhance the translation of specific mRNAs, although this warrants further investigation. Small RNA fragments derived from tRNAs, i.e., tRNA fragments (tRFs), reportedly participate in various cellular functions (Sobala and Hutvagner, 2011; Raina and Ibba, 2014; Kumar et al., 2016). Under various cellular stress conditions, tRFs are often produced (Ivanov et al., 2011, 2014; Durdevic et al., 2013; Durdevic and Schaefer, 2013; Gebetsberger and Polacek, 2013). In breast and prostate cancer, specific tRNAs, such as cytoplasmic tRNALys and tRNAHis, are cleaved by angiogenin, and the tRNA half fragments are abundantly expressed in a sex hormone-dependent manner (Honda et al., 2015). These tRNA half fragments also promote proliferation of breast and prostate cancer cells by an as yet unknown mechanism. In human and mouse cells, 3′- or 5′-terminal tRFs (3′-tRF or 5′-tRF) are produced and accumulate in an asymmetric manner. These tRFs associate with Ago2 and probably act like typical miRNAs. The 3′-tRF, but not the 5′-tRF, derived from cytoplasmic tRNAHis is complementary to human endogenous retroviral sequences in the genome (Li et al., 2012).

It is tempting to speculate that monomethylation of the 5′-phosphate of tRNAHis regulates the expression of the tRNA half fragments and/or tRFs derived from tRNAHis in breast cancer cells or under specific biological or stress conditions. The production of tRNA half fragments and/or tRFs, in turn, might regulate the genes involved in tumorigenesis in breast cancers. Future studies are required to understand whether methylation of the 5′-phosphate group of tRNAHis by BCDIN3D is involved in the tumorigenic phenotype of breast cancer and other cancers and to potentially elucidate the unknown functions of tRNAs, beyond their established functions.

### AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

### FUNDING

The study in our laboratory was supported in part by grants to KT from the Funding Program for Next Generation World-Leading Researchers of JSPS, by Grants-in-Aid for Scientific Research (A), and Grant-in-Aid for Scientific Research on Innovative Areas from JSPS, Takeda Science Foundation, Japan Foundation for Applied Enzymology, Terumo Foundation for Life Science and Art, and Princess Takamatsu Cancer Research Foundation.



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Tomita and Liu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Function and Regulation of Human Terminal Uridylyltransferases

#### Yuka Yashiro and Kozo Tomita\*

Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Japan

RNA uridylylation plays a pivotal role in the biogenesis and metabolism of functional RNAs, and regulates cellular gene expression. RNA uridylylation is catalyzed by a subset of proteins from the non-canonical terminal nucleotidyltransferase family. In humans, three proteins (TUT1, TUT4, and TUT7) have been shown to exhibit template-independent uridylylation activity at the 3′-end of specific RNAs. TUT1 catalyzes oligo-uridylylation of U6 small nuclear (sn) RNA, which catalyzes mRNA splicing. Oligo-uridylylation of U6 snRNA is required for U6 snRNA maturation, U4/U6 di-snRNP formation, and U6 snRNA recycling during mRNA splicing. TUT4 and TUT7 catalyze mono- or oligo-uridylylation of precursor let-7 (pre–let-7). Let-7 RNA is broadly expressed in somatic cells and regulates cellular proliferation and differentiation. Mono-uridylylation of pre–let-7 by TUT4/7 promotes subsequent Dicer processing to up-regulate let-7 biogenesis. Oligo-uridylylation of pre–let-7 by TUT4/7 is dependent on an RNA-binding protein, Lin28. Oligo-uridylylated pre–let-7 is less responsive to processing by Dicer and is degraded by the exonuclease DIS3L2. As a result, let-7 expression is repressed. Uridylylation of pre–let-7 depends on the context of the 3′-region of pre–let-7 and on the cell type. In this review, we focus on the 3′-uridylylation of U6 snRNA and pre–let-7, and describe the current understanding of the mechanism of activity and regulation of human TUT1 and TUT4/7, based on their recently solved crystal structures.

#### Keywords: terminal uridylyltransferase, TUTase, TUT1, TUT4/7, U6 snRNA, let-7, biogenesis, splicing

## INTRODUCTION

Modification of the 3′-end of RNA by template-independent nucleotide addition is a posttranscriptional modification that plays important regulatory roles in gene expression. A well-known example of 3′-end modification is the addition of CCA to the 3′-end of tRNA at positions 74–76 by CTP:(ATP)-tRNA nucleotidyltransferase (CCA-adding enzyme) and related enzymes (Deutscher, 1990; Tomita and Weiner, 2001, 2002; Weiner, 2004). CCA-addition to the 3′-end of tRNA is required for amino acid attachment to the 3′-terminus of tRNA by aminoacyl-tRNA synthetases (Sprinzl and Cramer, 1979), and also for peptide bond formation on the ribosome (Green and Noller, 1997; Kim and Green, 1999; Nissen et al., 2000). Further, CCA-addition to the 3′-end of tRNA is involved in the quality control of dysfunctional tRNAs. A dysfunctional tRNA molecule with an unstable acceptor stem is modified by CCACCA addition, and the CCACCA tail serves as a degradation signal for the cellular RNA decay machinery (Wilusz et al., 2011; Betat and Morl, 2015; Kuhn et al., 2015). Another well-known example of template-independent nucleotide addition to the 3′-end of RNA is polyadenylation of mRNA by a canonical PAP. Polyadenylation of mRNA regulates mRNA stability, mRNA export from the nucleus to the cytoplasm, and translation initiation in eukaryotes (Beelman and Parker, 1995; Sachs et al., 1997; Wahle and Rüegsegger, 1999; Edmonds, 2002; Moore and Proudfoot, 2009). Polyadenylation of mRNA also regulates degradation of mRNA in eubacteria (Carpousis et al., 1999; Dreyfus and Régnier, 2002; Régnier and Hajnsdorf, 2009).

#### Edited by:

Akio Kanai, Keio University, Japan

#### Reviewed by:

Tohru Yoshihisa, University of Hyogo, Japan; John P. Hagan, University of Texas Health Science Center at Houston, United States; Yuka W. Iwasaki, Keio University, Japan

#### \*Correspondence:

Kozo Tomita kozo\_tomita@cbms.k.u-tokyo.ac.jp; kozo-tomita@edu.k.u-tokyo.ac.jp

#### Specialty section:

This article was submitted to RNA, a section of the journal Frontiers in Genetics

Received: 10 July 2018; Accepted: 24 October 2018; Published: 12 November 2018

#### Citation:

Yashiro Y and Tomita K (2018) Function and Regulation of Human Terminal Uridylyltransferases. Front. Genet. 9:538. doi: 10.3389/fgene.2018.00538

**Abbreviations:** CM, catalytic module; CPSF, cleavage and polyadenylation specificity factor; dsRNA, double-stranded RNA; ISL, internal stem-loop; KA-1, kinase associated-1; LIM, Lin28-interacting module; miRNA, microRNA; PAP, poly(A) polymerase; RRM, RNA recognition motif; snRNA, small nuclear RNA; snRNP, small nuclear ribonucleoprotein; TUTase, terminal uridylyltransferase; ZF, zinc finger; ZK, zinc knuckle.

The detailed mechanisms of polyadenylation by canonical PAPs (Bard et al., 2000; Martin et al., 2000; Balbo and Bohm, 2007), and of CCA-addition by CCA-adding enzymes and related enzymes, have been clarified in the last two decades (Li et al., 2002; Okabe et al., 2003; Xiong et al., 2003; Tomita et al., 2004, 2006; Xiong and Steitz, 2004, 2006; Toh et al., 2008, 2009, 2011; Pan et al., 2010; Tomita and Yamashita, 2014; Yamashita et al., 2014, 2015; Yamashita and Tomita, 2016). In addition, a new family of PAPs, the non-canonical PAPs, has emerged. The fission yeast cytoplasmic PAP, Cid1, was first identified as a non-canonical PAP (Wang et al., 2000) and was later revealed to be a terminal uridylyltransferase (Rissland et al., 2007). Non-canonical PAPs are conserved and play important roles in gene expression in various eukaryotes, from yeast to human (Stevenson and Norbury, 2006; Norbury, 2010; Scott and Norbury, 2013; Lee et al., 2014; De Almeida et al., 2018). The phylogenetic distribution of non-canonical PAPs in eukaryotes has recently been described (Chang et al., 2018). Proteins of this family share the catalytic domain with canonical PAPs but contain different ribonucleotide base recognition motifs (Martin and Keller, 2007). As a result, some of the non-canonical PAPs bearing a histidine insertion in the ribonucleotide base recognition motif use UTP as a substrate and function as TUTases (Kwak and Wickens, 2007; Rissland et al., 2007; Mullen and Marzluff, 2008; Wickens and Kwak, 2008).

Various classes of RNAs, including mRNA, miRNA and snRNA, are uridylylated by enzymes of the non-canonical terminal nucleotidyltransferase family. In trypanosome mitochondria, uridylylation is required for guide RNA maturation (Aphasizhev et al., 2016). Uridylylation is also important for regulation of small RNA expression. In Drosophila melanogaster, a TUTase named Tailor prevents biogenesis of mirtrons (Bortolamiol-Becet et al., 2015; Reimao-Pinto et al., 2015; Rissland, 2015), while uridylylation serves as a degradation marker for small RNAs in various organisms (De Almeida et al., 2018). In addition, uridylylation facilitates mRNA decay. Uridylylation-mediated mRNA degradation contributes to cellular mRNA metabolism and is also involved in maternal mRNA clearance during the maternal-to-zygotic transition (Scott and Norbury, 2013; Lee et al., 2014; Lim et al., 2014; Morgan et al., 2017; Chang et al., 2018). Thus, uridylylation of RNA 3′-ends plays a pivotal role in the biogenesis and metabolism of functional RNAs, facilitating regulation of gene expression. The detailed functions of uridylylation were recently reviewed (De Almeida et al., 2018; Menezes et al., 2018).

In humans, seven non-canonical nucleotidyltransferases have been identified, with diverse cellular functions (Stevenson and Norbury, 2006; Martin and Keller, 2007; Wilusz and Wilusz, 2008). In this review, we use the updated HUGO-approved nomenclature to refer to those enzymes, as the HUGO-approved gene symbols for the non-canonical terminal nucleotidyltransferases have recently been changed (**Figure 1**). Among the seven human non-canonical terminal nucleotidyltransferases, four enzymes show adenylyltransferase activity. MTPAP is a mitochondrial PAP, which regulates the stability of mitochondrial mRNAs (Tomecki et al., 2004; Nagaike et al., 2005). TENT2 adenylates selected mRNAs and miRNAs in the cytoplasm (Kwak et al., 2004; Nagaike et al., 2005; Katoh et al., 2009; Glahder and Norrild, 2011; D'Ambrogio et al., 2012), while TENT4A and TENT4B add poly(A) to various classes of nuclear RNAs and are involved in RNA degradation as subunits of a TRAMP-like complex (Berndt et al., 2012; Ogami et al., 2013; Sudo et al., 2016). TENT4A and TENT4B have also recently been shown to be responsible for mRNA guanylylation (Lim et al., 2018).

The other three enzymes (TUT1, TUT4, and TUT7) are the TUTases that mediate template-independent uridylylation at the 3′-end of RNAs in human cells. TUT1 is a nuclear TUTase and is required for maturation of the 3′-end of U6 snRNA (Scott and Norbury, 2013; Lee et al., 2014; Lim et al., 2014). On the other hand, TUT4 and TUT7 mainly localize in the cytoplasm, and they are involved in various cellular processes, including regulation of miRNA biogenesis, surveillance for defective noncoding RNAs, replication-dependent decay of poly(A)<sup>−</sup> histone mRNAs, and degradation of poly(A)<sup>+</sup> mRNAs (De Almeida et al., 2018; Menezes et al., 2018). In addition to those regulatory roles, TUT4 and TUT7 are also reported to uridylylate viral RNAs and LINE-1 mRNAs and to act as an immune system against genomic invasion (Le Pen et al., 2018; Warkocki et al., 2018; Yeo and Kim, 2018).

Recently, the crystal structures of human TUTases, TUT1 and TUT7, have been reported, and together with the biochemical studies of these enzymes, the molecular bases of uridylylation of the 3′-end of specific RNAs have been proposed (Faehnle et al., 2017; Yamashita et al., 2017). In the current review, we describe the molecular mechanism and regulation of uridylylation of specific RNAs by human TUT1 and TUT7, based on their structures.

### TUT1: OLIGOURIDYLYLATION OF U6 snRNA

### Biogenesis of U6 snRNA

Pre-mRNA splicing in eukaryotes is catalyzed by the spliceosome, composed of five small nuclear ribonucleoprotein complexes (U1, U2, U4, U5, and U6 snRNPs) and a large number of proteins (Will and Luhrmann, 2011). U6 snRNP is composed of U6 snRNA, p110 (hPrp24), and heteroheptameric Lsm2–8 ring proteins. Proteins p110 and Lsm2–8 promote the annealing of U6 and U4 snRNAs for U4/U6 di-snRNP formation (Jandrositz and Guthrie, 1995; Raghunathan and Guthrie, 1998; Achsel et al., 1999). U5 snRNP joins the U4/U6 di-snRNP to form U4/U6•U5 tri-snRNP. The U4/U6•U5 tri-snRNP is recruited to the pre-spliceosome, composed of pre-mRNA, and U1 and U2 snRNPs. U6 snRNA forms an alternative helix with the U2 snRNA, following which the two-step splicing reaction proceeds, accompanied by structural rearrangements of U6 snRNA in the spliceosome. In base-paired U6-U2 snRNAs, U6 snRNA participates in active-site formation and divalent cation coordination for the catalysis of splicing (Fica et al., 2013).

… respectively. RNA recognition motif (RRM) is shown as a green box. The figure is modified from Heo et al. (2009) and Lee et al. (2014).

U6 snRNA is transcribed by RNA polymerase III and undergoes multiple maturation processes (Wilusz and Wilusz, 2013). The U6 snRNA transcript has 5′-stem, ISL, and telestem secondary structures (Rinke and Steitz, 1985; Karaduman et al., 2006; **Figure 2A**). The U6 snRNA primary transcript contains four genome-encoded 3′-end uridines (U4-OH) (**Figure 2B**). After transcription, the 3′-end is oligo-uridylylated by TUT1 (Trippe et al., 1998, 2006). Then, the oligo-uridylylated tail of U6 snRNA is trimmed by a 3′–5′ exonuclease, Mpn1 (Usb1) (Mroczek et al., 2012; Shchepachev et al., 2012; Hilcenko et al., 2013). The 3′-end of the mature U6 snRNA has five uridines capped with a 2′,3′-cyclic phosphate (U4–U>p), which protects U6 snRNA from degradation.
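The maturation steps described above can be followed as a sequence of 3′-end states. The sketch below is only an illustration of that order of events: the function name and the number of uridines added by TUT1 (`added_us`) are hypothetical choices, not measured values, and the tail is modeled as a plain string.

```python
from __future__ import annotations


def mature_u6_tail(encoded_us: int = 4, added_us: int = 3, final_us: int = 5) -> list[str]:
    """Return the successive 3'-end states of U6 snRNA as strings.

    encoded_us: genome-encoded 3'-end uridines of the primary transcript (U4-OH).
    added_us:   uridines added by TUT1 (hypothetical value, for illustration only).
    final_us:   uridines left on the mature RNA after trimming.
    """
    states = []
    tail = "U" * encoded_us
    states.append(tail + "-OH")      # primary RNA polymerase III transcript
    tail += "U" * added_us           # oligo-uridylylation by TUT1
    states.append(tail + "-OH")
    if len(tail) < final_us:
        raise ValueError("tail is too short to be trimmed to the mature length")
    tail = tail[:final_us]           # 3'-5' trimming by Mpn1 (Usb1)
    states.append(tail + ">p")       # mature end with a 2',3'-cyclic phosphate
    return states


if __name__ == "__main__":
    for state in mature_u6_tail():
        print(state)
```

With the default arguments the three states are `UUUU-OH`, `UUUUUUU-OH`, and `UUUUU>p`, matching the U4-OH primary transcript and the U4–U>p mature end named in the text.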

The oligo-uridylylated tail of U6 snRNA is the binding site for the Lsm2–8 complex (Achsel et al., 1999; Vidal et al., 1999), and is required for the annealing of U6 and U4 snRNAs to form the U4/U6 di-snRNP and for the recycling of U6 snRNA after the splicing reaction (Bell et al., 2002). Thus, 3′-oligo-uridylylation of U6 snRNA by TUT1 contributes to efficient pre-mRNA splicing in cells. Human TUT1 was originally identified as a U6 snRNA–specific TUTase (Trippe et al., 2003, 2006). Subsequently, it was also reported that TUT1 can function as a PAP acting on specific mRNAs under specific conditions (Mellman et al., 2008).

### Structure of Human TUT1

Recently, the crystal structures of human TUT1, and its complexes with UTP or ATP have been reported (Yamashita et al., 2017). These were the first structures of a TUTase from a higher eukaryote. Human TUT1 is a multi-domain protein composed of an N-terminal ZF, N-terminal RRM, a catalytic motif in the middle, and an uncharacterized C-terminal domain (Trippe et al., 2006). The catalytic motif is composed of nucleotidyltransferase domain and PAP-associated domain (**Figure 1**). Since crystals of full-length human TUT1 protein could not be obtained, truncated forms of TUT1 protein were crystallized and their structures were determined.

TUT1 (TUT1\_delN), lacking the N-terminal ZF and RRM, consists of three domains: the catalytic palm and finger domains, and an additional distinct domain at the C-terminus of the protein (**Figure 3A**). This C-terminal region of TUT1 is a previously unidentified RNA-binding domain, named the KA-1 domain.




The overall structure of the catalytic core palm and finger domains of TUT1 shares topological homology with those of yeast Cid1 and vertebrate mitochondrial PAP (Bai et al., 2011; Lunde et al., 2012; Munoz-Tello et al., 2012; Yates et al., 2012; Lapkouski and Hallberg, 2015). The palm domain of human TUT1 consists of a five-stranded β-sheet and two α-helices, and contains three catalytic carboxylates (Asp216, Asp218, and Asp381). The structure of the TUT1 palm domain shares homology with those of DNA polymerase β family proteins (Aravind and Koonin, 1999). The finger domain has a helical structure with ten α-helices and three β-strands, and is homologous to the central domain of PAPα (Bard et al., 2000; Martin et al., 2000). The incoming nucleotide is located in the cleft between the palm and fingers.


The C-terminal domain of TUT1 consists of four anti-parallel β-strands and five α-helices (**Figure 3B**). This domain shares topological homology with the KA-1 domain of various proteins (Moravcevic et al., 2010). The structure of another crystal form of TUT1\_delN suggests that the KA-1 domain can rotate by approximately 40 degrees with respect to the catalytic core domains, using α14 as the axis of rotation (**Figure 3C**). In the TUT1 structure lacking the C-terminal KA-1 and N-terminal ZF domains, the N-terminal RRM adopts a typical RRM fold (Kenan et al., 1991), with four anti-parallel β-strands stacked onto two α-helices. The RRM is connected to the catalytic domain by a flexible linker (**Figure 3A**). Thus, the N-terminal RRM and ZF are mobile in the RNA substrate-free form of TUT1.

### Nucleotide Recognition by TUT1

The structures of TUT1 in complex with either UTP or ATP have been reported (**Figure 3D**). Both UTP and ATP reside in the cleft between the palm and finger domains. In the structure of UTP-bound TUT1, the uracil base is sandwiched between Tyr432 and the side chain of Arg366. The O<sup>2</sup> and O<sup>4</sup> atoms of UTP form hydrogen bonds with Asn392 and His549, respectively. The N<sup>3</sup> atom of UTP forms a hydrogen bond with a water molecule that also forms a hydrogen bond with Asp543. In the ATP-bound structure, only the N<sup>1</sup> atom of the adenine base of ATP forms a hydrogen bond with His549. The mechanism of nucleotide recognition by TUT1 and the specificity of TUT1 are essentially the same as those of yeast Cid1 (Lunde et al., 2012; Munoz-Tello et al., 2012; Yates et al., 2012). Human TUT1 incorporates UMP more efficiently than AMP into U6 snRNA transcript ending with four uridines. The steady-state kinetics of nucleotide incorporation into U6 snRNA indicate that UTP is a much better substrate of TUT1 than ATP (around 700-fold) (Yamashita et al., 2017).
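To give a feel for what a roughly 700-fold preference means when both nucleotides are present, the sketch below applies the standard competing-substrate relation, in which the ratio of incorporation rates equals the ratio of specificity constants multiplied by the ratio of substrate concentrations. It assumes that the reported 700-fold difference can be read as the ratio of the specificity constants (kcat/Km) for UTP and ATP; the function name and the concentrations used are hypothetical.

```python
def ump_fraction(selectivity: float, utp: float, atp: float) -> float:
    """Fraction of incorporation events that add UMP rather than AMP.

    selectivity: (kcat/Km for UTP) / (kcat/Km for ATP); assumed here to be ~700.
    utp, atp:    nucleotide concentrations in the same (arbitrary) units.
    """
    if utp < 0 or atp < 0 or (utp == 0 and atp == 0):
        raise ValueError("concentrations must be non-negative and not both zero")
    weighted_utp = selectivity * utp
    return weighted_utp / (weighted_utp + atp)


if __name__ == "__main__":
    # Equal concentrations (illustrative values only)
    print(f"{ump_fraction(700.0, 1.0, 1.0):.4f}")
    # Ten-fold excess of ATP over UTP (illustrative values only)
    print(f"{ump_fraction(700.0, 1.0, 10.0):.4f}")
```

Under this assumption, UMP would still account for the large majority of incorporation events even when ATP is in considerable excess, which is in line with TUT1 acting primarily as a uridylyltransferase on U6 snRNA.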

### Domain Requirement for U6 snRNA Recognition by TUT1

The structure of human TUT1-U6 snRNA complex is not yet available. However, recent biochemical studies using full-length and truncated human TUT1 variants suggest that U6 snRNA is recognized by multiple domains of TUT1 (Yamashita et al., 2017). Human TUT1 possesses additional domains compared with the yeast Cid1 structure. TUT1 is composed of N-terminal ZF, RRM, palm, finger, and KA-1 domains (**Figures 1**, **3A**). The domain organization of TUT1 is also different from those of other human non-canonical terminal nucleotidyltransferase families, although the structures of catalytic domains are homologous (Stevenson and Norbury, 2006; Martin and Keller, 2007; Wilusz and Wilusz, 2008).

Steady-state kinetics revealed that human TUT1 variants lacking the N-terminal ZF domain (ΔZ), lacking both the ZF and RRM domains (ΔZR), or lacking the KA-1 domain (ΔKA-1) exhibit reduced uridylylation of the U6 snRNA transcript. The K<sub>m</sub> values of U6 snRNA for ΔZ and ΔZR are ca. 5-fold higher than that for wild-type TUT1. The overall uridylylation efficiencies of ΔZ and ΔZR are less than 0.2% that of wild-type TUT1, and their reduced activities are associated with reduced catalytic efficiencies. Thus, the N-terminal ZF and RRM domains might assist in the proper positioning of the 3′-end of U6 snRNA within the catalytic site for catalysis. The K<sub>m</sub> value of U6 snRNA for ΔKA-1 is about 10-fold higher than that of wild-type TUT1, and the overall uridylylation efficiency of ΔKA-1 is ca. 20% that of wild-type TUT1. Hence, the C-terminal KA-1 domain increases TUT1 affinity for U6 snRNA at the UMP-incorporation stage.
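The different conclusions drawn for the N-terminal and C-terminal deletions can be checked with simple arithmetic. The sketch below assumes that "overall uridylylation efficiency" corresponds to kcat/Km, so that the relative turnover of a variant is its relative efficiency multiplied by its fold increase in Km. The input numbers are the rounded values quoted in the text (one of them an upper bound), and the function and dictionary names are hypothetical.

```python
def relative_kcat(relative_efficiency: float, km_fold_increase: float) -> float:
    """Relative kcat (variant / wild type) implied by efficiency = kcat / Km.

    relative_efficiency: (kcat/Km of variant) / (kcat/Km of wild type).
    km_fold_increase:    (Km of variant) / (Km of wild type).
    """
    return relative_efficiency * km_fold_increase


# Rounded values from the text; 0.002 is an upper bound ("less than 0.2%").
variants = {
    "ZF-deleted or ZF/RRM-deleted": (0.002, 5.0),
    "KA-1-deleted": (0.20, 10.0),
}

if __name__ == "__main__":
    for name, (efficiency, km_fold) in variants.items():
        print(f"{name}: relative kcat ~ {relative_kcat(efficiency, km_fold):.2f}")
```

Under this assumption, the N-terminal deletions imply a turnover of at most about 1% of wild type, so most of their defect lies in catalysis, whereas for the KA-1 deletion the implied turnover is not reduced and the loss of efficiency is accounted for by the higher Km. Because the inputs are rounded, the outputs should be read as orders of magnitude only.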

The KA-1 domain of TUT1 is conserved among vertebrates, with positively charged clusters on the KA-1 surface (**Figure 4A**). The KA-1 domain itself is able to bind RNA, and substitutions of positively charged amino acids in the KA-1 domain with alanine reduce or abolish the RNA-binding activity. Thus, the previously unidentified C-terminal domain, KA-1, is an RNA-binding domain involved in U6 snRNA recognition, together with the N-terminal ZF and RRM domains. The N-terminal RRM is mobile relative to the catalytic core domains, and the C-terminal KA-1 rotates relative to the catalytic core domains (**Figures 3A,C**). Thus, at the UMP-incorporation stage, the domain movements would be coupled with the recognition of U6 snRNA.

### Interaction Between TUT1 and U6 snRNA, and Oligo-Uridylylation

TUT1 tightly interacts with U6 snRNA in vivo. The interactions between U6 snRNA and TUT1, and TUT1 truncated variants were recently analyzed by using Tb(III) hydrolysis mapping (Walter et al., 2000), and the protection patterns for U6 snRNA in the presence and absence of TUT1 protein and its variants were assessed. These studies demonstrated the TUT1 domain requirements for U6 snRNA recognition, as well as structural changes of U6 snRNA upon TUT1 binding.

oligo-uridylylation. N-terminal ZF and RRM (orange), catalytic palm and fingers (green), and C-terminal KA-1 (cyan).

U6 snRNA is recognized by multiple domains of TUT1 (**Figure 4B**; Yamashita et al., 2017). The N-terminal ZF and RRM domains of TUT1 interact with the single-stranded 5′-end of U6 snRNA, and the KA-1 domain interacts with the bulging loops. The core catalytic domain binds tightly to the double-stranded telestem region, and the 3′-region of U6 snRNA remains single-stranded. Almost the entire U6 snRNA sequence is recognized by the mobile N-terminal RNA-binding domain and the C-terminal KA-1 domain, cooperatively with the catalytic core domain. The recognition of U6 snRNA by TUT1 is coupled with domain movements and structural changes of U6 snRNA. In particular, interaction with TUT1 induces conformational changes in the 3′-ISL and the bulging loop of U6 snRNA.

The presence of N-terminal and C-terminal RNA-binding domains prevents U6 snRNA from dislodging from the enzyme surface during the uridylylation reaction (**Figure 4C**). The C-terminal KA-1 of TUT1 might function as an anchor of the U6 snRNA molecule during oligo-uridylylation. TUT1 lacking the C-terminal KA-1, or protein variants with substitutions of the positively charged residues in KA-1 (**Figure 4A**), add a relatively small number of UMPs (~2 nt) compared with wild-type TUT1 (~5 nt) (Yamashita et al., 2017). Absence of the KA-1 domain or loss of KA-1 RNA-binding activity would allow U6 snRNA to translocate easily on the enzyme surface. Following incorporation of several UMP molecules at the 3′-end of U6 snRNA by a series of open-to-closed conformation cycles of the catalytic domain (Munoz-Tello et al., 2014; Yates et al., 2015), the 3′-region of the oligo-uridylylated tail would be compressed within the active pocket of TUT1. Consequently, the 3′-end of U6 snRNA would no longer relocate to the active site. Finally, TUT1 terminates RNA synthesis and oligo-uridylylated U6 snRNA is released from the enzyme, as observed in the mechanism of RNA synthesis termination by tRNA nucleotidyltransferases (Tomita and Yamashita, 2014; Yamashita et al., 2014, 2015; Yamashita and Tomita, 2016).

### TUT1 Can Function as a PAP


While TUT1 was originally identified as a U6 snRNA-specific TUTase (Trippe et al., 1998, 2003, 2006), it also reportedly functions as a PAP, acting on specific mRNAs under oxidative stress conditions (Mellman et al., 2008). TUT1 interacts with phosphatidylinositol 4-phosphate 5-kinase Iα (PIPKIα), and its PAP activity is activated by phosphatidylinositol 4,5-bisphosphate (PIns4,5P2) in vitro (Mellman et al., 2008; Mohan et al., 2015). Upon oxidative stress, TUT1 is recruited into the CPSF complex for the polyadenylation of specific oxidative-stress response mRNAs (Mellman et al., 2008; Laishram and Anderson, 2010). The PAP activity of TUT1 is also activated by several protein kinases (Gonzales et al., 2008; Laishram et al., 2011; Mohan et al., 2015).

The structure of TUT1-ATP complex revealed that the adenine base forms only one hydrogen bond with His549 (**Figure 3D**). Biochemical analysis indicated that TUT1 has a lower affinity for ATP than for UTP in vitro (Yamashita et al., 2017). The interaction of TUT1 with other factors and/or its phosphorylation by several kinases might promote CPSF complex formation at specific mRNAs. Since the KA-1 domain of MARK-3 binds to phospholipids (Moravcevic et al., 2010), the mobile KA-1 domain of TUT1 might interact with PIns4,5P2. This interaction might regulate the activity or localization of TUT1, and TUT1 recruitment to the CPSF complex might induce allosteric structural changes of TUT1 nucleotide-binding pocket to accommodate ATP. Thus, TUT1 might be able to add poly(A) tails to specific mRNAs under specific biological conditions. Detailed mechanism of the alteration of the nucleotide specificity of TUT1 in specific biological processes awaits further study.

### TUT4 AND TUT7: URIDYLYLATION OF Pre–Let-7

### Biogenesis of Let-7

MiRNAs are small (21–25-nt) non-coding RNAs that function in gene silencing. Together with Argonaute proteins, miRNAs form RNA-induced silencing complex, and inhibit protein synthesis or induce mRNA degradation by base-pairing with target mRNAs (Braun et al., 2012). Let-7 is a highly conserved miRNA, from nematode to human, and is known to regulate various cellular processes (Bussing et al., 2008; Thornton and Gregory, 2012). It regulates cellular proliferation by acting as a tumor suppressor. It also regulates cellular differentiation, development, and apoptosis, and is involved in glucose metabolism (Reinhart et al., 2000; Houbaviy et al., 2003; Takamizawa et al., 2004; Johnson et al., 2005; Tsang and Kwok, 2008; Zhu et al., 2011).

The synthesis of most miRNAs begins with the transcription of a primary miRNA transcript (pri-miRNA) by RNA polymerase II. The pri-miRNA is then cleaved by Drosha to yield a precursor miRNA (pre-miRNA), which is exported to the cytoplasm by Exportin-5. In the cytoplasm, pre-miRNA is further processed by Dicer to produce mature miRNA, which functions in gene silencing (Ha and Kim, 2014).

Among the seven non-canonical terminal nucleotidyltransferase family proteins, TUT4 and TUT7 have similar domain organizations (**Figure 1**), and both are involved in the uridylylation of pre–let-7. Biogenesis of let-7 is regulated by two distinct modes of uridylylation of pre–let-7: mono-uridylylation and oligo-uridylylation (**Figures 5A,B**).

FIGURE 5 | Functional duality of TUT4/7 in the biogenesis of let-7. (A) In the absence of LIN28, mono-uridylylation of pre–let-7 that harbors a 1-nt 3′-overhang (group II) by TUT4/7 promotes Dicer processing of pre–let-7. (B) In the presence of LIN28, TUT4/7 oligo-uridylylates pre–let-7 and inhibits Dicer processing of pre–let-7. Oligo-uridylylated pre–let-7 is degraded by DIS3L2, an exonuclease. (C) Schematic representation of the domain organization of TUT4/7 and Lin28. ZKs in Lin28 interact with the LIM of TUT4/7 in the presence of pre–let-7 (Faehnle et al., 2017; Wang et al., 2017). (D) Schematic representation of the secondary structure of pre–let-7 and interactions with Lin28B (Nam et al., 2011; Wang et al., 2017). The ZKs of Lin28 bind the GGAG motif in pre–let-7, and the cold-shock domain (CSD) binds the terminal loop of preE.

… release, pre–let-7 translocates. In the absence of Lin28, the mono-uridylylated pre–let-7 cannot form a stable complex with TUT7, and is easily released from TUT7.

Mono-uridylylation of pre–let-7 is observed in differentiated and somatic cells where Lin28 is not expressed. Group II pre–let-7, which has a 1-nt 3′-end overhang after Drosha processing, is mono-uridylylated (Heo et al., 2012). This mono-uridylylation of pre–let-7 is mediated by TUT4/7, and promotes subsequent Dicer processing, as pre–let-7 with a 2-nt 3′-overhang is a good substrate of Dicer. Thus, TUT4/7 promotes biogenesis of let-7, serving as a biogenesis factor (**Figure 5A**).

On the other hand, in embryonic cells and cancer cells, the RNA-binding protein Lin28 is expressed and let-7 expression is repressed. Lin28 binds to a conserved sequence (5′-GGAG-3′) in the loop region of pre–let-7 after Drosha processing (Heo et al., 2008; Newman et al., 2008; Nam et al., 2011). Lin28 binding to pre–let-7 competes with Dicer cleavage of pre–let-7, recruits TUT4/7, and promotes oligo-uridylylation of pre–let-7 (Heo et al., 2008, 2009; Piskounova et al., 2008; Hagan et al., 2009; Thornton et al., 2012). Oligo-uridylylation of pre–let-7 inhibits Dicer processing and causes degradation of pre–let-7 by DIS3L2 (Astuti et al., 2012; Ustianenko et al., 2013), an exonuclease that preferentially degrades poly(U) tails. Hence, oligo-uridylylation of pre–let-7 represses expression of mature let-7 (**Figure 5B**). TUT4 is mainly responsible for the oligo-uridylylation of pre–let-7, because single knockdown of TUT4 is sufficient to increase the mature let-7 levels (Heo et al., 2009; Thornton et al., 2012). On the other hand, TUT7 is thought to have a limited or redundant role, because single knockdown of TUT7 has no effect on the mature let-7 level, but double knockdown of TUT4 and TUT7 increases mature let-7 more significantly than single knockdown of TUT4 (Thornton et al., 2012). In the case of Lin28-dependent oligo-uridylylation, TUT4/7 serves as a negative regulator of let-7 biogenesis, and contributes to tumorigenesis and embryonic stem cell maintenance by relieving the let-7-mediated repression of several genes.

The functional duality of uridylylation by TUT4/7 depends on the length of 3<sup>0</sup> -tail of pre–let-7 and the cell type. Lin28 and TUT4/7 act as a molecular switch in the developmental and pathological transition observed in cancer.
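The switch described in this section can be summarized as a small decision rule. The sketch below is a deliberately simplified illustration, not a model from the cited studies: the function name and return strings are hypothetical, only 1-nt and 2-nt overhangs are considered, and the last branch assumes that a pre–let-7 that already carries a 2-nt 3′-overhang is processed by Dicer without prior uridylylation.

```python
def pre_let7_fate(overhang_nt: int, lin28_present: bool) -> str:
    """Return the expected fate of pre-let-7 after Drosha processing.

    overhang_nt:   length of the 3' overhang (1 for group II pre-let-7).
    lin28_present: True for cells expressing Lin28 (e.g., embryonic or cancer cells).
    """
    if overhang_nt not in (1, 2):
        raise ValueError("this sketch only covers 1-nt and 2-nt 3' overhangs")
    if lin28_present:
        # Lin28 recruits TUT4/7; the oligo(U) tail blocks Dicer and attracts DIS3L2.
        return "oligo-uridylylated by TUT4/7, Dicer blocked, degraded by DIS3L2"
    if overhang_nt == 1:
        # One added uridine creates the 2-nt overhang that Dicer prefers.
        return "mono-uridylylated by TUT4/7, processed by Dicer"
    # Assumption: a 2-nt overhang is already a good Dicer substrate.
    return "processed by Dicer without uridylylation"


if __name__ == "__main__":
    for overhang in (1, 2):
        for lin28 in (False, True):
            print(overhang, lin28, "->", pre_let7_fate(overhang, lin28))
```

The rule makes the two inputs of the switch explicit: the presence of Lin28 decides between the biogenesis and repression branches, while the overhang length matters only in the absence of Lin28.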

### Domain Structures of Human TUT4/7 and Lin28

TUT4 and TUT7 have similar domain organization (**Figure 1**), and are multi-domain enzymes composed of an N-terminal ZF domain, two nucleotidyltransferase domains (NTD1 and NTD2) connected by a flexible linker, and three zinc knuckle domains (ZK) (CCHC-type zinc fingers in **Figure 1**). NTD1 in the N-terminal portion of the protein is not an active nucleotidyltransferase, since it lacks three catalytic carboxylates. By contrast, NTD2 has such three catalytic carboxylates and participates in the nucleotidyltransfer reaction (**Figure 5C**).

During the Lin28-dependent oligo-uridylylation of pre–let-7 by TUT4/7, Lin28 and pre–let-7 interact with the N-terminal half of TUT4/7 (Nam et al., 2011; Thornton et al., 2012; Faehnle et al., 2017; Wang et al., 2017). The N-terminal and C-terminal halves of TUT4/7 are referred to as LIM and CM, respectively (**Figure 5C**).

The molecular mechanism of Lin28 binding to pre–let-7 RNA is well understood (Loughlin et al., 2011; Nam et al., 2011; Mayr et al., 2012; Wang et al., 2018). Human Lin28 harbors an N-terminal cold-shock domain and two C-terminal ZKs. The cold-shock domain binds to a stem-loop structure in the pre-element (preE), and the ZKs bind to a conserved GGAG motif located near the 3′-end of preE (**Figure 5D**). The ZKs of Lin28 are necessary and sufficient for the ternary interactions of TUT4/7, Lin28, and pre–let-7 (Faehnle et al., 2017; Wang et al., 2017).

### Structure of the Catalytic Core of TUT7 During Mono-Uridylylation

Recently, the crystal structure of a complex of human TUT7 CM with a 14-bp palindromic dsRNA and UTP was reported (Faehnle et al., 2017). The RNA used for crystallization contained a 1-nt 3′-overhang, thus mimicking the duplex stem of group II pre–let-7 (Heo et al., 2012). Hence, the structure of CM in complex with dsRNA reflects mono-uridylylation of pre–let-7 (**Figure 6A**).

The overall structure of TUT7 CM shares topological homology with those of yeast Cid1 and vertebrate mitochondrial PAP (Bai et al., 2011; Lunde et al., 2012; Munoz-Tello et al., 2012; Yates et al., 2012; Lapkouski and Hallberg, 2015). It is also homologous to the catalytic core structure of human TUT1 (Yamashita et al., 2017), and consists of palm and finger domains. TUT7 (and TUT4) CM contains ZKs (**Figures 1**, **5C**). However, ZK2 is not visible in the structure, suggesting that it is displaced. In the structure of CM with dsRNA and UTP, UTP resides at the bottom of the cleft between palm and fingers, as observed in the structure of human TUT1 complexed with UTP. The O<sup>4</sup> atom and O<sup>2</sup> atom of UTP form hydrogen bonds with His1286 and Asn1130, respectively. The N<sup>3</sup> atom of UTP interacts with Asp1280 via a water molecule (**Figure 6B**). The mechanism of UTP selection by human TUT7 (and TUT4) is essentially the same as that for yeast Cid1 and human TUT1 (**Figure 3D**).

The dsRNA, mimicking the duplex stem of pre–let-7, is clamped by two regions: the 5′-anchor and the groove loop (**Figures 6A,C**). Leu1096 and Ile1099 in the 5′-anchor (palm) provide a hydrophobic platform for interactions and stack with the first base pair of the dsRNA. The groove loop (fingers) interacts with the minor groove of the dsRNA through van der Waals interactions and hydrogen bonding. Consequently, the 3′-end overhanging nucleotide (3′-U) of group II pre-miRNA can enter the catalytic pocket. The uracil base of 3′-U is sandwiched between the uracil base of the incoming UTP in the catalytic site and Val1104 in the 5′-anchor. The structure represents the pre-reaction stage of mono-uridylylation. Following the nucleotidyltransfer reaction, the release of the byproduct, pyrophosphate, would trigger the translocation of the double helix of pre–let-7, which is then no longer fixed or stabilized by the 5′-anchor and groove loop. Consequently, TUT7/4 would terminate the mono-uridylylation reaction and release mono-uridylylated pre–let-7 (**Figure 6C**).

### Structure of the Catalytic Core of TUT7 During Oligo-Uridylylation

The structure of human TUT7 CM complexed with a 2-nt oligo(U) (5′-UU<sub>OH</sub>-3′) and a UTP analog, reflecting oligo-uridylylation (pre-catalytic stage), was reported (Faehnle et al., 2017; **Figure 7A**). The structure of TUT7 CM in complex with a 5-nt oligo(U) (5′-UUUUU<sub>OH</sub>-3′), reflecting post-uridylylation (post-catalytic stage), was also reported (**Figure 7B**). In the structures of the CM-oligo(U) complex, ZK2 is clearly visible and interacts with the oligo(U) chain. In the structure of the CM-dsRNA complex representing mono-uridylylation, ZK2 is not visible and is displaced because of the presence of dsRNA (**Figure 6A**).

In both CM-oligo(U) complexes, ZK2 interacts with the uridine at a position corresponding to -2 (**Figures 7A,B**). The O<sup>4</sup> atom of the uridine at position -2 forms a hydrogen bond with His1355, and the O<sup>2</sup> atom of the same uridine forms a hydrogen bond with a water molecule, which also hydrogen-bonds with Lys1353. The uridine at position -1 hydrogen-bonds with Asn1124, and its uracil base is sandwiched between the uracil base at position +1 and Val1104. Thus, ZK2 would stabilize the oligo(U) reaction product and aid the translocation of oligo(U) via uracil-specific interactions at the oligo-uridylylation site (**Figure 7C**).

### Mechanism of Switching Between Mono- and Oligo-Uridylylation

A TUT7/4 activity switch has been proposed based on the structures of TUT7/4 CM in complex with various RNAs (**Figures 8A,B**). Transient interaction between TUT7/4 and group II pre–let-7 favors addition and release before oligo-uridylylation occurs. Hence, TUT7/4 mono-uridylylates group II pre–let-7 (**Figure 8A**). Since group I pre–let-7 with a 2-nt 3′-overhang binds at the post-catalytic state, pre–let-7 is released without oligo-uridylylation. The double-stranded stem of pre–let-7 prevents ZK2 engagement in the process.

Lin28 controls the oligo-uridylylation switch by recruiting TUT7/4 LIM to the GGAG motif in the terminal loop of pre–let-7 (**Figure 8B**). The stable ternary complex of TUT7/4, Lin28, and pre–let-7 allows the 3′-end of pre–let-7 to stay in the active site in CM, and supports processive oligo-uridylylation by the CM. During oligo-uridylylation, ZK2 in the CM interacts with the 3′-oligo(U) tail and stabilizes the translocation of the oligo(U) tail.
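The proposed switch can be summarized as a simple decision rule. The Python sketch below is only a restatement of the model in **Figures 8A,B**, not a mechanistic simulation; the class `PreLet7` and the function `predicted_outcome` are hypothetical names, and the assumption that Lin28 binding takes precedence over overhang length is a simplification made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PreLet7:
    overhang_nt: int   # length of the 3' overhang: 1 = group II, 2 = group I
    lin28_bound: bool  # True if Lin28 is bound to the GGAG motif


def predicted_outcome(rna: PreLet7) -> str:
    """Return the TUT4/7 outcome expected under the proposed switch model."""
    if rna.lin28_bound:
        # Stable ternary complex keeps the 3' end in the active site (assumed
        # here to override the overhang length).
        return "oligo-uridylylation"
    if rna.overhang_nt == 1:
        # Transient interaction: one uridine is added, then the RNA is released.
        return "mono-uridylylation"
    if rna.overhang_nt == 2:
        # Binds in the post-catalytic state and is released.
        return "release without oligo-uridylylation"
    raise ValueError("only 1-nt or 2-nt 3' overhangs are covered by the model")


for rna in (PreLet7(1, False), PreLet7(2, False), PreLet7(1, True)):
    print(rna, "->", predicted_outcome(rna))
```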

It is not yet clear how LIM interacts with the ZKs of Lin28 and the GGAG motif of pre–let-7, and how the interaction relocates the 3′-end of pre–let-7 in the catalytic pocket of CM to initiate oligo-uridylylation. Elucidation of these mechanisms awaits further structural analysis.

### PERSPECTIVES

TUT1 participates in the target RNA-directed miRNA degradation, TRDM (Haas et al., 2016), where TUT1 oligo-uridylylates specific miRNAs for degradation by DIS3L2. TUT4 and TUT7 also oligo-uridylylate histone mRNAs for degradation after the inhibition of DNA replication (Schmidt et al., 2011; Lackey et al., 2016). They also oligo-uridylylate Ago-cleaved pre-miRNAs with 5′-overhangs (Liu et al., 2014). Similarly, TUT4/7 oligo-uridylylate the 3′-end of polyadenylylated mRNAs and mark them for degradation (Lim et al., 2014; Morgan et al., 2017; Chang et al., 2018). TUT4/7 also uridylylate mature miRNAs (Jones et al., 2009, 2012; Thornton et al., 2015), which blocks miRNA activity, probably by affecting either the target specificity or RNA-induced silencing complex loading (Jones et al., 2012). Thus, RNA uridylylation by TUTases plays important roles in various aspects of gene expression. The molecular mechanisms by which TUTases recognize specific RNA substrates remain elusive and cannot be fully explained by the currently solved structures. TUTases may recognize various RNAs either directly or through regulatory factors that assist them in recognizing specific RNA species. Elucidation of the regulatory mechanism of specific RNA uridylylation by TUTases awaits further study.

### AUTHOR CONTRIBUTIONS

YY and KT wrote the manuscript together.

### FUNDING

Work in the authors' laboratory was supported in part by grants (to KT) from the Funding Program for Next Generation World-Leading Researchers of JSPS (LS135); by Grants-in-Aid for Scientific Research (A) (18H03980); Grant-in-Aid for Scientific Research on Innovative Areas from JSPS (26113002); Takeda Science Foundation; Japan Foundation for Applied Enzymology; Terumo Foundation for Life Science and Art; and Princess Takamatsu Cancer Research Foundation.



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Yashiro and Tomita. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

### Generation of 2′,3′-Cyclic Phosphate-Containing RNAs as a Hidden Layer of the Transcriptome

### Megumi Shigematsu, Takuya Kawamura and Yohei Kirino\*

Computational Medicine Center, Sidney Kimmel Medical College, Thomas Jefferson University, Philadelphia, PA, United States

Cellular RNA molecules contain phosphate or hydroxyl ends. A 2′,3′-cyclic phosphate (cP) is one of the 3′-terminal forms of RNAs, mainly generated from RNA cleavage by ribonucleases. Although transcriptome profiling using RNA-seq has become a ubiquitous tool in biological and medical research, cP-containing RNAs (cP-RNAs) form a hidden transcriptome layer, which is infrequently recognized and characterized, because standard RNA-seq is unable to capture them. Despite cP-RNAs' invisibility in RNA-seq data, increasing evidence indicates that they are not accumulated simply as non-functional degradation products; rather, they have physiological roles in various biological processes, designating them as noteworthy functional molecules. This review summarizes our current knowledge of cP-RNA biogenesis pathways and their catalytic enzymatic activities, discusses how cP-RNA generation affects biological processes, and explores future directions to further investigate cP-RNA biology.

Keywords: 2′,3′-cyclic phosphate (cP), cP-containing RNA (cP-RNA), cP-RNA-seq, ribonuclease, tRNA half, noncoding RNA (ncRNA), angiogenin (ANG)

**Edited by:** Akio Kanai, Keio University, Japan

**Reviewed by:** Gota Kawai, Chiba Institute of Technology, Japan; Kiong Ho, University of Tsukuba, Japan; Stefan Weitzer, Medizinische Universität Wien, Austria

**\*Correspondence:** Yohei Kirino, yohei.kirino@jefferson.edu

**Specialty section:** This article was submitted to RNA, a section of the journal Frontiers in Genetics

**Received:** 28 September 2018; **Accepted:** 06 November 2018; **Published:** 27 November 2018

**Citation:** Shigematsu M, Kawamura T and Kirino Y (2018) Generation of 2′,3′-Cyclic Phosphate-Containing RNAs as a Hidden Layer of the Transcriptome. Front. Genet. 9:562. doi: 10.3389/fgene.2018.00562

### INTRODUCTION

After transcription, newly synthesized RNA molecules must undergo maturation steps to become functional molecules, and unnecessary RNAs are subjected to turnover. In both the RNA maturation and turnover mechanisms, enzymatic cleavage of RNA molecules plays a crucial role. When cleaved, RNAs can generally possess a hydroxyl group (OH), a phosphate (P), or a 2′,3′-cyclic phosphate (cP) at their termini. While OH and P can be found at both the 5′- and 3′-ends of RNAs, a cP is present only at the 3′-end, where the phosphate bridges the 2′- and 3′-positions of the ribose (**Figure 1**). The catalytic machinery of RNA cleavage determines the terminal phosphate state of the generated RNA molecules, which is not just a consequence of the cleavage but, in many cases, is critical for further RNA maturation and function. Current standard RNA-sequencing (RNA-seq) methods rely on 5′-P/3′-OH ends of RNAs; RNAs with a cP (cP-containing RNAs: cP-RNAs) therefore cannot be captured, because a cP end cannot be ligated to the 3′-adapter by ATP-dependent ligase. Consequently, cP-RNAs are "invisible" in RNA-seq data and form a hidden component of the transcriptome. However, accumulating evidence indicates that cP-RNA generation is significant in various biological processes. Here, we summarize our knowledge of cP-RNAs' biogenesis mechanisms, expression, and molecular functions and discuss how to further interrogate cP-RNA biology.
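The reason cP-RNAs escape standard RNA-seq can be expressed as a filter on RNA end chemistry. The Python sketch below is a deliberately simplified model, assuming that capture requires exactly a 5′-P and a 3′-OH as stated above; the function name `captured_by_standard_rnaseq` and the example labels are hypothetical and do not describe any particular library preparation kit.

```python
def captured_by_standard_rnaseq(five_prime_end: str, three_prime_end: str) -> bool:
    """Return True only for the 5'-P/3'-OH combination that adapter ligation needs."""
    return five_prime_end == "P" and three_prime_end == "OH"


# (label, 5' end, 3' end); labels are illustrative placeholders.
rnas = [
    ("RNA with 5'-P/3'-OH", "P", "OH"),
    ("cP-RNA with 5'-P", "P", "cP"),
    ("cP-RNA with 5'-OH", "OH", "cP"),
    ("RNA with 5'-OH/3'-P", "OH", "P"),
]

for label, five, three in rnas:
    status = "captured" if captured_by_standard_rnaseq(five, three) else "invisible"
    print(f"{label}: {status}")
```

In this model, every RNA carrying a cP is filtered out regardless of its 5′-end, which is the sense in which cP-RNAs form a hidden layer.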

### POSSIBLE CATALYTIC MECHANISMS OF cP FORMATION

There are multiple situations in which a cP is formed at the 3′-end of RNA molecules. A cP frequently appears as an intermediate during RNA cleavage by many endoribonucleases [e.g., pancreatic ribonuclease (RNase A), RNase T1, and RNase T2], which eventually generate RNAs with 3′-P/5′-OH ends (Cuchillo et al., 1997; Irie, 1997; Nichols and Yue, 2008). RNA cleavage by these enzymes is composed of two steps: (i) transesterification (transphosphorylation), forming an intermediate cP, and (ii) cP hydrolysis to generate a 3′-P (Fabian and Mantsch, 1995; Lilley, 2011) (**Figure 2A**). RNase A, the best-studied enzyme among such endoribonucleases, contains a catalytic triad of His12, Lys41, and His119, in which the two histidines in particular serve as general acid-base catalysts during both steps (Roberts et al., 1969; Thompson and Raines, 1994; Cuchillo et al., 2011). Step (i) is initiated by 2′-OH deprotonation by a base catalyst, His12, followed by nucleophilic attack on the phosphorus by the resulting 2′-oxygen (O), which causes transesterification to form a 2′,3′-cP. His119 assists the reaction as an acid catalyst by donating a proton to the leaving group, forming a 5′-OH end. Lys41 forms a hydrogen bond with the 2′-O to transiently stabilize the cP. In step (ii), to hydrolyze the cP, His119 serves as a base catalyst to remove a proton from the vicinal water molecule, while His12 serves as an acid catalyst by donating a proton to form the 2′-OH, generating a 3′-P as the final form.

Although the above case produces a cP only as an intermediate, many ribonucleases, as summarized in **Table 1**, generate a cP as a final form because their RNA cleavage conducts only step (i) without proceeding to step (ii) (**Figure 2B**). As a well-studied example, RNA cleavage by angiogenin (ANG), an endoribonuclease belonging to the RNase A superfamily (Dyer and Rosenberg, 2006; Sheng and Xu, 2016), yields a cP end (Shapiro et al., 1986). ANG contains the catalytic triad His13, Lys40, and His114, which is well conserved among RNase A superfamily members, but shows 10<sup>5</sup>–10<sup>6</sup>-fold lower ribonucleolytic activity than RNase A (Shapiro et al., 1986; Harper and Vallee, 1989). Certain unique structural features of ANG can explain this low catalytic activity. ANG's substrate-binding pocket is obstructed by Gln117 (Acharya et al., 1994; Russo et al., 1994) (**Figure 2C**), which is stabilized by a hydrogen bond with Thr44 (Leonidas et al., 1999, 2002; Holloway et al., 2004). Two hydrogen bonds from Asp116 and Ser118 further stabilize Gln117's obstructive position (Russo et al., 1994). This steric hindrance would decrease substrate accessibility, possibly leading to low cleavage and cP hydrolysis activities. Indeed, a single mutation of Gln117 to Gly showed at least a ∼20-fold increase in ribonuclease activity, as well as a ∼28-fold increase in cP hydrolysis activity (Russo et al., 1994; Leonidas et al., 2002). In addition, Asp116 could contribute to low cP hydrolysis activity as well as low cleavage activity. While the corresponding Asp121 of RNase A forms a hydrogen bond with catalytic His119, presumably to support its imidazole ring orientation, Asp116 of ANG does not support catalytic His114 but forms two hydrogen bonds with Ser118 (Leonidas et al., 2002). The lack of support for His114 should have an adverse effect on the cP hydrolysis reaction, because His114 would otherwise initiate the reaction as a base catalyst, possibly leaving a cP as a final form.
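The difference between enzymes that stop after step (i) and those that proceed to step (ii) can be captured in a toy model of the cleavage products. The Python sketch below is illustrative only: `cleave` is a hypothetical helper, the flag `hydrolyze_cp` stands in for whether step (ii) occurs, and no kinetics or sequence specificity is modeled.

```python
def cleave(rna: str, site: int, hydrolyze_cp: bool):
    """Cleave between rna[site - 1] and rna[site].

    Returns ((5' fragment, its 3' end), (3' fragment, its 5' end)).
    Step (i), transesterification, always leaves a 2',3'-cP and a 5'-OH.
    Step (ii), hydrolysis, converts the cP to a 3'-P when hydrolyze_cp is True.
    """
    if not 0 < site < len(rna):
        raise ValueError("cleavage site must lie inside the RNA")
    three_prime_end = "3'-P" if hydrolyze_cp else "2',3'-cP"
    return (rna[:site], three_prime_end), (rna[site:], "5'-OH")


# RNase A-like behavior: both steps, final product carries a 3'-P.
print(cleave("AUGCUA", 3, hydrolyze_cp=True))
# ANG-like behavior: step (i) only, final product carries a cP.
print(cleave("AUGCUA", 3, hydrolyze_cp=False))
```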

RNA cleavage by colicin E5, a cytotoxic endoribonuclease found in Escherichia coli, also yields a cP as a final form (Ogawa et al., 1999; Ogawa et al., 2006), presumably due to cP structure stabilization. The ribonuclease domain of colicin E5 does not contain histidines, the most frequently utilized catalytic residues (Bartlett et al., 2002), but possesses Arg33 and Lys25 (numbering from C-terminal domain) as catalytic residues (Yajima et al., 2006; Inoue-Ito et al., 2012). Although the catalytic mechanism remains to be further examined, these residues, along with Ile94 that supports the orientation of Arg33, might stabilize a cP structure (Inoue-Ito et al., 2012), which would contribute to generating a cP as a final form. A cP structure may also be stabilized through interaction with a protein during RNA cleavage of a eukaryotic cP-forming exoribonuclease, U six biogenesis protein 1 (USB1), also known as mutated in poikiloderma with neutropenia protein 1 (MPN1). USB1 contains two well-conserved His-x-Ser (HxS) catalytic motifs in the active site cleft (Mroczek et al., 2012; Hilcenko et al., 2013). It is speculated that, while His120 and His208 in these motifs serve as general acid-base catalysts, Ser122 and Ser210 in these motifs coordinate the oxygens in a cP after transesterification, potentially stabilizing a cP structure as a final form by preventing further hydrolysis.

While a cP end is predominantly formed by ribonuclease-catalyzed transesterification, RNA 3′-terminal phosphate cyclase (RtcA) can catalyze de novo cP formation by a distinct molecular mechanism involving the following three steps (Genschik et al., 1997, 1998; Billy et al., 2000; Filipowicz, 2016). First, RtcA is autoadenylylated with ATP to form a covalent RtcA-AMP intermediate. The autoadenylylation is initiated by a His309 (in E. coli RtcA; His320 in human RtcA)-mediated nucleophilic attack on the ATP α-phosphorus, followed by covalent bond formation and pyrophosphate (PPi) release. Second, the holoenzyme transfers the AMP to the 3′-P of the substrate RNA to form an RNA with 3′-PP-5′A. Third, the energetically unstable phosphoanhydride bond between the two phosphates is cleaved by 2′-OH-mediated attack, resulting in cP formation and AMP release.

## cP-FORMING ENZYMES

### Ribonucleases

Although the detailed molecular basis of cP formation remains to be determined, RNA cleavage by many ribonucleases produces a cP as a final, predominant form, generating cP-RNAs (**Table 1**). A tRNA splicing endonuclease is one of the oldest ribonucleases known to generate cP-RNAs (Abelson et al., 1998; Hopper and Phizicky, 2003; Yoshihisa, 2014). In eukaryotes, precursors of some tRNAs, such as tRNA<sup>Leu</sup><sub>CAA</sub>, tRNA<sup>Ile</sup><sub>UAU</sub>, and tRNA<sup>Tyr</sup><sub>GUA</sub>, contain an intronic region within their anticodon-loop (Chan and Lowe, 2016). Although the splicing activity that removes tRNA introns and the cP formation during splicing were discovered in the early 1980s (Peebles et al., 1983), many years and much effort were required to identify tRNA-splicing endonuclease subunit 2 (Sen2) as the endoribonuclease directly responsible for tRNA splicing (Trotta et al., 1997; Paushkin et al., 2004; Phizicky and Hopper, 2010), partly due to its membrane association and low cellular expression level. Sen2 is a subunit of the heterotetrameric SEN complex and cleaves the 5′-splice site of tRNAs to leave a cP at the 3′-end of 5′-exons (**Figure 3A**), whereas the 3′-splice site is cleaved by Sen34. As expected from its crucial role in tRNA splicing, SEN2 is an essential gene in yeast (Trotta et al., 1997). In humans, SEN2 gene mutations are associated with pontocerebellar hypoplasia (Budde et al., 2008; Namavar et al., 2011; Bierhals et al., 2013).

Angiogenin, originally identified as a protein factor promoting angiogenesis (Fett et al., 1985; Kurachi et al., 1985; Strydom et al., 1985), is another ancient enzyme that produces cP-RNAs (Shapiro et al., 1986). ANG has diverse physiological roles and is associated with various pathological conditions such as cancers and neurodegenerative diseases (Tello-Montoliu et al., 2006; Gao and Xu, 2008; Sheng and Xu, 2016). tRNAs were identified as major endogenous RNA targets of ANG in Xenopus oocytes (Saxena et al., 1992), and, subsequently, ANG-mediated cleavages of tRNA anticodon-loops were reported to generate functional tRNA half molecules in human cell lines (Fu et al., 2009; Yamasaki et al., 2009; Honda et al., 2015) (**Figure 3B**). In hormone-dependent cancer cells, ANG cleavage has been shown to occur for mature aminoacylated tRNAs, generating 5′-tRNA halves with a 5′-P and a 3′-terminal cP, and 3′-tRNA halves with a 5′-OH and a 3′-terminal amino acid (Honda et al., 2015). Although ANG homologs are only found in vertebrates (Cho and Zhang, 2006; Sheng and Xu, 2016), 5′-tRNA halves expressed in Bombyx mori cells still contain a cP (Honda et al., 2017), suggesting that, even in the absence of ANG homologs, those organisms express an unidentified cP-forming endoribonuclease to cleave tRNAs and generate tRNA halves as cP-RNAs.



Table 1 footnote: Nomenclature of ribonucleases follows the organisms examined and reported. PP11, placental protein 11; GCN4, general control protein 4; Endo, endoribonuclease; Exo, exoribonuclease.

In vertebrates, U6 snRNA mostly contains a cP (Lund and Dahlberg, 1992), whereas all other snRNAs do not (Maraia and Intine, 2002). During maturation of U6 snRNA, several uridines are added to the 3′-end of a precursor RNA by terminal uridylyl transferase 1 (TUT1). Subsequently, USB1 (also known as MPN1), a cP-forming 3′ to 5′ exoribonuclease, excises a 3′-terminal uridine stretch to generate a mature 3′-end with four or five uridines containing a cP (Shchepachev et al., 2012; Mroczek and Dziembowski, 2013; Didychuk et al., 2018) (**Figure 3C**). Although USB1 belongs to the 2H phosphoesterase superfamily and contains a cyclic phosphodiesterase (CPDase) motif (Nasr and Filipowicz, 2000; Mazumder et al., 2002; Myllykoski et al., 2013), human USB1 lacks the CPDase activity and thus generates a cP as a final form. In contrast, yeast Usb1 retains the CPDase activity (Didychuk et al., 2017), generating a 3′-P end of U6 snRNA (Lund and Dahlberg, 1992). It will be intriguing to address how and why the difference arose.

rRNA maturation requires a cP-forming endoribonuclease, Lethal in the absence of Ssd1 (Las1 in yeast; Las1L in human) (Castle et al., 2010; Schillewaert et al., 2012). In yeast, rRNA maturation starts from processing of a nascent 37S rRNA precursor into shorter precursors, including 27S rRNA (Henras et al., 2015; Gerstberger et al., 2017). The 27S rRNA is further cleaved by Las1 between the 5.8S and 25S rRNA sequences, generating 7S rRNA as a 5′-cleavage product with a cP (Gasse et al., 2015; Pillon et al., 2017). The cleavage is catalyzed by an N-terminal α-helical 'higher eukaryotes and prokaryotes nucleotide-binding' (HEPN) domain of Las1, which is defined by a conserved RΦxxxH catalytic motif (Φ: N, D, or H) (Anantharaman et al., 2013; Pillon et al., 2017). During further processing of 7S rRNA into mature 5.8S rRNA, the cP end of 7S rRNA is removed and processed by unknown mechanisms; therefore, a cP is absent in mature rRNAs.
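The HEPN catalytic motif described above (an R, then one of N, D, or H, then any three residues, then H) translates directly into a regular expression. The Python sketch below is an assumption-laden illustration: the pattern encodes only the motif as stated in the text, `find_hepn_motifs` is a hypothetical helper, and the example peptide is made up rather than taken from Las1.

```python
import re

# R, then N/D/H, then any three residues, then H. The lookahead keeps
# overlapping candidates, which a plain pattern would skip.
HEPN_MOTIF = re.compile(r"(?=(R[NDH].{3}H))")


def find_hepn_motifs(protein: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched segment) for each motif candidate."""
    return [(m.start(), m.group(1)) for m in HEPN_MOTIF.finditer(protein.upper())]


# Made-up peptide for illustration only; not a real Las1 sequence.
print(find_hepn_motifs("MKRDAAAHLLGRHSTVH"))
```

A sequence match of this kind only flags candidates; whether a matched segment sits in a HEPN fold is a structural question the pattern cannot answer.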

A cP is also formed in an mRNA splicing event that plays a crucial role in activating the unfolded protein response (UPR) pathway upon endoplasmic reticulum (ER) stress. Inositol-requiring enzyme 1 (Ire1), a cP-forming endonuclease, is associated with the ER membrane with its C-terminal domain exposed to the cytosol (Urano et al., 2000; Zhang and Kaufman, 2004). While an interaction with 'binding immunoglobulin protein' (BiP), an ER chaperone protein, retains Ire1 as an inactive monomer under normal conditions, ER stress releases BiP, allowing Ire1 to form a homodimer that harbors an active nuclease domain. The activated Ire1 is involved in splicing of HAC1 mRNA (in yeast; XBP1 mRNA in human) by cleaving both 5′- and 3′-splice sites, in which a cP is formed at the conserved 3′-terminal G of 5′-cleavage products (Sidrauski and Walter, 1997; Gonzalez et al., 1999; Shinya et al., 2011) (**Figure 3D**). From the spliced, mature form of HAC1 mRNA, a basic-region leucine-zipper transcription factor HAC1 is expressed, eventually promoting the transcription of its target genes containing UPR-responsive elements (Sidrauski and Walter, 1997; Urano et al., 2000; Zhang and Kaufman, 2004).

cP-forming endoribonucleases are further found in colicins, toxic proteins that are encoded in plasmid DNAs in some E. coli strains to invade and kill other bacteria (Cascales et al., 2007). Among over 20 colicins identified thus far, colicin E5 and D have been shown to cleave the anticodon-loop of tRNAs and form a cP (Ogawa et al., 1999, 2006) (**Figure 3E**). While the endoribonuclease activity of those colicins is masked by immunity proteins in host E. coli, colicin E5 invades other bacteria and cleaves tRNA<sup>Tyr</sup><sub>GUA</sub>, tRNA<sup>His</sup><sub>GUG</sub>, tRNA<sup>Asn</sup><sub>GUU</sub>, and tRNA<sup>Asp</sup><sub>GUC</sub> between G at nucleotide position 34 (G34) and U35 (Ogawa et al., 1999, 2006), and colicin D cleaves all four isoacceptors of tRNA<sup>Arg</sup> between A38 and G39/C39 (Tomita et al., 2000), contributing to bacterial lethality.

The E. coli genome also encodes cP-forming endoribonucleases involved in toxin-antitoxin (TA) systems. TA systems are involved in bacterial stress responses and are often considered "suicidal programs," comprising a stable toxin and an unstable antitoxin that neutralizes the cognate toxin in cells (Unterholzner et al., 2013). In the well-studied MazEF system (**Figure 3F**), the toxic endoribonuclease MazF is neutralized by the antitoxin MazE under normal conditions, but various stresses, such as nutrient limitation, DNA damage, and antibiotic exposure, degrade MazE and thereby release MazF (Jensen and Gerdes, 1995; Yarmolinsky, 1995; Engelberg-Kulka and Glaser, 1999), which cleaves whole cellular mRNAs to prevent further protein production (Zhang et al., 2003). MazF cleaves the 5′-side of an ACA motif within mRNAs and forms a cP (Zhang et al., 2003, 2005a; Vesper et al., 2011). Recent reports showed MazF-catalyzed cleavage of 16S and 23S rRNAs, and some tRNAs such as tRNA<sup>Lys</sup><sub>UUU</sub> (Vesper et al., 2011; Moll and Engelberg-Kulka, 2012; Schifano et al., 2013, 2014, 2016; Mets et al., 2017), indicating that MazF is a critical suicide factor causing perturbation of the whole cellular transcriptome. The ChpBIK system, another TA system, also uses a cP-forming enzyme as a toxin. When released from the antitoxin ChpBI under stress conditions, the toxic endoribonuclease ChpBK cleaves mRNAs at the 5′- or 3′-side of A in an ACY sequence motif to prevent further protein production (Christensen et al., 2003; Zhang et al., 2005b). The 5′-cleavage products contain a cP.
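The sequence preferences described above lend themselves to a short motif search. The Python sketch below is a simplified illustration: it assumes cleavage on the 5′-side of every ACA for MazF exactly as stated in the text, reads Y in ACY as C or U, and ignores RNA structure and protein protection; `motif_starts` and `mazf_fragments` are hypothetical helper names and the input sequences are made up.

```python
import re


def motif_starts(rna: str, pattern: str) -> list[int]:
    """Return 0-based start positions of a motif, including overlapping hits."""
    rna = rna.upper().replace("T", "U")
    return [m.start() for m in re.finditer(f"(?={pattern})", rna)]


def mazf_fragments(mrna: str) -> list[str]:
    """Split an mRNA on the 5' side of every ACA motif (simplified MazF model)."""
    mrna = mrna.upper().replace("T", "U")
    cuts = [c for c in motif_starts(mrna, "ACA") if c > 0]
    bounds = [0] + cuts + [len(mrna)]
    return [mrna[a:b] for a, b in zip(bounds, bounds[1:])]


# Made-up sequences for illustration only.
print(mazf_fragments("GGACAUUACAGG"))
# ChpBK candidate sites: ACY, where Y is a pyrimidine (C or U).
print(motif_starts("GACUACC", "AC[CU]"))
```

In this model every fragment except the last would end in a cP, since each is the 5′-product of a cleavage event.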

The genome of some E. coli isolates possesses a prr locus, encoding PrrC endonuclease (also known as anticodon nuclease: ACNase), which is considered to be another bacterial suicide program (Kaufmann, 2000). PrrC activity is usually silenced by interaction with a masking protein, but, upon T4 phage infection, it forms an ACNase complex and cleaves tRNA<sup>Lys</sup><sub>UUU</sub> between U33 and U34, which serves as a host defense to inhibit translation of T4 proteins (Amitsur et al., 1987; Morad et al., 1993). The 5′-tRNA<sup>Lys</sup><sub>UUU</sub> half resulting from the PrrC cleavage harbors a cP.

cP-forming cytotoxic endoribonucleases are also present in eukaryotes. Zymocin and PaT are toxin complexes secreted by the yeasts Kluyveromyces lactis and Pichia acaciae, respectively, to inhibit the growth of other yeasts (Lu et al., 2005, 2008; Klassen et al., 2008) (**Figure 3E**). Zymocin is composed of three subunits; two of them assist target cell binding and invasion, while the remaining γ-subunit cleaves tRNAs in targeted yeasts (Stark and Boyd, 1986). The γ-subunit of zymocin recognizes 5-methoxycarbonylmethyl-2-thiouridine (mcm<sup>5</sup>s<sup>2</sup>U), a specific modified RNA nucleotide present at nucleotide position 34 of tRNA<sup>Glu</sup><sub>UUC</sub>, tRNA<sup>Lys</sup><sub>UUU</sub>, and tRNA<sup>Gln</sup><sub>UUG</sub>, and cleaves between U34 and U35 of those tRNAs (Lu et al., 2005), leaving a cP at the ribose of mcm<sup>5</sup>s<sup>2</sup>U in the cleavage products. PaT is a heterodimer composed of PaOrf1, a cell invasion-assisting subunit, and PaOrf2, an endonuclease subunit (McCracken et al., 1994; Klassen et al., 2004). PaOrf2 recognizes 5-methoxycarbonylmethyluridine (mcm<sup>5</sup>U) and cleaves between U34 and U35 of tRNA<sup>Gln</sup><sub>UUG</sub>, leaving a cP at the ribose of mcm<sup>5</sup>U in the cleavage product (Klassen et al., 2008; Chakravarty et al., 2014).
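The anticodon-loop cleavage positions reported in this section for colicins, PrrC, zymocin, and PaT can be collected into a lookup table and applied to a tRNA string. The Python sketch below is a hedged illustration: it assumes the string index matches standard tRNA numbering (no insertions or deletions upstream of the anticodon loop), uses a placeholder sequence of 76 N characters rather than a real tRNA, and `trna_halves` is a hypothetical helper name.

```python
# Cleavage sites as (upstream position, downstream position), 1-based,
# taken from the positions stated in the text.
ANTICODON_CLEAVAGE_SITES = {
    "colicin E5": (34, 35),
    "colicin D": (38, 39),
    "PrrC": (33, 34),
    "zymocin": (34, 35),
    "PaT": (34, 35),
}


def trna_halves(trna: str, nuclease: str) -> tuple[str, str]:
    """Return (5' half ending in a cP, 3' half) for the given nuclease."""
    upstream, downstream = ANTICODON_CLEAVAGE_SITES[nuclease]
    if len(trna) < downstream:
        raise ValueError("tRNA is shorter than the cleavage position")
    # Positions 1..upstream form the 5' half; the rest forms the 3' half.
    return trna[:upstream], trna[upstream:]


placeholder_trna = "N" * 76  # placeholder, not a real tRNA sequence
for name in ANTICODON_CLEAVAGE_SITES:
    five_half, three_half = trna_halves(placeholder_trna, name)
    print(name, len(five_half), len(three_half))
```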

cP-forming endoribonucleases are further found in viruses. DNA topoisomerase, encoded in vaccinia virus, belongs to the type IB family of eukaryotic DNA topoisomerases and uniquely harbors endoribonucleolytic activity, which forms a cP end at the cleaved RNAs (Sekiguchi and Shuman, 1997; Shuman, 1998). Analogous to yeast topoisomerase I, which can remove single ribonucleotides in DNA duplexes (Kim et al., 2011), the RNA cleavage activity of the vaccinia topoisomerase might be involved in maintaining genome integrity during DNA replication. Replicative nidoviral uridylate-specific endoribonuclease (NendoU), encoded in nidovirus, is also a cP-forming endoribonuclease (Ivanov et al., 2004). While the functional role of the endoribonuclease activity in virus infection and replication is not fully understood, NendoU preferentially targets dsRNA and cleaves the 5′-side of uridine in G–U or G–U–U sequences to generate cP-RNAs (Ivanov et al., 2004).

### Ribozymes

Ribozymes are another class of cP-yielding biocatalysts. Among several distinct classes of ribozymes, a class of small, self-cleaving ribozymes is known to generate cPs (Scott and Klug, 1996; Doherty and Doudna, 2001; Serganov and Patel, 2007). Small self-cleaving ribozymes are widely found in bacterial, plant, and mammalian genomes, and are involved in the control of gene expression (Shih and Been, 2002; Serganov and Patel, 2007). Out of 11 identified ribozymes in this class, 10 have been shown to form cP ends as a result of their cleavage of RNAs (Saville and Collins, 1990; Scott and Klug, 1996; Winkler et al., 2004; Salehi-Ashtiani et al., 2006; Roth et al., 2014; Harris et al., 2015; Li et al., 2015; Weinberg et al., 2015). In the case of the hepatitis delta virus (HDV) ribozyme, the 85-nt minimal self-cleavage domain cleaves between U−1 and G1 (Shih and Been, 2002; Puerta-Fernandez et al., 2003). While C75 is suggested to act as a general acid catalyst by donating a proton from the N3 of its pyrimidine ring to the leaving group, several different molecules have been proposed as potential base catalysts: water or hydroxide from the solvent, water molecules coordinated to Mg<sup>2+</sup>, or the 2′-OH of G27 positioned closely adjacent to the catalytic site (Ward et al., 2014).

### Enzymes That Act Directly on the 3′-End of RNAs

Two protein catalysts have been reported to form a cP by a molecular mechanism distinct from transesterification during RNA cleavage. As described above, RtcA can catalyze de novo cP formation by directly acting on the 3′-end of RNAs (Genschik et al., 1997, 1998; Billy et al., 2000; Filipowicz, 2016). Although the endogenous RNA target of RtcA is unknown, in the E. coli genome, rtcA and an RNA ligase gene, rtcB, form an rtcBA operon, which is implicated in an RNA repair pathway (Das and Shuman, 2013a; Burroughs and Aravind, 2016). The archaeal thermophilic RNA ligase from Methanobacterium thermoautotrophicum, MthRnl, is the other enzyme that can catalyze de novo cP formation (Zhelkovsky and McReynolds, 2014; Yoshinari et al., 2017). While MthRnl can ligate 5′-P and 3′-OH ends of RNAs (Torchia et al., 2008), when substrate RNAs contain a 3′-P, MthRnl converts it to a 3′-cP by a mechanism similar to that of RtcA (Zhelkovsky and McReynolds, 2014). In addition, MthRnl possesses a 3′-deadenylation activity that can remove a 3′-terminal adenosine with an OH end and form a cP (Yoshinari et al., 2017). The endogenous RNA target of MthRnl is unknown.

### BIOLOGICAL SIGNIFICANCE OF cP FORMATION AND cP-RNA EXPRESSION

What is the significance of cP formation in RNAs? It has been shown that cP formation in U6 snRNA regulates RNA interaction with protein factors. While nascent U6 snRNA containing a 3′-OH end is bound by La protein (Maraia and Intine, 2002; Maraia and Bayfield, 2006), cP formation in mature U6 snRNA promotes interaction with Lsm2–8 complexes (Khusial et al., 2005; Licht et al., 2008) (**Figure 3C**). The affinity of cP-containing RNA for Lsm2–8 is higher than that of 3′-OH-containing RNA, and the La/3′-OH and Lsm2–8/cP interactions are mutually exclusive: even when both La and Lsm2–8 are present in the reaction solution, RNA with a 3′-OH binds only to La and RNA with a cP binds only to Lsm2–8 (Licht et al., 2008). cP formation is, therefore, a critical factor for forming functional spliceosome complexes with Lsm2–8 (Didychuk et al., 2018). Although this is the only proven example of cP-regulated formation of an RNA-protein complex, cP formation in other cP-RNAs may also modulate RNA interaction with protein factors.

RNA ligation reactions can depend on a cP in a substrate RNA. In tRNA splicing, Sen2-mediated cleavage forms a 3′-terminal cP in 5′-exons, which is then ligated to the 5′-OH end of 3′-exons by tRNA ligase (Popow et al., 2012; Yoshihisa, 2014) (**Figure 3A**). In Arabidopsis thaliana, the tRNA ligase AtRNL is able to ligate cP ends to 3′-exons but cannot use 3′-P ends as its ligation substrate (Schutz et al., 2010; Tanaka et al., 2011a). This cP-specific ligation activity was also observed in wheat germ extract (Konarska et al., 1982). In this plant ligation process, cP ends of 5′-exons are first converted to 2′-P and 3′-OH. The 5′-OH ends of 3′-exons are phosphorylated, followed by ligation to the 3′-OH of 5′-exons (Popow et al., 2012; Yoshihisa, 2014). Other organisms employ distinct molecular mechanisms in ligation of cP-containing 5′-exons to 3′-exons in tRNA splicing (Popow et al., 2012; Yoshihisa, 2014). In humans, RtcB was identified as a tRNA ligase (Popow et al., 2011). Experiments using lysates and RtcB immunoprecipitates from HeLa cells suggest that human RtcB prefers cP and 5′-OH for ligation (Filipowicz et al., 1983; Popow et al., 2011). However, whether the substrate specificity extends to RNAs containing 3′-P and 5′-OH still awaits analysis using a recombinant human tRNA ligase complex. In mammals, RtcB is involved in splicing of XBP1 mRNA in the UPR pathway (Filipowicz, 2014; Jurkin et al., 2014; Lu et al., 2014) (**Figure 3D**). E. coli RtcB is also able to ligate cP and 5′-OH, as well as 3′-P and 5′-OH (Tanaka and Shuman, 2011; Tanaka et al., 2011b). In the ligation, cP is first converted to 3′-P, then ligated to 5′-OH (Tanaka et al., 2011a). E. coli RtcB can catalyze the religation of 16S rRNA at the site cleaved by stress-induced MazF activity, which regenerates full-length 16S rRNA and contributes to recovery from the stress conditions (Temmel et al., 2017).

Besides influencing the interaction and activity of proteins, cP formation may play a role in stabilizing RNA molecules by protecting them from degradation. Ehrlich exoribonuclease, extracted from Ehrlich ascites cells and various mouse tissues and later defined as exoribonuclease II, was shown to degrade single-stranded RNAs with 3′-OH ends more rapidly than those with cP and 3′-P ends (Sporn et al., 1969), suggesting that cP formation helps RNA molecules to persist stably in cells. In contrast, because RNAs with 3′-P ends are degraded more rapidly by the exosome complex exoribonuclease Rrp44 than those with 3′-OH ends (Zinder et al., 2016), cP formation could also negatively impact the stability of cP-RNAs. Thus, a cP structure might be able to regulate RNA stability in both directions by affecting the degradation activity of nucleases or modulating RNA-protein interactions. Further study is required to shed more light on the potential function of cP formation in RNA stability.

The above-described advantages of cP formation may, in turn, suggest the biological significance of cellular cP-RNA expression. While the functional significance of U6 snRNA, which is itself a cP-RNA, and of tRNAs and rRNAs, whose biogenesis proceeds through cP-RNA intermediates, has been apparent for a long time, previously uncharacterized cP-RNAs are now being demonstrated to be functional molecules that play important roles in various biological processes. Representative examples of such functional cP-RNAs include the 5′-tRNA half molecules. In mammalian cells, various stress stimuli trigger ANG-mediated tRNA cleavage to produce functional tRNA halves, termed tRNA-derived stress-induced RNAs (tiRNAs) (Fu et al., 2009; Yamasaki et al., 2009) (**Figure 3B**). 5′-tiRNAs, corresponding to 5′-tRNA halves, have been shown to be functional molecules that can promote the formation of stress granules and regulate translation via a pathway involving YB-1 protein (Emara et al., 2010; Ivanov et al., 2011, 2014; Lyons et al., 2016).

ANG-mediated tRNA cleavage is also promoted by sex hormone signaling pathways in hormone-dependent breast and prostate cancer cells, generating a distinct class of tRNA halves termed sex hormone-dependent tRNA-derived RNAs (SHOT-RNAs) (Honda et al., 2015; Honda and Kirino, 2016) (**Figure 3B**). 5′-SHOT-RNAs, which are cP-RNAs, promote cell proliferation. The expression levels of SHOT-RNAs in tissues and serum of prostate cancer patients have been shown to be associated with pathological and prognostic parameters, suggesting the use of SHOT-RNAs as potential diagnostic biomarkers (Zhao et al., 2018). In terms of diseases, many different ANG gene mutations have been identified in patients with amyotrophic lateral sclerosis (ALS) and Parkinson's disease (Tello-Montoliu et al., 2006; Gao and Xu, 2008), implying that ANG-catalyzed production of tRNA halves could be involved in the pathogenesis of these neurodegenerative disorders (Thiyagarajan et al., 2012). Indeed, accumulation of tRNA halves contributes to the pathogenesis of a syndromic form of intellectual disability and Dubowitz-like syndrome (Blanco et al., 2014).

cP-RNAs can also function as direct precursors for shorter functional RNAs. In B. mori germ cells, some abundant species of Piwi-interacting RNAs (piRNAs), a germline-specific class of small regulatory RNAs, are produced directly from cP-containing 5′-tRNA halves (Honda et al., 2017) (**Figure 3G**). Although many microRNAs (miRNAs) are derived from tRNAs (Shigematsu and Kirino, 2015; Telonis et al., 2015), whether the tRNA-derived miRNAs are also generated from cP-containing tRNA halves has not been examined yet. Further research may reveal more evidence of the use of cP-RNAs as direct precursors for functional RNAs.

### SPECIFIC SEQUENCING AND QUANTIFICATION OF cP-RNAs

To further expand cP-RNA research, it is imperative to capture cP-RNA expression profiles accurately, which is not possible using standard RNA-seq methods. Specific cP-RNA sequencing can be achieved by cP-RNA-seq (Honda et al., 2015, 2016), which takes advantage of the distinct properties of two widely used enzymes, T4 polynucleotide kinase (T4 PNK) and a phosphatase such as calf intestinal phosphatase (CIP). T4 PNK has a 3′-terminal phosphatase activity that removes both a P and a cP from the 3′-end of RNAs (Amitsur et al., 1987; Das and Shuman, 2013b), whereas CIP removes only a P but not a cP. In cP-RNA-seq, RNAs are first treated with CIP to remove a P, followed by periodate oxidation. Because the oxidation cleaves the 3′-end of all RNAs other than cP-RNAs, subsequent cP removal, adapter ligation, and cDNA amplification steps are exclusively applied to cP-RNAs, leading to selective amplification and sequencing of cP-RNAs (Honda et al., 2015, 2016) (**Figure 4A**). An advantage of cP-RNA-seq is that it requires only commercially available enzymes and reagents. A limitation is that RNAs lacking a 2′,3′-diol structure of ribose, such as plant miRNAs and animal piRNAs that contain a 2′-O-methyl ribose modification (Yang et al., 2006; Kirino and Mourelatos, 2007; Ohara et al., 2007), can also be amplified despite the absence of a cP, because those RNAs are resistant to periodate oxidation. This point should be kept in mind, especially when 20–30-nt small RNAs are used for the method. Thus far, cP-RNA-seq has been applied only to two cell lines, human BT-474 breast cancer cells and B. mori BmN4 cells (Honda et al., 2015, 2017). Although the high mapping ratio of the obtained reads to tRNA sequences showed the specificity and credibility of the method, both studies focused narrowly on the short RNA fraction containing tRNA halves. Further application of the method to broader RNA populations will enable more global identification of cP-RNA species.
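The selection logic of cP-RNA-seq can be illustrated with a minimal sketch that tracks only the 3′-end state of an RNA through each treatment. The state and function names below are hypothetical and chosen for illustration; the model ignores reaction efficiencies and 5′-end chemistry, and it is not the authors' analysis code.

```python
from enum import Enum


class End(Enum):
    """Simplified 3'-end states (hypothetical labels for illustration)."""
    OH = "3'-OH with 2',3'-diol"
    P = "3'-P"
    CP = "2',3'-cyclic phosphate"
    OME = "2'-O-methyl, 3'-OH"
    OXIDIZED = "periodate-oxidized"


def cip(end: End) -> End:
    # CIP removes a 3'-P but leaves a cP untouched.
    return End.OH if end is End.P else end


def periodate(end: End) -> End:
    # Periodate attacks only a free 2',3'-diol; cP and 2'-O-methyl ends resist.
    return End.OXIDIZED if end is End.OH else end


def t4_pnk(end: End) -> End:
    # The 3'-phosphatase activity of T4 PNK removes both P and cP.
    return End.OH if end in (End.P, End.CP) else end


def ligatable(end: End) -> bool:
    # 3'-adapter ligation needs a 3'-OH; oxidized ends cannot be ligated.
    return end in (End.OH, End.OME)


def cp_rna_seq_captures(end: End) -> bool:
    """Apply CIP, periodate, and T4 PNK in order, then test for ligation."""
    return ligatable(t4_pnk(periodate(cip(end))))
```

In this model, only RNAs that start with a cP pass every step, with the exception of 2′-O-methylated RNAs, which reproduces the limitation noted above for plant miRNAs and animal piRNAs.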

As an alternative method, Arabidopsis tRNA ligase AtRNL can be used for specific cP-RNA sequencing (Schutz et al., 2010). Because its ligation activity is specific to a cP and does not extend to a 3′-P or 3′-OH, AtRNL selectively ligates a 3′-adapter to cP-RNAs among all RNA species. After the ligation, for efficient reverse transcription, the 2′-P formed at the substrate–adapter junction should be removed by 2′-phosphotransferase treatment. Therefore, two specific recombinant proteins, AtRNL and Saccharomyces cerevisiae 2′-phosphotransferase Tpt1, were purified and used in the method (Schutz et al., 2010). Application of the method to human brain total RNA identified numerous reads of cP-RNAs, including U6 snRNA. The 3′-ends of ∼90% of the U6 snRNA reads were identified as a consistent, mature form, validating the specificity and credibility of the method. Given its ligation activity for cP-RNAs, RtcB can also be used for cP-RNA sequencing (Donovan et al., 2017). Because RtcB can ligate 3′-P ends as well as cP ends, a phosphatase treatment to remove 3′-P prior to RtcB-mediated 3′-adapter ligation would be required for specific capture of cP-RNAs.
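The difference between the two ligase-based approaches can be sketched as a comparison of substrate sets. This is a simplified illustration under the specificities stated above; the constant and function names are hypothetical, and the phosphatase is assumed to behave like CIP (removing a 3′-P but not a cP).

```python
# 3'-end types each ligase accepts, as described in the text (simplified).
ATRNL_SUBSTRATES = {"cP"}
RTCB_SUBSTRATES = {"cP", "3'-P"}


def phosphatase(end: str) -> str:
    # Assumed CIP-like behavior: 3'-P becomes 3'-OH, cP is left intact.
    return "3'-OH" if end == "3'-P" else end


def captured(end: str, ligase_substrates: set, pretreat: bool = False) -> bool:
    """Return True if an RNA with this 3'-end would receive a 3'-adapter."""
    if pretreat:
        end = phosphatase(end)
    return end in ligase_substrates
```

Without pre-treatment, the RtcB model also captures 3′-P RNAs; with the phosphatase step it captures the same set as the AtRNL model.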

After cP-RNA sequencing, amplification and quantification of the representative identified cP-RNA species are necessary to validate their expression and analyze whether a cP end is the major 3′-end form of the identified sequences. Standard RT-qPCR, amplifying internal sequences of targeted RNAs, is inappropriate for specific amplification of cP-RNAs because it cannot distinguish between cP-RNAs and RNAs with other terminal states. To specifically analyze cP-RNAs, RNAs treated with T4 PNK or CIP can be subjected to 3′-adapter ligation, followed by TaqMan RT-qPCR targeting 3′-adapter-RNA ligation products (Honda and Kirino, 2015; Honda et al., 2015, 2017; Shigematsu et al., 2018; Zhao et al., 2018) (**Figure 4B**). The dependency of amplification signals on RNA treatment with T4 PNK, but not with CIP, allows researchers to confirm that the detected signals are derived from cP-RNAs because they should be ligated to a 3′-adapter only after cP removal by T4 PNK treatment. As an alternative method for analyzing cP ends, T4 PNK- or CIP-treated RNAs can be subjected to a poly(A) polymerase reaction, which is able to add poly(A) tails to 3′-OH ends, but not to cP ends (Zaug et al., 1996). Moreover, northern blot can be used to observe slight differences in band mobility between cP-RNAs and RNAs with other terminal states (Honda et al., 2015, 2016).
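The treatment-dependence argument can be written as a small decision rule. The sketch below assumes each condition has already been reduced to a yes/no call on whether an adapter-ligation RT-qPCR signal was detected; real data are quantitative and mixed populations are common, so the function name and the boolean simplification are illustrative assumptions only.

```python
def infer_three_prime_end(untreated: bool, cip: bool, t4_pnk: bool) -> str:
    """Infer the dominant 3'-end from which treatments permit adapter ligation.

    Each argument states whether a signal was detected for that condition.
    """
    if untreated:
        # Ligation without any end repair implies a pre-existing 3'-OH.
        return "3'-OH"
    if cip and t4_pnk:
        # Both enzymes remove a 3'-P, so either treatment rescues ligation.
        return "3'-P"
    if t4_pnk and not cip:
        # Only T4 PNK removes a cP, so the signal depends on T4 PNK alone.
        return "cP"
    return "undetermined"
```

A signal that appears only after T4 PNK treatment is classified as cP, matching the criterion described above.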

### FUTURE PERSPECTIVES


Despite the findings described in this review, current information regarding cellular expression profiles of cP-RNAs is very limited and fragmented. Although the increasing accumulation of RNA-seq data has accelerated comparative analyses of transcriptomes and has therefore been critical to identifying significant RNA species in biological phenomena and diseases, the "invisibility" of cP-RNA expression in RNA-seq data has kept cP-RNA research at an initial stage. The immediate future focus should be on capturing the comprehensive repertoire of cP-RNAs expressed in different tissues and cells by using the specific sequencing methods described above. Given that cP-RNAs are expressed as functional molecules, capturing the entire cP-RNA repertoire would broaden the catalog of functional non-coding RNAs and could reveal significant biological events that have eluded standard RNA-seq. Besides cP-RNA expression, the molecular mechanisms behind cP-RNA biogenesis and function remain elusive. Presumably, not all cP-RNA-producing enzymes have been identified and characterized to date. Because cP-RNA-generating enzymes cannot be identified from their amino acid sequences and protein motifs alone, discovering novel cP-RNA-generating enzymes will rely on detailed structural and biochemical characterization of each enzyme. Given the already-proven biological roles of cP formation and cP-RNA expression, it is not surprising that cP-RNAs are involved in a wide range of biological processes. Considering the "hidden" nature of cP-RNAs in conventional RNA-seq data, further research efforts to characterize cP-RNAs will likely reveal substantially greater biological significance of cP-RNAs, which will advance our understanding of the expanding realm of non-coding RNA molecules.

### AUTHOR CONTRIBUTIONS

MS and YK conceptualized the theme and wrote the review with substantial help from TK in compiling reference papers. All authors reviewed and approved the final manuscript.

### FUNDING

Work in the lab on this topic has been supported by the National Institutes of Health Grant (GM106047 and AI130496 to YK), American Cancer Society Research Scholar Grant (RSG-17-059-01-RMC to YK), the W. W. Smith Charitable Trust Grant (C1608 to YK), and a Japan Society for the Promotion of Science Postdoctoral Fellowship for Research Abroad (to MS).






**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Shigematsu, Kawamura and Kirino. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Regulatory Factors for tRNA Modifications in Extreme-Thermophilic Bacterium Thermus thermophilus

### Hiroyuki Hori\*

Department of Materials Sciences and Biotechnology, Graduate School of Science and Engineering, Ehime University, Matsuyama, Japan

Thermus thermophilus is an extreme-thermophilic bacterium that can grow at a wide range of temperatures (50–83°C). To enable T. thermophilus to grow at high temperatures, several biomolecules including tRNA and tRNA modification enzymes show extreme heat-resistance. Therefore, the modified nucleosides in tRNA from T. thermophilus have been studied mainly from the viewpoint of tRNA stabilization at high temperatures. Such studies have shown that several modifications stabilize the structure of tRNA and are essential for survival of the organism at high temperatures. Together with tRNA modification enzymes, the modified nucleosides form a network that regulates the extent of different tRNA modifications at various temperatures. In this review, I describe this network, as well as the tRNA recognition mechanism of individual tRNA modification enzymes. Furthermore, I summarize the roles of other tRNA stabilization factors such as polyamines and metal ions.

#### Edited by:

Akio Kanai, Institute for Advanced Biosciences, Keio University, Japan

#### Reviewed by:

Yoshitaka Bessho, Academia Sinica, Taiwan

Toshiaki Fukui, Tokyo Institute of Technology, Japan

> \*Correspondence: Hiroyuki Hori

hori.hiroyuki.my@ehime-u.ac.jp; hori@eng.ehime-u.ac.jp

#### Specialty section:

This article was submitted to RNA, a section of the journal Frontiers in Genetics

Received: 26 December 2018 Accepted: 26 February 2019 Published: 08 March 2019

#### Citation:

Hori H (2019) Regulatory Factors for tRNA Modifications in Extreme-Thermophilic Bacterium Thermus thermophilus. Front. Genet. 10:204. doi: 10.3389/fgene.2019.00204

Keywords: methylation, RNA modification, thermophile, Thermus thermophilus, tRNA

## INTRODUCTION

Thermus thermophilus is an extreme-thermophilic bacterium isolated from Mine Hot Spring in Japan that can grow at a wide range of temperatures (50–83°C) (Oshima and Imahori, 1974). This bacterium can grow under aerobic conditions and possesses only about 2200 genes. Therefore, T. thermophilus strain HB8 was selected as a model organism in the Structural-Biological Whole Cell Project in Japan (Yokoyama et al., 2000). A method for preparing gene disruptant strains of T. thermophilus has been established (Hoseki et al., 1999; Hashimoto et al., 2001). Furthermore, both expression vectors for T. thermophilus proteins in Escherichia coli cells and gene disruption vectors are available from RIKEN Bio Resource Center<sup>1</sup>. Today, T. thermophilus is one of the most studied thermophiles.

Transfer RNA is an adaptor molecule required for the conversion of genetic information encoded by nucleic acids into amino acid sequences in proteins. To date, more than 100 modified nucleosides have been found in tRNA from the three domains of life: bacteria, archaea, and eukaryotes (Boccaletto et al., 2018). Modified nucleosides in tRNA primarily function in various steps of protein synthesis such as stabilization of tRNA structure (Motorin and Helm, 2010; Lorenz et al., 2017), codon-anticodon interaction (Takai and Yokoyama, 2003; Suzuki and Numata, 2014; Agris et al., 2018), prevention of frame-shift errors (Björk et al., 1989; Farabaugh and Björk, 1999; Urbonavicius et al., 2001), recognition by aminoacyl-tRNA synthetases (Muramatsu et al., 1988; Perret et al., 1990; Ikeuchi et al., 2010; Mandal et al., 2010), and recognition by translation factors (Aström and Byström, 1994). In other words, living organisms cannot synthesize proteins efficiently or correctly without tRNA modifications.

<sup>1</sup>http://dna.brc.riken.jp/ja/thermus

**Figure 1** shows the sequence of tRNA<sup>Phe</sup> from T. thermophilus (Grawunder et al., 1992; Tomikawa et al., 2010). Eleven kinds of modified nucleosides, indicated in red in **Figure 1**, have been found at ten positions in this tRNA (**Figure 2**); the abbreviations of modified nucleosides, related tRNA modification enzymes, and references (Watanabe et al., 1974; Caillet and Droogmans, 1988; Kammen et al., 1988; Nurse et al., 1995; Persson et al., 1997; Mueller et al., 1998; Esberg et al., 1999; Kambampati and Lauhon, 1999; Bishop et al., 2002; Hori et al., 2002; De Bie et al., 2003; Droogmans et al., 2003; Pierrel et al., 2004; Urbonavicius et al., 2005; Shigi et al., 2006a, 2016; Tomikawa et al., 2010; Ishida et al., 2011; Roovers et al., 2012; Shigi, 2012; Kusuba et al., 2015; Takuma et al., 2015; Yamagami et al., 2016; Bou-Nader et al., 2018) are summarized in **Table 1**. These modifications are conferred post-transcriptionally by tRNA modification enzymes, which generally act at only one position in tRNA. Thus, specific tRNA modification enzymes exist for specific positions in specific tRNA species, even though they may synthesize the same type of modified nucleoside; for example, 2′-O-methylcytidine at position 32 (Cm32) in E. coli tRNA<sup>Met</sup> is conferred by TrmJ (Purta et al., 2006), whereas Cm34 in tRNA<sup>Leu</sup> is synthesized by TrmL (Benítez-Páez et al., 2010). Moreover, in several cases, multiple tRNA modification enzymes and related proteins are required for synthesis of one modified nucleoside; for example, the m<sup>5</sup>s<sup>2</sup>U54 modification in T. thermophilus tRNA<sup>Phe</sup> requires methylation by TrmFO (Urbonavicius et al., 2005) and thiolation by TtuA, TtuB, TtuC, TtuD, and IscS (Shigi et al., 2006a, 2008, 2016; Shigi, 2012) (**Table 1**). In addition, the modifications in the anticodon-loop often require multiple enzymes and many substrates. For example, the synthesis of 5-methylaminomethyl-2-thiouridine at position 34 (mnm<sup>5</sup>s<sup>2</sup>U34) in E. coli tRNAs requires ten proteins (MnmA, MnmC, MnmE, MnmG, TusA, TusB, TusC, TusD, TusE, and IscS) and eight substrates (S-adenosyl-L-methionine, NH<sub>4</sub><sup>+</sup>, ATP, GTP, 5,10-methylenetetrahydrofolate, NADH, glycine, and cysteine) (Armengod et al., 2014). Therefore, living organisms need more tRNA modification enzymes and related proteins than the number of modified nucleoside types found in tRNA.

Given that T. thermophilus grows at high temperatures, its biomolecules including tRNA and tRNA modification enzymes show extreme heat-resistance. As a result, modified nucleosides in tRNA from T. thermophilus have been studied from the viewpoint of stabilization of tRNA structure at high temperatures. In this review, I focus on the thermal adaptation system of tRNA modifications in T. thermophilus, which is regulated by many factors.

### THE m<sup>5</sup>s<sup>2</sup>U54 MODIFICATION IN T. thermophilus tRNA IS ESSENTIAL FOR PROTEIN SYNTHESIS AT HIGH TEMPERATURES

#### Discovery of m<sup>5</sup>s<sup>2</sup>U54 in tRNA From T. thermophilus

The m<sup>5</sup>s<sup>2</sup>U modification is a typical thermophile-specific modified nucleoside in tRNA (Hori et al., 2018). It was originally identified in RNase T1-digested RNA fragments derived from a tRNA mixture from T. thermophilus, and the sequence of the corresponding fragment strongly suggested that a portion of m<sup>5</sup>U at position 54 is replaced by m<sup>5</sup>s<sup>2</sup>U (Watanabe et al., 1974). Subsequently, it was found that the content of this modified nucleoside increases with the culture temperature and that the melting temperature of tRNA rises as the extent of m<sup>5</sup>s<sup>2</sup>U in tRNA increases (Watanabe et al., 1976). The presence of m<sup>5</sup>s<sup>2</sup>U54 was first confirmed in tRNA<sup>Met</sup><sub>f1</sub> and tRNA<sup>Met</sup><sub>f2</sub> (Watanabe et al., 1979), and then tRNA containing m<sup>5</sup>s<sup>2</sup>U54 was separated from tRNA containing m<sup>5</sup>U54 (Watanabe et al., 1983). The m<sup>5</sup>s<sup>2</sup>U54 modification has been identified in all T. thermophilus tRNA species sequenced so far [tRNA<sup>Ile</sup><sub>1</sub> (Horie et al., 1985), tRNA<sup>Asp</sup> (Keith et al., 1993) and tRNA<sup>Phe</sup> (Grawunder et al., 1992; Tomikawa et al., 2010)].

#### Structural Effect of m<sup>5</sup>s<sup>2</sup>U54 on tRNA

The m<sup>5</sup>s<sup>2</sup>U54 modification forms a reverse-Hoogsteen base pair with A58 (or m<sup>1</sup>A58), and this base pair stacks with the G53-C61 base pair in the T-stem (Yokoyama et al., 1987). The hydrophobic interaction between the m<sup>5</sup>s<sup>2</sup>U54-A58 (or m<sup>1</sup>A58) and G53-C61 base pairs stabilizes the tertiary G18-ψ55 and G19-C56 base pairs between the T- and D-arms. Therefore, m<sup>5</sup>s<sup>2</sup>U54 contributes to stabilization of the L-shaped tRNA structure, and the melting temperature of tRNA is increased by more than 3°C by the presence of m<sup>5</sup>s<sup>2</sup>U54 (Watanabe et al., 1976; Davanloo et al., 1979). Furthermore, the melting temperature of a tRNA mixture is maintained above 85°C by the 2-thio modification in m<sup>5</sup>s<sup>2</sup>U54 (Shigi et al., 2006b), enabling T. thermophilus to grow at temperatures of 50–83°C.

TABLE 1 | Modified nucleosides in tRNA<sup>Phe</sup> from T. thermophilus.

#### Temperature-Dependent Regulation of m<sup>5</sup>s<sup>2</sup>U54 Content in tRNA<sup>Phe</sup> and Protein Synthesis

**Figure 3**, which is based on a combination of previous experimental results, shows the percentage of different modified nucleosides (m<sup>5</sup>s<sup>2</sup>U54, Gm18, m<sup>7</sup>G46, and m<sup>1</sup>A58) in tRNA<sup>Phe</sup> purified from T. thermophilus cells cultured at 50, 60, 70, and 80°C (Tomikawa et al., 2010; Ishida et al., 2011; Yamagami et al., 2016). As shown in **Figure 3**, the extent of m<sup>5</sup>s<sup>2</sup>U54 in tRNA<sup>Phe</sup> increases with increasing culture temperature. The balance between m<sup>5</sup>s<sup>2</sup>U54 and m<sup>5</sup>U54 regulates the rigidity (flexibility) of tRNA over a wide range of temperatures (50–83°C).

The presence of m<sup>5</sup>s<sup>2</sup>U54 in tRNA is required for efficient protein synthesis at high temperatures. A previous study measured the activity of poly(U)-dependent polyphenylalanine synthesis at several temperatures using various fractions (tRNA<sup>Phe</sup>, ribosomes, and the supernatant fraction from 100,000 × g centrifugation) prepared from T. thermophilus cells cultured at 50 and 80°C (Yokoyama et al., 1987). Polyphenylalanine was effectively synthesized at 80°C only when tRNA<sup>Phe</sup> from cells cultured at 80°C was used. As shown in **Figure 3**, the proportion of m<sup>5</sup>s<sup>2</sup>U54 in tRNA<sup>Phe</sup> from cells cultured at 80°C is higher than that in tRNA<sup>Phe</sup> from cells cultured at 50°C. The melting temperature of a tRNA mixture from cells cultured at 70°C is 79.7°C in the presence of 50 mM Tris–HCl (pH 7.5), 5 mM MgCl<sub>2</sub> and 100 mM NaCl (Tomikawa et al., 2010); thus, the structure of tRNA<sup>Phe</sup> from cells cultured at 50°C seems to be looser than that from cells cultured at 80°C. Indeed, disruption of ttuA, which encodes a component of the sulfur-transfer complex for m<sup>5</sup>s<sup>2</sup>U54 formation, results in a growth defect of T. thermophilus at 80°C (Shigi et al., 2006a). Thus, the m<sup>5</sup>s<sup>2</sup>U54 modification is essential for effective protein synthesis in T. thermophilus at high temperatures.

FIGURE 3 | Extents of m<sup>7</sup>G46, m<sup>1</sup>A58, Gm18, and m<sup>5</sup>s<sup>2</sup>U54 modifications in tRNA<sup>Phe</sup> from T. thermophilus. The extents of the m<sup>7</sup>G46, m<sup>1</sup>A58, and Gm18 modifications were measured by a methylation assay with TrmB, TrmI, and TrmH, respectively. The extent of the m<sup>5</sup>s<sup>2</sup>U54 modification was estimated from the peak areas of m<sup>5</sup>U54 and m<sup>5</sup>s<sup>2</sup>U54 on HPLC analysis. This figure is adapted from Figure 3 in the chapter "Regulation of Protein Synthesis via the Network Between Modified Nucleotides in tRNA and tRNA Modification Enzymes in T. thermophilus, a Thermophilic Eubacterium" of the book "Modified Nucleic Acids in Biology and Medicine," Springer Nature 2016, with permission (4517441319562) from the publisher.
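The peak-area estimate mentioned in the legend can be sketched as a simple ratio. The function below is an illustrative assumption rather than the published procedure: it treats the two nucleosides as having equal detector response per mole, which may not hold in practice, and the function name and example values are hypothetical.

```python
def thiolation_extent(area_m5u: float, area_m5s2u: float) -> float:
    """Fraction of position 54 carrying the 2-thio modification.

    Assumes equal molar detector response for m5U and m5s2U (a simplification).
    """
    if area_m5u < 0 or area_m5s2u < 0:
        raise ValueError("peak areas must be non-negative")
    total = area_m5u + area_m5s2u
    if total == 0:
        raise ValueError("at least one peak area must be positive")
    return area_m5s2u / total


# Hypothetical peak areas in arbitrary units, not measured data.
extent = thiolation_extent(area_m5u=25.0, area_m5s2u=75.0)
```

If the two nucleosides absorb differently at the detection wavelength, each area would first need to be divided by its response factor.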

#### m<sup>5</sup>s<sup>2</sup>U54 in tRNA From Other Thermophiles

In addition to T. thermophilus, m<sup>5</sup>s<sup>2</sup>U has been identified among the modified nucleosides in unfractionated tRNA from Thermotoga maritima (Edmonds et al., 1991) and at position 54 in tRNA<sup>Cys</sup> from Aquifex aeolicus (Awai et al., 2009). Furthermore, the m<sup>5</sup>s<sup>2</sup>U nucleoside has been identified among the modified nucleosides in unfractionated tRNA from some hyper-thermophilic archaea such as Thermococcus species (Edmonds et al., 1991) and Pyrococcus furiosus (Kowalak et al., 1994). Therefore, it is considered that these hyper-thermophiles also possess the m<sup>5</sup>s<sup>2</sup>U54 modification in tRNA.

#### Biosynthetic Pathway of m<sup>5</sup>s<sup>2</sup>U54 in Eubacteria and Archaea

Biosynthesis of m<sup>5</sup>s<sup>2</sup>U54 is accomplished in two steps: methylation of the C5 atom and 2-thiolation. Although these steps occur independently at U54 in tRNA (Shigi et al., 2006a; Yamagami et al., 2016), the m<sup>5</sup>U54 modification in tRNA is almost fully formed in living cells even when T. thermophilus is cultured under nutrient-poor conditions (Yamagami et al., 2018). To date, therefore, an s<sup>2</sup>U54 modification has not been observed in tRNA from the T. thermophilus wild-type strain.

Formation of m<sup>5</sup>U54 is catalyzed by different tRNA methyltransferases in bacteria and archaea. A folate/FAD-dependent tRNA methyltransferase (TrmFO) catalyzes the methylation using 5,10-methylenetetrahydrofolate as a methyl donor in bacteria (Urbonavicius et al., 2005; Nishimasu et al., 2009; Yamagami et al., 2012; Hamdane et al., 2016), whereas an S-adenosyl-L-methionine (AdoMet)-dependent tRNA methyltransferase [RumA (or TrmA)-like enzyme] works in archaea (Urbonavicius et al., 2008). Notably, E. coli TrmA is an AdoMet-dependent tRNA (m<sup>5</sup>U54) methyltransferase (Ny and Björk, 1980), whereas E. coli RumA is an AdoMet-dependent 23S rRNA (m<sup>5</sup>U1939) methyltransferase (Agarwalla et al., 2002; Madsen et al., 2003). Although these RNA m<sup>5</sup>U methyltransferases belong to the same cluster of orthologous proteins (group COG2265) (Urbonavicius et al., 2008), archaeal tRNA (m<sup>5</sup>U54) methyltransferases structurally resemble RumA rather than TrmA (Walbott et al., 2008). Therefore, archaeal tRNA (m<sup>5</sup>U54) methyltransferases for m<sup>5</sup>s<sup>2</sup>U54 formation might have evolved from a RumA-type rRNA methyltransferase.

The 2-thiolation in m<sup>5</sup>s<sup>2</sup>U54 of T. thermophilus is conferred by multiple proteins, namely TtuA, TtuB, TtuC, TtuD and IscS (or SufS) (Shigi et al., 2006a, 2008, 2016; Shigi, 2012; Chen et al., 2017). In addition, the mechanism of the sulfur-transfer reaction carried out by TtuA from T. maritima has recently been proposed based on crystal structures of the enzyme (Arragain et al., 2017). The protein factors involved in 2-thiolation of m<sup>5</sup>s<sup>2</sup>U54 in archaea have not been confirmed experimentally.

### tRNA MODIFICATION ENZYMES RECOGNIZE THE LOCAL STRUCTURE(S) IN tRNA

At the beginning of this century, the mechanisms regulating the extent of modified nucleosides in tRNA from T. thermophilus were unknown. At the start of our studies, we assumed transcriptional and/or translational regulation of the amounts of tRNA modification enzymes; however, we noticed that this regulation might be explained by the substrate tRNA recognition mechanisms of the tRNA modification enzymes. In general, tRNA modification enzymes recognize the local structure in tRNA. **Figure 4** shows the minimum substrate or positive determinants for different tRNA modification enzymes, which I describe in more detail below.

#### m<sup>5</sup>U54 Formation by TrmFO

TrmFO can methylate a micro-helix RNA, which mimics the T-arm structure (Yamagami et al., 2012; **Figure 4A**). The positive determinants for TrmFO are the stem-loop structure, the G53-C61 base pair, and the U54U55C56 sequence. Therefore, the substrate RNA recognition mechanism of TrmFO is very simple. This simple structure is also observed in the anticodon-loop of tRNA<sup>Pro</sup> from T. thermophilus; however, A38 in the anticodon-loop prevents incorrect methylation by TrmFO (Yamagami et al., 2012). In some cases, therefore, there are negative determinants for tRNA modification enzymes.

#### s<sup>2</sup>U54 Formation by TtuA

In the case of TtuA (**Figure 4B**), only the modification patterns of tRNA<sup>Asp</sup> mutants expressed in T. thermophilus cells have been analyzed (Shigi et al., 2002); therefore, it is unknown whether TtuA can act on a micro-helix RNA. Nevertheless, it is clear that the positive determinants for the sulfur-transfer reaction of TtuA are also very simple. More recently, it was shown that the presence of m<sup>1</sup>A58 accelerates the rate of sulfur transfer (Shigi et al., 2006b). Thus, the sulfur-transfer reaction carried out by TtuA is regulated by another modification (m<sup>1</sup>A58).

#### m<sup>1</sup>A58 Formation by TrmI

The tRNA m<sup>1</sup>A58 methyltransferase TrmI can methylate a micro-helix RNA very slowly (Takuma et al., 2015; **Figure 4C**). The presence of an aminoacyl-stem or variable region accelerates the rate of methylation of truncated tRNA by TrmI. Furthermore, the presence of an m<sup>7</sup>G46 modification also accelerates the rate of methylation by TrmI (Tomikawa et al., 2010). Thus, the extent of m<sup>1</sup>A58 modification in tRNA is also controlled by the presence of another modification (m<sup>7</sup>G46).

Among T. thermophilus tRNAs, tRNAThr GGU exceptionally possesses C60 instead of U60, which is one of the positive determinants for TrmI. The proportion of m1A58 and m<sup>5</sup>s<sup>2</sup>U54 in tRNAThr GGU (Kazayama et al., 2015) is lower than that in tRNAPhe (Takuma et al., 2015), indicating that the extent of m1A58 modification has an effect on the extent of m<sup>5</sup>s<sup>2</sup>U54 modification, which is consistent with the observation of Shigi et al. (2006b). Thus, the degree of m1A58 modification in tRNAs differs according to the sequence of each tRNA. The sequence of tRNAThr GGU seems to be disadvantageous for survival of T. thermophilus at high temperatures; however, the physiological reason for the presence of C60 in tRNAThr GGU is unknown.

## ψ55 Formation by TruB

Escherichia coli TruB can modify a micro-helix RNA, which mimics the T-arm structure (Gu et al., 1998). U54, U55, pyrimidine56 and A58 in the T-loop are important for the full activity of E. coli TruB (**Figure 4D**; Gu et al., 1998). The crystal structure of a complex of TruB and micro-helix RNA showed that other nucleotides in the T-loop make contact with TruB (Hoang and Ferré-D'Amaré, 2001; Pan et al., 2003). The ribose-phosphate backbone in the T-arm structure, which is formed by the U54-A58 reverse Hoogsteen base pair, is important for the reaction by E. coli TruB. Although the substrate tRNA recognition mechanism of T. thermophilus TruB has not been confirmed experimentally, the high conservation of amino acid sequences between E. coli and T. thermophilus TruB proteins (Ishida et al., 2011) strongly suggests that these enzymes possess a common mechanism of tRNA recognition. Formation of ψ55 in T. thermophilus tRNAPhe transcript by TruB is very rapid at 55°C (Ishida et al., 2011), suggesting that T. thermophilus TruB does not require other modifications in tRNA in order to function.

## m7G46 Formation by TrmB

Aquifex aeolicus TrmB can methylate a truncated tRNA (**Figure 4E**): its methylation speed for truncated RNA is comparable to that for the full-length tRNA transcript (Okamoto et al., 2004). In the truncated tRNA, the T-arm-like structure and five nucleosides corresponding to the variable region are essential for methylation by TrmB. Thermophilic TrmB proteins (A. aeolicus and T. thermophilus TrmB) share considerable amino acid sequence homology and exceptionally possess a long C-terminal region (Okamoto et al., 2004), which is involved in binding to AdoMet (Tomikawa et al., 2018). Therefore, the substrate tRNA recognition mechanism of thermophilic TrmB may be different from that of mesophilic TrmB (De Bie et al., 2003; Purta et al., 2005; Zegers et al., 2006; Zhou et al., 2009; Tomikawa, 2018).

## Gm18 Formation by TrmH

**Figure 4F** shows the minimum substrate RNA of A. aeolicus TrmH (Hori et al., 2003). There are some differences in the substrate tRNA recognition mechanism between A. aeolicus TrmH and T. thermophilus TrmH. For example, A. aeolicus TrmH cannot methylate tRNAs with A17 (Hori et al., 2003), but T. thermophilus TrmH methylates tRNASer, which contains A17 (Hori et al., 1998). Furthermore, the size and sequence of the D-loop have effects on methylation by A. aeolicus TrmH (Hori et al., 2003); however, T. thermophilus TrmH methylates mutant tRNAs irrespective of these D-loop features (Ochi et al., 2010). In short, T. thermophilus TrmH has been found to methylate all tRNAs tested so far (Hori et al., 1998). T. thermophilus TrmH can methylate a 5'-half fragment of tRNA (Matsumoto et al., 1990) but not a micro-helix that mimics the D-arm (Hirao et al., 1989). Therefore, T. thermophilus TrmH may recognize the bulge structure like A. aeolicus TrmH. This idea is consistent with observations that the cross-linking or chemical modification of s<sup>4</sup>U8 in substrate tRNA causes a decrease in methylation speed by TrmH (Hori et al., 1989). Full activity of TrmH requires the L-shaped structure formed by conserved nucleosides in tRNA (Hori et al., 1998). Furthermore, the speed of methylation by T. thermophilus TrmH for yeast tRNAPhe transcript is slower than that for native yeast tRNAPhe at 65°C, showing that other modified nucleosides have a positive effect on the methylation by TrmH (Hori et al., 1998). Indeed, the presence of m7G46 in tRNAPhe transcript accelerates methylation speed by TrmH (Tomikawa et al., 2010).

## s<sup>4</sup>U8 Formation by ThiI

Escherichia coli and T. maritima ThiI (tRNA 4-thiouridine synthetase) can modify several truncated tRNAs that mimic the aminoacyl-stem and T-arm (**Figure 4G**; Lauhon et al., 2004; Neumann et al., 2014). Although the substrate tRNA recognition mechanism of T. thermophilus ThiI has not been investigated, it is likely that T. thermophilus ThiI also recognizes the local structure in tRNA. The THUMP domain in ThiI recognizes the CCA terminus of substrate RNA (Neumann et al., 2014). This feature has been identified in another tRNA modification enzyme, archaeal Trm11, which also possesses a THUMP domain (Hirata et al., 2016). Furthermore, given that TrmN and archaeal Trm14 have a THUMP domain (Menezes et al., 2011; Fislage et al., 2012; Roovers et al., 2012), these enzymes may recognize the CCA terminus in tRNA.

## D20 and D20a Formation by DusA

In T. thermophilus, a single Dus family protein, DusA, synthesizes all D modifications (D20 and D20a) in tRNA (Kusuba et al., 2015); in E. coli, by contrast, three Dus family proteins share the modification sites in tRNA (Bishop et al., 2002; Bou-Nader et al., 2018). Among the T. thermophilus tRNA modification enzymes that act on the three-dimensional core in tRNA, DusA exceptionally recognizes the interaction between the T-arm and D-arm (Yu et al., 2011). For the reaction of DusA at high temperatures, therefore, stabilization of the L-shaped tRNA structure by other modified nucleosides is essential (Kusuba et al., 2015). Thus, D20 and D20a seem to be relatively late modifications in T. thermophilus tRNA.

### INITIAL BINDING AND INDUCED-FIT STEPS IN COMPLEX FORMATION BETWEEN tRNA MODIFICATION ENZYMES AND SUBSTRATE tRNA

As shown in **Figure 4**, many tRNA modification enzymes recognize the local structure in tRNA, while the interaction between the T-arm and D-arm in tRNA is not required for their activity. Indeed, in the case of TrmI, the methylation speed is faster for a mutant tRNA transcript, in which the interaction between the T-arm and D-arm is disrupted, than for the wild-type tRNA transcript (Takuma et al., 2015).

In many cases, the target modification site is embedded in the L-shaped tRNA structure. For example, G18 and U55 form a tertiary base pair and the uracil base in U55 is not localized at the surface of tRNA. Therefore, disruption of the L-shaped tRNA structure is necessary for the reaction of TruB. Furthermore, to fit into the catalytic pocket of TruB, the uracil base must be flipped. These observations suggest that the reaction of a tRNA modification enzyme comprises at least two steps: initial binding to the L-shaped tRNA, followed by a structural change (induced-fit) process in which the L-shaped tRNA structure is disrupted.

These steps have been monitored for complex formation between T. thermophilus TrmH and tRNAPhe transcript by using a stopped-flow fluorescence measurement system (**Figure 5**; Ochi et al., 2010, 2013). T. thermophilus TrmH is a member of the SpoU-TrmD (so-called SPOUT) methyltransferase superfamily (Anantharaman et al., 2002; Nureki et al., 2004; Hori, 2017), and site-directed mutagenesis studies (Nureki et al., 2004; Watanabe et al., 2005) suggest that arginine at position 41 (Arg41) in TrmH acts as the catalytic center. As shown in **Figure 5A**, TrmH is a dimeric enzyme with three tryptophan residues (Trp73, Trp126, and Trp191) in each subunit. The fluorescence intensity at 320 nm derived from these tryptophan residues was measured during complex formation between TrmH and tRNA. **Figure 5B** shows one result obtained when 7.70 µM TrmH–AdoMet complex and 7.70 µM tRNAPhe transcript were mixed by the stopped-flow system at 25°C (Ochi et al., 2010). A very fast decrease in fluorescence was observed in the initial 10 ms, followed by a relatively slow increase in fluorescence from 10 to 50 ms. In general, a decrease in tryptophan fluorescence intensity suggests an increase in accessibility of the residue to solvent water. The obtained data could be fitted to an equation, which showed that the reaction was a bimolecular binding reaction. In the case of TrmH, therefore, the initial decrease of fluorescence suggests that a tryptophan residue(s) is moved to the solvent during the initial binding process. Subsequently, this residue was confirmed as Trp126 by measurements on TrmH mutant proteins (Ochi et al., 2013). The slow increase in fluorescence from 10 to 50 ms reflects the structural change (induced-fit) process; detailed measurements on several concentrations of TrmH and tRNA indicate that the slow increase in fluorescence can be fitted to a combination of unimolecular (first-order) reactions.
The methylation was not completed within 50 ms, showing that the slow increase in fluorescence was not caused by dissociation of tRNA from the enzyme after the reaction. In the structural change process, Trp126 is moved to a hydrophobic environment. During the induced-fit process, disruption of the L-shaped tRNA structure, recognition of the 6-oxygen in G18, and introduction of the ribose into the catalytic pocket occur (Ochi et al., 2010).
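The two-phase fluorescence trace described above can be illustrated with a minimal kinetic sketch in Python. It assumes a scheme E + T → ET → ET* with an irreversible second-order binding step and a single irreversible first-order induced-fit step, whereas the measurements indicate a combination of first-order reactions. The rate constants `k_on` and `k_fit` and the three fluorescence coefficients are hypothetical values chosen only to reproduce the qualitative shape (fast decrease, then slower increase); they are not taken from Ochi et al. (2010, 2013). Only the 7.70 µM concentrations and the 50 ms window come from the text.

```python
def simulate_two_step(e0_uM, t0_uM, k_on, k_fit, t_end_s, dt_s=1e-5):
    """Euler integration of E + T -> ET -> ET*.

    k_on  : second-order binding constant, 1/(uM*s)  (hypothetical value)
    k_fit : first-order induced-fit constant, 1/s    (hypothetical value)
    Returns a list of (time_s, E, T, ET, ET_star) tuples, concentrations in uM.
    """
    e, t, et, et_star = e0_uM, t0_uM, 0.0, 0.0
    rows = [(0.0, e, t, et, et_star)]
    n_steps = int(round(t_end_s / dt_s))
    for i in range(1, n_steps + 1):
        bound = k_on * e * t * dt_s   # bimolecular initial binding
        fitted = k_fit * et * dt_s    # unimolecular structural change
        e -= bound
        t -= bound
        et += bound - fitted
        et_star += fitted
        rows.append((i * dt_s, e, t, et, et_star))
    return rows


def fluorescence(row, f_free=1.0, f_bound=0.8, f_fit=0.9):
    """Relative Trp fluorescence; the three coefficients are made up."""
    _, e, _, et, et_star = row
    return f_free * e + f_bound * et + f_fit * et_star


# 7.70 uM enzyme and tRNA as in the text; rate constants are assumptions.
rows = simulate_two_step(7.70, 7.70, k_on=30.0, k_fit=50.0, t_end_s=0.05)
signal = [fluorescence(r) for r in rows]
print(f"F(0) = {signal[0]:.2f}, min F = {min(signal):.2f}, "
      f"F(50 ms) = {signal[-1]:.2f}")
```

In a real analysis the rate constants would be obtained by fitting traces recorded at several enzyme and tRNA concentrations, which is how the bimolecular and unimolecular phases are distinguished.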

The folding of tRNA and rigidity (flexibility) of the local structure in tRNA affect the speed of the initial binding and induced-fit processes. Thus, other modifications in tRNA, temperature, and RNA stabilization factors such as Mg<sup>2+</sup> ions and polyamines are all likely to influence the initial binding and induced-fit processes.

### A NETWORK BETWEEN MODIFIED NUCLEOSIDES IN tRNA AND tRNA MODIFICATION ENZYMES CONTROLS THE FLEXIBILITY (RIGIDITY) OF tRNA IN T. thermophilus AT A WIDE RANGE OF TEMPERATURES

A method for preparing gene-disruptant strains of T. thermophilus was developed at the beginning of this century (Hoseki et al., 1999; Hashimoto et al., 2001). This gene-disruption system, coupled with biochemical studies, has been used to elucidate the regulatory network between modified nucleosides in tRNA and tRNA modification enzymes in T. thermophilus cells. **Figure 6** summarizes the network between modified nucleosides in tRNA and tRNA modification enzymes. Although each tRNA modification enzyme can act on an unmodified tRNA transcript (or a truncated tRNA transcript as shown in **Figure 4**), the presence of modified nucleosides often accelerates (or slows down) the speed of modification by other tRNA modification enzymes depending on the environmental temperature.

T. thermophilus lives in hot springs. The temperature of hot spring water can change for several reasons, including an influx of river water, snowfall, and eruption of hot water. Therefore, the ability of protein synthesis to adapt to temperature changes via the flexibility (rigidity) of tRNA is very important for survival of T. thermophilus. One of the advantages of the network is that it does not require protein synthesis. As a result, it can respond rapidly. Furthermore, the network may be a survival strategy of eubacteria, which have a limited genome size.

## Network at High Temperatures (>75°C)

At high temperatures (>75°C), m7G46 modification by TrmB is one of the key modifications in the network (**Figure 6A**). It has been shown that the trmB gene deletion strain does not grow at 80°C and has several hypo-modifications in tRNA (Tomikawa et al., 2010). When the culture temperature is shifted from 70 to 80°C, tRNAPhe and tRNALys are degraded and protein synthesis is impaired in this strain. In particular, heat shock proteins are not synthesized efficiently in the trmB gene deletion strain. Thus, the m7G46 modification is essential for survival of T. thermophilus at high temperatures.

The positive effects of m7G46 on TrmH, TrmD, and TrmI activity have been confirmed by in vitro experiments. Because the m1G37 modification conferred by TrmD is not present in T. thermophilus tRNAPhe, yeast tRNAPhe transcript was used in these experiments (Tomikawa et al., 2010). TrmH can methylate a 5'-half fragment of tRNA (Matsumoto et al., 1990), but its full activity requires the three-dimensional core structure of tRNA (Hori et al., 1998). m7G46 forms a tertiary base pair with the C13-G22 base pair in the D-arm; thus, this tertiary base pair seems to have a positive effect on TrmH activity. TrmD can act on a truncated tRNA (Redlak et al., 1997) and micro-helix RNA (Takeda et al., 2006), but foot-printing analyses have shown that the D-arm and variable region are protected in addition to the anticodon-arm (Gabryszuk and Holmes, 1997). Furthermore, the crystal structure of the TrmD–tRNA complex revealed that the C-terminal domain of TrmD makes contacts with the D-arm in tRNA (Ito et al., 2015). Therefore, the positive effect of m7G46 on TrmD activity can be explained by stabilization of the D-arm via the formation of an m7G46-C13-G22 tertiary base pair.

In the case of TrmI, the presence of the aminoacyl-stem or variable region increases the methyl-group acceptance activity of a truncated tRNA transcript (Takuma et al., 2015). Although there is no docking model of TrmI with tRNA, positively charged grooves, which are present on the surface of TrmI (Barraud et al., 2008), may capture the aminoacyl-stem and variable region in tRNA.

The positive effect of m5U54 on TrmI activity was also confirmed in vitro (Yamagami et al., 2012). However, the tRNA fraction from a trmFO gene deletion strain cultured at 70°C contained the same amount of m1A nucleoside as that from the wild-type strain (Yamagami et al., 2016). Therefore, there seems to be a sufficient amount of TrmI to maintain the extent of m1A58 in tRNA in the T. thermophilus trmFO gene deletion strain. The positive effect of m5U54 on the TrmI activity can be explained by the stabilizing effect of the m5U54-A58 reverse Hoogsteen base pair.

**Figure 6.** (A) Network at high temperatures (>75°C). The m7G46 modification (highlighted in red) is a key factor in this network. Its presence accelerates the speed of other tRNA modification enzymes such as TrmH, TrmD, and TrmI. In addition, the presence of m5U54 also increases the methylation speed of TrmI. The increase in m1A58 due to accelerated TrmI activity further increases the speed of sulfur-transfer by TtuA and that of related proteins, and results in an increased percentage of m<sup>5</sup>s<sup>2</sup>U54. The introduced modifications coordinately stabilize the L-shaped tRNA structure. (B) Network at low temperatures (<55°C). In this network, the ψ55 modification stabilizes the local structure in tRNA and slows down the speed of tRNA modification enzymes. The m5U54 modification plays a role in maintaining the balance of modifications at the elbow region in tRNA. This figure is prepared from Figure 4 in a chapter "Regulation of Protein Synthesis via the Network Between Modified Nucleotides in tRNA and tRNA Modification Enzymes in T. thermophilus, a Thermophilic Eubacterium" of a book "Modified Nucleic Acids in Biology and Medicine" Springer Nature 2016 with permission (4517441319562) from the publisher.

As described in the Section "tRNA Modification Enzymes Recognizes the Local Structure(s) in tRNA", the m1A58 modification conferred by TrmI is a positive determinant for the sulfur-transfer system (TtuA and related proteins) (Shigi et al., 2006b). Therefore, TrmI is essential for survival of T. thermophilus at high temperatures (Droogmans et al., 2003). Furthermore, the m<sup>5</sup>s<sup>2</sup>U54 modification is essential for stabilizing the tRNA structure, as described in the Section "The m<sup>5</sup>s<sup>2</sup>U54 Modification in T. thermophilus tRNA is Essential for Protein Synthesis at High Temperatures." As a result, the sulfur-transfer system for s<sup>2</sup>U54 formation is also essential for the survival of T. thermophilus at high temperatures (Shigi et al., 2006a). In contrast, TrmFO is not essential and the trmFO gene deletion strain can grow at 80°C (Yamagami et al., 2016). Collectively, these modified nucleosides in tRNA coordinately stabilize the tRNA structure at high temperatures.

Given that the formation of D20 by DusA requires the interaction between the D-arm and T-arm (Yu et al., 2011), stabilization of the L-shaped tRNA structure is essential for D20 formation at high temperatures (Kusuba et al., 2015). Therefore, several modifications, including m<sup>5</sup>s<sup>2</sup>U54, m1A58, Gm18 and ψ55, seem to be required for sufficient activity of DusA at high temperatures.

Among the modified nucleosides in T. thermophilus tRNAPhe, m2G6 by TrmN, s4U8 by ThiI, i6A37 and ms<sup>2</sup>i<sup>6</sup>A37 by MiaA and MiaB, and ψ39 by TruA have not been investigated as yet. However, it is possible that some of them may affect the stability of tRNA in T. thermophilus at high temperatures. For example, it has been recently reported that the melting temperature of tRNA from an E. coli thiI gene disruptant strain is lower than that from the wild-type strain (Nomura et al., 2016). Therefore, s4U8 may contribute to stabilization of the structure of tRNA. In the case of anticodon-loop modifications (ψ38, i6A37, and ms<sup>2</sup>i<sup>6</sup>A37), their deletion may severely impair protein synthesis because they are important for the structure of the anticodon-loop and function directly in protein synthesis (Spenkuch et al., 2014; Grosjean and Westhof, 2016; Schweizer et al., 2017). Furthermore, i6A37 is required for the 2'-O-methylation conferred by TrmL at position 34 in E. coli tRNALeu (Benítez-Páez et al., 2010; Zhou et al., 2015). In the case of T. thermophilus tRNALeu, therefore, TrmL is probably included in the network.

## Network at Low Temperatures (<55°C)

At low temperatures (<55°C), the ψ55 modification conferred by TruB works as a key factor in the network (**Figure 6B**). In the truB gene deletion strain, excess amounts of Gm18, m1A58 and m<sup>5</sup>s<sup>2</sup>U54 are introduced into tRNAs at low temperatures and the melting temperature of the tRNA mixture is increased by more than 8°C (Ishida et al., 2011). This excess rigidity of tRNA results in a disorder of protein synthesis, and cold shock proteins are not synthesized efficiently in the truB gene deletion strain. Therefore, ψ55 in tRNA is required for survival of T. thermophilus at low temperatures. The m5U54 modification aids ψ55 in maintaining the balance of other modifications in tRNA (Yamagami et al., 2016). The ψ55 modification stabilizes the structure of the elbow region in tRNA and slows down the formation speed of other modifications around ψ55 (Gm18, m1A58 and m<sup>5</sup>s<sup>2</sup>U54) (Ishida et al., 2011). The positive effect of m5U54 on m1A58 modification was confirmed both by the in vivo methylation profile and by in vitro experiments (Yamagami et al., 2016) and is probably due to the stabilization of the m5U54-A58 reverse Hoogsteen base pair.
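The two networks described in this section can be written down as small directed graphs. The Python sketch below is one possible encoding, not a published data model: the tuple layout, the ASCII names (for example `psi55` for ψ55) and the helper function are assumptions made for illustration. Only the effects stated explicitly in the text are encoded; the requirement of DusA for several modifications is left out because it is described only as likely.

```python
from collections import deque

# Each edge: (existing modification, enzyme affected, product, effect),
# where "+" means the enzyme is accelerated and "-" means it is slowed.
HIGH_TEMP_EDGES = [
    ("m7G46", "TrmH", "Gm18", "+"),
    ("m7G46", "TrmD", "m1G37", "+"),
    ("m7G46", "TrmI", "m1A58", "+"),
    ("m5U54", "TrmI", "m1A58", "+"),
    ("m1A58", "TtuA", "m5s2U54", "+"),
]
LOW_TEMP_EDGES = [
    ("psi55", "TrmH", "Gm18", "-"),
    ("psi55", "TrmI", "m1A58", "-"),
    ("psi55", "TtuA", "m5s2U54", "-"),
    ("m5U54", "TrmI", "m1A58", "+"),
]


def downstream_modifications(start, edges):
    """Return every modification reachable from `start` (breadth-first)."""
    reached = set()
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for source, _enzyme, product, _effect in edges:
            if source == current and product not in reached:
                reached.add(product)
                queue.append(product)
    return reached


print(sorted(downstream_modifications("m7G46", HIGH_TEMP_EDGES)))
print(sorted(downstream_modifications("psi55", LOW_TEMP_EDGES)))
```

Walking the high-temperature graph from m7G46 reaches m<sup>5</sup>s<sup>2</sup>U54 through m1A58, which mirrors the chain of effects described for the trmB deletion strain.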

### WHAT STABILIZES THE STRUCTURE OF UNMODIFIED PRECURSOR tRNA IN T. thermophilus AT 80°C?

Unmodified tRNA transcript cannot maintain its L-shaped tRNA structure at high temperatures. The primary transcript (precursor tRNA), which is synthesized by RNA polymerase, is unmodified. Therefore, even though several tRNA modification enzymes from T. thermophilus can act on unmodified tRNA transcript, their activities cannot be measured at 80°C due to the disrupted structure of the substrate RNA. For example, T. thermophilus TrmH methylates tRNA effectively only at temperatures below the melting temperature (Matsumoto et al., 1987). These observations raise an important question: what stabilizes the structure of unmodified precursor tRNA in T. thermophilus at 80°C? If there were no stabilization factors in living cells, tRNA modification enzymes from T. thermophilus would not be able to act on precursor tRNA at 80°C.

### Unique Polyamines in T. thermophilus and Their Interaction With tRNA

In general, living organisms produce three standard polyamines (putrescine, spermidine and spermine) (Bae et al., 2018; Igarashi and Kashiwagi, 2018). In addition to these standard polyamines, T. thermophilus produces at least 16 polyamine species, including long and branched polyamines (**Figure 7A**; Hamana et al., 1991; Oshima, 2007; Oshima et al., 2011).

Because polyamines have positive charges and hydrophilic regions, they have the potential to interact with nucleic acids and phospholipids. Indeed, there have been several studies on the interaction between polyamines and tRNA. For example, addition of polyamines shifted the melting temperature of native tRNAPhe from Saccharomyces cerevisiae to higher temperatures in accordance with polyamine length (Terui et al., 2005). The crystal structure of the complex of yeast tRNAPhe and spermine revealed that two spermine molecules bind to two sites in one tRNAPhe molecule (Quigley et al., 1978). Furthermore, FT-IR analysis of tRNA in the presence of putrescine, spermidine, and spermine showed that, similar to spermine, putrescine and spermidine bind to the connection region between the D-arm and anticodon-stem (Ouameur et al., 2010). Moreover, a <sup>13</sup>C-NMR study reported that 14 spermidine-binding sites were present in one tRNA molecule and that three spermidine molecules stably bound to tRNA in the presence of Mg<sup>2+</sup> (Frydman et al., 1990). It has also been reported that a branched polyamine (**Figure 7A**), tetrakis(3-aminopropyl)ammonium (Taa), slightly stimulates the activity of archaeal Trm1 and TrmI (Hayrapetyan et al., 2009), which are archaeal tRNA methyltransferases for the formation of m<sup>2</sup><sub>2</sub>G26 (or m2G26) (Constantinesco et al., 1999) and m1A57 and m1A58 (Roovers et al., 2004), respectively.

### TrmH Methylates Unmodified tRNA Transcript at 80°C Only in the Presence of Long or Branched Polyamine

The effect of polyamine on the methyl-transfer speed of TrmH for yeast tRNAPhe transcript has been measured at various temperatures (**Figure 7B**; Hori et al., 2016). All polyamines were found to increase the speed of methylation by TrmH at appropriate temperatures, although the positive effect of putrescine was relatively weak and observed only at low temperatures. As the length of the polyamine increased, the optimum temperature for methyl-transfer shifted to higher temperatures; however, standard polyamines did not work at 80°C. By contrast, very weak but clear methyl-transfer activity of TrmH for unmodified tRNAPhe was observed at 80°C in the presence of the optimum concentration (1.5 mM) of caldohexamine, a long polyamine (red arrow in **Figure 7B**). Addition of the branched polyamine Taa had a stronger positive effect on TrmH activity at 80°C (red arrow in **Figure 7B**). Thus, long and branched polyamines can support the methylation by TrmH at 80°C.

If initial modifications are introduced into unmodified precursor tRNA, they stabilize the local structure in tRNA enabling the tRNA modification enzymes to function on the precursor tRNA. Introduction of initial modifications into tRNA probably occurs in the presence of polyamines. Among polyamines, long and branched polyamines are effective at very high temperatures.

Because TrmB, which confers the m7G46 modification, is a key enzyme in the network at high temperatures, the effects of polyamines on TrmB activity should be clarified. At present, however, this is not possible due to a technical problem: T. thermophilus TrmB, which is expressed in E. coli cells, is partially degraded in these cells and purification of intact TrmB is difficult.

### Long and Branched Polyamines Are Required for Maintenance of 70S Ribosome and Several tRNAs

The biosynthetic pathway from arginine to spermidine in T. thermophilus is different from that in eukaryotes, archaea, and other bacteria because the intermediate in T. thermophilus is N<sup>1</sup>-aminopropylagmatine (Ohnuma et al., 2005, 2011). S-Adenosyl-L-methionine decarboxylase-like protein 1 (SpeD1) is required for the biosynthesis of N<sup>1</sup>-aminopropylagmatine from arginine, and aminopropylagmatine ureohydrolase (SpeB) catalyzes the conversion of N<sup>1</sup>-aminopropylagmatine to spermidine. Because long and branched polyamines are synthesized from spermidine, the T. thermophilus speD1 or speB gene deletion strain cannot produce long and branched polyamines (Nakashima et al., 2017) and cannot grow at high temperatures (>75°C) unless polyamines are added to the medium. When the speD1 and speB deletion strains were cultured at 70°C in minimal medium until mid-log phase and then the culture temperature was shifted to 80°C, they could survive for 10 h. Although abnormal modifications in tRNA were expected, at least the m5U54, m7G46, Gm18 and m1A58 modifications were present as normal in the tRNA mixture from these strains. Given that the transcription of tRNA and the introduction of major modifications into tRNA are expected to occur mainly before the mid-log phase, it seems that these modifications can be introduced into tRNA at 70°C without the presence of long and branched polyamines (the expression patterns of mRNAs in the wild-type strain can be obtained from the NCBI/GEO database<sup>2</sup>; Shinkai et al., 2007). After the temperature shift, tRNAHis, tRNATyr and 70S ribosome were gradually degraded in the speD1 and speB deletion strains and protein synthesis was severely impaired (Nakashima et al., 2017). Thus, long and branched polyamines are required to maintain several tRNAs at high temperatures in T. thermophilus in addition to regulating the extent of modified nucleosides in tRNA.

## Other Regulatory Factors for tRNA Stability

RNA-binding proteins, Mg<sup>2+</sup> ions and K<sup>+</sup> ions can stabilize tRNA structure.

In A. aeolicus, a hyper-thermophilic bacterium, tRNA-binding protein 111 (Trbp111) stabilizes the three-dimensional core of tRNA (Morales et al., 1999; Swairjo et al., 2000). However, Trbp111 is specific to A. aeolicus and not found in T. thermophilus. Archease is also an RNA-binding protein that changes the specificity of archaeal Trm4 (Auxilien et al., 2007) and is required for tRNA splicing (Desai et al., 2014; Popow et al., 2014). Although neither trm4 nor a tRNA gene with an intron-coding region is encoded in the T. thermophilus genome, an archease-like protein gene (TTHA1745) exists. Therefore, it is possible that an archease-like protein stabilizes tRNA structure at high temperatures in T. thermophilus cells.

Mg<sup>2+</sup> ions are required for folding of tRNA (Lorenz et al., 2017) and are important in considering the structural effects of several modifications in tRNA (Yue et al., 1994; Agris, 1996; Nobles et al., 2002). However, the precise concentrations of Mg<sup>2+</sup> ions in T. thermophilus cells are unknown. K<sup>+</sup> ions are also important for correct folding of tRNA. The concentrations of K<sup>+</sup> ions in the cells of several thermophilic archaea are reported to be extremely high (>700 mM) (Hensel and Konig, 1988). However, the intracellular concentrations of K<sup>+</sup> ions of T. thermophilus have not been reported. The concentrations of Mg<sup>2+</sup> and K<sup>+</sup> ions may change depending on the growth environment.

### PERSPECTIVE

2018 marked the 50th anniversary of the first isolation of T. thermophilus. In these past 50 years, T. thermophilus has been studied as a model organism that can adapt to extremely high temperatures. In this regard, the modifications in tRNA have been studied mainly from the viewpoint of the stabilization of tRNA structure at high temperatures. In particular, the role of m<sup>5</sup>s<sup>2</sup>U54 in tRNA has been clarified, the genes responsible for almost all tRNA modifications in T. thermophilus have been annotated and the regulatory network between modified nucleosides and tRNA modification enzymes has been identified. Furthermore, numerous tRNA modification enzymes from T. thermophilus have been used in structural studies (reviewed in Hori et al., 2018).

Nevertheless, several enigmas remain, even today. For example, the functions of tRNA modifications in the anticodon-loop at high temperatures have not been studied in detail. Studies on anticodon-loop modifications are difficult because in vitro protein synthesis of T. thermophilus does not work effectively at high temperatures (>75°C). Although there have been many attempts to synthesize proteins at high temperatures using a T. thermophilus cell-free translation system (Ohno-Iwashita et al., 1975; Uzawa et al., 1993a,b; Zhou et al., 2012), it remains difficult to monitor protein synthesis at 80°C. As a result, our knowledge about the effects of tRNA modifications on codon-anticodon interactions, on maintenance of the reading frame and on the dynamics of tRNA on the ribosome at high temperatures is limited. Furthermore, recent studies on mesophiles reported that tRNA modifications occur in response to environmental stresses or function as stress resistance factors (Nawrot et al., 2011; Preston et al., 2013; Endres et al., 2015; Jaroensuk et al., 2016; Campos Guillen et al., 2017). As yet, however, there are no studies from this viewpoint for T. thermophilus. To understand the roles of tRNA modifications in totality, further studies will be required.

<sup>2</sup>https://www.ncbi.nlm.nih.gov/gds/

### AUTHOR CONTRIBUTIONS

HH determined the concept of this review and prepared the manuscript.

### FUNDING

This work was supported by a Grant-in-Aid for Scientific Research (16H04763 to HH) from the Japan Society for the Promotion of Science (JSPS).




**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Hori. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Multi-Substrate Specificity and the Evolutionary Basis for Interdependence in tRNA Editing and Methylation Enzymes

Sameer Dixit†, Jeremy C. Henderson† and Juan D. Alfonzo\*

Department of Microbiology, The Ohio State Biochemistry Program, The Center for RNA Biology, The Ohio State University, Columbus, OH, United States

Among tRNA modification enzymes there is a correlation between specificity for multiple tRNA substrates and heteromultimerization. In general, enzymes that modify a conserved residue in different tRNA sequences adopt a heterodimeric structure. Presumably, such changes in the oligomeric state of enzymes, to gain multi-substrate recognition, are driven by the need to accommodate and catalyze a particular reaction in different substrates while maintaining high specificity. This review focuses on two classes of enzymes where the case for multimerization as a way to diversify molecular recognition can be made. We will highlight several new themes with tRNA methyltransferases and will also discuss recent findings with tRNA editing deaminases. These topics will be discussed in the context of several mechanisms by which heterodimerization may have been achieved during evolution and how these mechanisms might impact modifications in different systems.

#### Edited by:

Tohru Yoshihisa, University of Hyogo, Japan

#### Reviewed by:

Hiroyuki Hori, Ehime University, Japan Yohei Kirino, Thomas Jefferson University, United States

#### \*Correspondence:

Juan D. Alfonzo alfonzo.1@osu.edu

†These authors have contributed equally to this work

#### Specialty section:

This article was submitted to RNA, a section of the journal Frontiers in Genetics

Received: 30 November 2018 Accepted: 30 January 2019 Published: 14 February 2019

#### Citation:

Dixit S, Henderson JC and Alfonzo JD (2019) Multi-Substrate Specificity and the Evolutionary Basis for Interdependence in tRNA Editing and Methylation Enzymes. Front. Genet. 10:104. doi: 10.3389/fgene.2019.00104

Keywords: translation, tRNA modification, mitochondria, inosine, deaminase, methylation

## INTRODUCTION

Critical to substrate binding specificity is the fact that enzymes need to achieve "high" affinity for their targets while ignoring non-targets. What makes this an especially difficult problem in cells is that substrates and non-substrates often look very similar, which tests the limits of enzyme-substrate recognition. This tenet is especially true of RNA binding proteins where high-affinity binding usually involves indirect readout of the phosphate backbone of RNA that must be combined with base- and shape-specific contacts to enable substrate discrimination. Many enzymes follow these rules to achieve effective target specificity, yet some face an additional obstacle, in that they must also maintain the ability to turn over during a catalytic cycle in order to yield a productive reaction. This issue is exacerbated when a single enzyme must target different substrates within a pool of nearly identical ones, as is the case faced by most tRNA modification enzymes.

A growing trend in the modification field is that many of the enzymes which recognize multiple substrates are heteromultimeric (Guy and Phizicky, 2014), where partnering may contribute to higher binding affinity and enable discrimination between nearly identical substrates. In a previous model based on observations with tRNA deaminases, it was suggested that homodimerization was necessary for the bacterial adenosine to inosine (A-to-I) deaminase to recognize a single tRNAArg substrate (Ragone et al., 2011; Spears et al., 2011). By contrast, to accommodate multiple tRNA substrates of different sequences, key recognition motifs in the eukaryotic tRNA deaminases are positioned further from the active site (Ragone et al., 2011). Critically, such evolutionary adaptations would not have been possible without a move toward heterodimerization. In general, one could imagine that many heterodimeric (heteromultimeric) enzymes arose by gene duplication, which then allowed the duplicated genes to accumulate mutations, aiding in the process of neofunctionalization. Such is the case of the eukaryotic tRNA deaminase ADAT2/ADAT3, for which the two subunits are very similar but not identical, strongly arguing for a gene duplication event that led to functional differentiation of each subunit. A similar explanation may be true of other heteromultimeric modification enzymes, especially of methyltransferases. In the following pages, we will discuss in greater detail both the nature and evolution of heteromultimeric enzymes, with a focus on methyltransferases and deaminases. A previous review touched on the topic of neofunctionalization in the context of pseudouridine synthases (Fitzek et al., 2018). Here neofunctionalization will be discussed only in passing. We will highlight current themes and concepts that have arisen from recently published structures of a number of methyltransferases and also introduce the new concept of enzyme co-activation, whereby seemingly unrelated enzymes inactive on a specific substrate become active after association, a new twist on the idea of neofunctionalization.

### INTERDEPENDENT METHYLTRANSFERASES

### Methyltransferases That Require Heterodimerization

Certain eukaryotic tRNA methyltransferases strictly function as two-subunit enzymes, where association between a structural and a catalytic subunit is required for tRNA methylation (**Figure 1**). While this review will detail these associations, many recent reviews more broadly cover the enzymology and biological role of tRNA methyltransferase enzymes (Guy and Phizicky, 2014; Swinehart and Jackman, 2015; McKenney and Alfonzo, 2016; Goto-Ito et al., 2017; Hori, 2017; Traube and Carell, 2017). Beyond improvement to overall enzyme efficiency, heteromultimeric methyltransferases show altered activity toward particular tRNA species, notably in the choice of nucleotides and/or sites that are modified in a specific tRNA. In general, and where they have been identified, the equivalent bacterial and archaeal enzymes for a particular methylation do not require a heterologous partner for activity. Many have speculated that as the non-coding RNA pool expanded, pairing a non-catalytic with a catalytic subunit evolved to increase tRNA substrate specificity and prevent rampant nonspecific methylation. Heteromultimerization may also integrate environmental and metabolic cues needed for optimal translation profiles that promote homeostasis; these possibilities will be explored further in the following sections.

### Trm61/Trm6

The nuclear Trm61/Trm6 complex responsible for N-1 methyl adenosine at tRNA position 58 (**Figure 1**; m1A58) was first isolated from S. cerevisiae (Anderson et al., 1998, 2000). Trm61/Trm6 homologs exist in eukaryotic genomes of other yeast, protist, plant, and animal species (Bujnicki, 2001). The catalytic subunit Trm61 binds to co-substrate methyl donor S-adenosyl-L-methionine (SAM) through extensive hydrophobic and hydrogen bonding interactions (Wang M. et al., 2016). Trm61 association with Trm6 ensures high-affinity binding of Trm61/Trm6 to target tRNA substrates (Anderson et al., 2000; Ozanick et al., 2005, 2007; Finer-Moore et al., 2015; Wang M. et al., 2016). Disruption of Trm61 SAM binding eliminates m1A<sup>58</sup> activity in vitro and in vivo, consistent with its assignment as the catalytic subunit (Anderson et al., 2000). Trm6 must associate with Trm61 for methyltransferase activity in vitro (Anderson et al., 2000), and in vivo S. cerevisiae trm6 or trm61 mutants lack m1A58 modified tRNA (Anderson et al., 1998). Human Trm61/Trm6 homologs can rescue the function of S. cerevisiae trm61 or trm6 mutants but only when expressed together, and maintain activity in vitro only when purified as a complex (Ozanick et al., 2005).

High-affinity binding of Trm61/Trm6 to tRNA substrates, such as tRNA<sub>i</sub><sup>Met</sup> or tRNA<sup>Lys,3</sup><sub>UUU</sub>, was originally thought to depend primarily on Trm6 as the RNA recognition subunit (Anderson et al., 2000; Ozanick et al., 2005). However, detailed mechanistic enzymology and crystal structures have refined this view; it is now clear that the Trm61/Trm6 holo-enzyme makes specific contacts with tRNA substrates (Ozanick et al., 2007; Finer-Moore et al., 2015; Wang M. et al., 2016). Together Trm61 and Trm6 create an L-shaped pocket to accommodate the tRNA substrate (Finer-Moore et al., 2015; Wang M. et al., 2016). The methylation site, nucleobase A<sup>58</sup>, is buried deep within the conventional L-shaped tRNA structure between D- and TΨC- stem-loop elements (Robertus et al., 1974). Crystallographic structure determination of the human Trm61/Trm6 complex bound to tRNA<sup>Lys,3</sup><sub>UUU</sub> revealed numerous protein–RNA interactions favoring separation of D- and TΨC- stem-loop elements, effectively allowing the enzyme to access the otherwise inaccessible N-1 of A<sup>58</sup> (Finer-Moore et al., 2015).

Trm61/Trm6 has been suggested to arise through duplication and divergence from an ancestral TrmI-family enzyme, a family that catalyzes m1A<sup>58</sup> formation in bacteria, and/or m1A<sup>57</sup> in archaea (Grosjean et al., 1995; Bujnicki, 2001; Roovers et al., 2004; Ozanick et al., 2005, 2007; Guy and Phizicky, 2014; Finer-Moore et al., 2015; Wang M. et al., 2016). Purification of native TrmI from Mycobacterium tuberculosis and Thermus thermophilus yields stable TrmI homo-tetramers (Gupta et al., 2001; Droogmans et al., 2003; Barraud et al., 2008), a four subunit stoichiometry conserved in Trm61/Trm6, which forms a dimer of heterodimers (Ozanick et al., 2007; Finer-Moore et al., 2015). Many of the contact residues between TrmI subunits are maintained between Trm61/Trm6 subunits (Ozanick et al., 2007; Finer-Moore et al., 2015; Wang M. et al., 2016). The catalytic subunit Trm61 has high similarity to TrmI-family proteins, whereas Trm6 shows no apparent similarity to sequences deposited in current archaeal or bacterial databases. It is possible that Trm6 diverged from its ancestor, losing its ability to bind SAM, and maintaining only its RNA binding character. Interestingly, the tomato homolog of Trm6 (alias Gcd10) interacts with the dual methyltransferase/guanylyl transferase of the tobacco mosaic virus replicase complex (Osman and Buck, 1997; Taylor and Carr, 2000). This could indicate that other biologically relevant associations of Trm6 remain to be discovered. A recent report shows evidence that Trm61/Trm6 complexes catalyze low abundance m1A formation in regions of mRNA that loosely mimic tRNA TΨC-loops (Safra et al., 2017). A more detailed investigation of Trm61/Trm6 substrate specificity toward these mRNA structures is warranted.

### Trm8/Trm82

The nuclear complex of Trm8/Trm82, responsible for N-7 methyl guanosine at position 46 (m7G46) in S. cerevisiae, shares many themes introduced in discussion of Trm61/Trm6 (**Figure 1**). Trm8 is a SAM-dependent methyltransferase that requires Trm82, a tryptophan-aspartic acid repeat (WD-repeat) homolog, for efficient activity (Alexandrov et al., 2002, 2005; Matsumoto et al., 2007; Muneyoshi et al., 2007; Leulliot et al., 2008). Deletion of either gene in yeast results in loss of m7G<sup>46</sup> in tRNAs. Trm8/Trm82 can be purified as a stoichiometric complex, where co-expression is required for activity in cell-free wheat germ extracts (Matsumoto et al., 2008). Much like Trm61/Trm6, co-expression of the human homologs METTL1 (Trm8) and WDR4 (Trm82) restores m7G<sup>46</sup> formation in Δtrm8 or Δtrm82 yeast strains; however, individual expression of either human gene alone fails to complement these yeast mutants (Alexandrov et al., 2002). Disease mutants of either METTL1 or WDR4 that contribute to a rare form of primordial dwarfism, a prenatal growth deficiency that persists after birth, also result in human tRNAs that are deficient in m7G modification (Shaheen et al., 2015).

Trm8/Trm82 homologs are readily identifiable in yeast, protist, plant, and animal species. No apparent Trm8 or Trm82 homologs are currently found in archaeal genomes, which generally lack m7G46. Bacterial genomes do not contain an obvious Trm82 homolog; however, Trm8 shares sequence similarity with the bacterial TrmB-family (De Bie et al., 2003; Okamoto et al., 2004). Expression of TrmB homologs from Aquifex aeolicus or Escherichia coli in Δtrm82 or Δtrm8Δtrm82 yeast restores tRNA m7G<sup>46</sup> formation without the need of an obvious cognate Trm82 (Alexandrov et al., 2005). This argues that Trm8 and TrmB may derive from a common ancestral protein, thematically consistent with the case of Trm61 and TrmI. However, unlike Trm61/Trm6, which may have arisen through duplication and drift, the partner subunit, Trm82, is a WD protein unrelated to any characterized methyltransferase.

Comparison of apo-Trm8/Trm82 to its tRNA bound form revealed that, unlike Trm61/Trm6, Trm82 makes no apparent RNA contacts in the context of a Trm8/Trm82 complex (Leulliot et al., 2008). This was corroborated by chemical crosslinking and small angle X-ray scattering experiments (Alexandrov et al., 2005; Leulliot et al., 2008). Although not formally tested, it is possible that Trm82 stabilizes conformations of Trm8 that promote substrate binding and/or productive methyl transfer. In turn, in vivo Trm8 levels are greatly reduced in Δtrm82 yeast (Alexandrov et al., 2005), suggesting that Trm82 stabilizes cellular pools of Trm8 by a hitherto unknown mechanism.

### Trm9/Trm112 and Trm11/Trm112

Trm112 can partner with either Trm9 or Trm11 to form separate SAM-dependent tRNA methyltransferase complexes, which act on separate pools of cytoplasmic tRNA substrates, at different nucleotide sites, and produce distinct chemical products (**Figure 1**). In S. cerevisiae Trm11/Trm112 forms N-2 methylguanosine at position 10 (m2G10) in a broad pool of tRNA species (Purushothaman et al., 2005; Okada et al., 2009). Trm9/Trm112 catalyze a more specific terminal methylation in the multi-step biosynthesis of 5-methoxycarbonylmethyl-2-thiouridine (mcm<sup>5</sup>s<sup>2</sup>U34) at the wobble position (position 34) in tRNA<sup>Arg</sup><sub>UCU</sub> and tRNA<sup>Glu</sup><sub>UUC</sub> in yeast (Kalhor and Clarke, 2003; Jablonowski et al., 2006; Begley et al., 2007), with additional tRNA isoacceptors in higher eukaryotes (Songe-Moller et al., 2010), including human tRNA<sup>Lys</sup><sub>UUU</sub> and tRNA<sup>Sec</sup><sub>UGA</sub>. Trm9/Trm112 can also methylate non-thiouridine substrates to form mcm5U. The biological significance of yeast mcm5U wobble site modification was assayed in a Δtrm9 strain through comparative analyses of transcriptome, ribosomal footprinting and proteome data sets (Deng et al., 2015). Consistent with loss of their decoding function, hypomodified Trm9/Trm112 substrates tRNA<sup>Arg</sup><sub>UCU</sub> and tRNA<sup>Glu</sup><sub>UUC</sub> resulted in significant repression of protein expression for transcripts enriched in AGA or GAA codons (Deng et al., 2015). Analogous to two-subunit complexes discussed already, single yeast mutants of trm112 or trm9 each lack mcm5U34-tRNAs (Kalhor and Clarke, 2003; Mazauric et al., 2010; Chen et al., 2011), while trm112 or trm11 mutants each lack m2G10-tRNAs (Purushothaman et al., 2005; Okada et al., 2009).
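As a rough illustration of what "enriched in AGA or GAA codons" could mean in practice, the following Python sketch computes the in-frame fraction of those two codons in a coding sequence. It is not the analysis used by Deng et al. (2015); the function name `codon_fraction` and the toy sequence are hypothetical, and any enrichment threshold would have to be chosen separately.

```python
def codon_fraction(cds, codons=("AGA", "GAA")):
    """Fraction of in-frame codons in `cds` that belong to `codons`.

    `cds` is a coding sequence read from its first nucleotide; DNA (T) and
    RNA (U) spellings are both accepted. Trailing bases that do not fill a
    complete codon are ignored. Hypothetical helper for illustration only.
    """
    seq = cds.upper().replace("T", "U")
    targets = {c.upper().replace("T", "U") for c in codons}
    # Split the sequence into consecutive, non-overlapping triplets.
    triplets = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    if not triplets:
        return 0.0
    hits = sum(1 for t in triplets if t in targets)
    return hits / len(triplets)


# Toy sequence (assumption, not a real gene): AUG AGA GAA UUU UAA
toy_cds = "AUGAGAGAAUUUUAA"
print(codon_fraction(toy_cds))
```

Transcripts could then be ranked by this fraction and compared with their change in protein output in a mutant strain, which is the kind of comparison the paragraph above describes.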

Work in E. coli showed that co-expression of S. cerevisiae Trm9 and Trm112 is necessary to produce an active enzyme, as singularly purified catalytic subunit Trm9 is inactive (Mazauric et al., 2010; Chen et al., 2011). A crystal structure of the Yarrowia lipolytica Trm9/Trm112 complex, which shares 50–60% identity to the S. cerevisiae enzyme, has been reported (Letoquart et al., 2015). Co-crystal structures with substrate SAM or tRNA have not been obtained, although putative SAM-binding residues of the catalytic subunit (Trm9) have been mapped (Letoquart et al., 2015). The yeast Trm9/Trm112 association is stabilized in part by a β-zipper formed between parallel β-sheets of Trm9 and Trm112. Additional inter-subunit contacts bury a large hydrophobic region of Trm9, likely improving its solubility, and thus the in vitro methyltransferase activity of Trm9/Trm112 (Letoquart et al., 2015).

The C-terminus of the multi-domain protein methyltransferase ALKBH8 contains a subdomain with high sequence similarity to Trm9, sufficient for formation of mcm5U34-style modifications at the tRNA wobble position. ALKBH8 contains additional functional domains: an N-terminal RNA recognition motif, an AlkB-related domain, and a zinc finger region upstream of the C-terminal Trm9 orthology region (Fu et al., 2010; Songe-Moller et al., 2010; Leihne et al., 2011). Additional proteins with similarity to ALKBH8 exist in plants and protozoans, but many lack the Trm9-like methyltransferase domain (Zdzalik et al., 2014). ALKBH8 homologs with Trm9-like domains in Mus musculus and Arabidopsis thaliana maintain a strict requirement for Trm112 association to form mcm5U-tRNA (Songe-Moller et al., 2010; Leihne et al., 2011), whereas the requirement for human ALKBH8 partnering with a Trm112 ortholog for wobble site methyltransfer has not been formally tested. Binding experiments have been performed, where substrate mimetic anticodon stem loop sequences bind more tightly to ALKBH8 in the presence of Trm112 (Pastore et al., 2012). Curiously, full length in vitro transcribed tRNA sequences showed no enhancement of ALKBH8 binding in the presence of Trm112 (Pastore et al., 2012). Specific examination of what impact the RNA recognition motif has on human ALKBH8 tRNA substrate specificity versus just the Trm9-like domain with Trm112 may prove insightful, especially as the substrate tRNA pool appears to have expanded in higher order eukaryotes. More detailed descriptions of molecular interactions between Trm9/Trm112 complexes and mcm5U-tRNA substrates remain outstanding.

The requirement of the S. cerevisiae Trm11 catalytic subunit to associate with Trm112 for m2G<sup>10</sup> methyltransferase activity has been shown in recombinant, purified protein mixtures and wheat germ cell-free assays (Purushothaman et al., 2005; Okada et al., 2009). Currently, no structure of an intact Trm11/Trm112 complex has been reported. Eukaryotes and archaea have readily identifiable Trm11 homologs, which are otherwise absent from bacteria. An archaeal version of the catalytic subunit Trm11 from Thermococcus kodakarensis with bound SAM has been crystallized (Hirata et al., 2016). Detailed substrate interaction studies of yeast Trm11/Trm112 complexes showed that Trm112 improves Trm11 binding affinity for SAM and tRNA substrate (Bourgeois et al., 2017b). Whether Trm112 directly interacts with tRNA in the context of a Trm11/Trm112 complex, similar to Trm61/Trm6, or solely provides allosteric support for Trm11 tRNA binding, as proposed for Trm8/Trm82, remains an open question.

As previously noted Trm11 homologs are absent from bacteria, while Trm9 mcm5U-type modifications are exclusive to eukaryotes, yet Trm112 is broadly conserved across every domain (van Tran et al., 2018). Trm112 pairs with additional protein partners beyond Trm9 and Trm11 to form two-subunit methyltransferases whose substrates include ribosomal RNA or translation release factors, a topic well-reviewed elsewhere (Guy and Phizicky, 2014; Bourgeois et al., 2017a). In formation of active tRNA methyltransferase complexes it is likely that cellular Trm112 levels are limiting (Ghaemmaghami et al., 2003; Studte et al., 2008; Sardana and Johnson, 2012). Trm112 copurifies stoichiometrically with Trm11 (Bourgeois et al., 2017b), and overexpression of Trm11 in yeast decreases the amount of Trm112 that co-immunoprecipitates with Trm9 (Studte et al., 2008). Biologically relevant conditions that depend on the competition between these catalytic subunits for Trm112 await discovery.

### Trm7/Trm732 and Trm7/Trm734

The catalytic subunit Trm7, a 2′-O ribose methyltransferase, acts at nucleotides C<sup>32</sup> or N<sup>34</sup> dependent upon cytoplasmic association with partner subunits Trm732 or Trm734, respectively (**Figure 1**). Nm<sup>32</sup> and Nm<sup>34</sup> modifications are observed in all three domains. Eukaryotic tRNAs that contain Nm<sup>32</sup> and Nm<sup>34</sup> likely rely on Trm7/Trm732 or Trm7/Trm734 homologs, which are mostly but not entirely conserved throughout deposited sequences of eukaryotic genomes. Nm<sup>32</sup> and Nm<sup>34</sup> in archaea and bacteria are catalyzed by methyltransferases not obviously related to Trm7, Trm732 or Trm734. Nm<sup>32</sup> modification is catalyzed by TrmJ-family members in bacteria and archaea (Purta et al., 2006; Somme et al., 2014), while TrmL forms Nm<sup>34</sup> in bacterial tRNAs (Benitez-Paez et al., 2010) and box C/D small nucleolar ribonucleoprotein complexes form Nm<sup>34</sup> in archaeal tRNAs (Clouet d'Orval et al., 2001; Nolivos et al., 2005; Joardar et al., 2011). One of these archaeal Nm<sup>34</sup> modification complexes was shown to use the excised intron from a processed tRNATrp transcript as the guide RNA to direct Nm<sup>34</sup> of pre-tRNATrp substrates (Clouet d'Orval et al., 2001).

In yeast and other eukaryotes, certain tRNAs contain both Cm<sup>32</sup> and Nm<sup>34</sup> modifications in tandem. The most broadly conserved tandem methylated substrates are tRNAPhe species. In yeast, lack of Trm7-modified tRNAPhe activates the general amino acid control starvation response (Han et al., 2018), whereas specific mutant lesions of the human ortholog FTSJ1 are linked to non-syndromic X-linked intellectual disability (Guy et al., 2015). Yeast form tandem Cm<sup>32</sup> and Nm<sup>34</sup> on additional substrates tRNA<sup>Leu</sup><sub>UAA</sub> and tRNA<sup>Trp</sup><sub>CCA</sub> (Guy and Phizicky, 2014). Evidence for the formation of separate Trm7/Trm732 or Trm7/Trm734 two-subunit complexes initially came from immunoblot pull down assays (Guy et al., 2012). Additional genetic and biochemical evidence support the hypothesis that Trm7/Trm732 or Trm7/Trm734 act as separate two-subunit complexes in 2′-O ribose methylation of C<sup>32</sup> or N<sup>34</sup> on yeast tRNA substrates, respectively (Guy et al., 2012). The predicted Trm7 ortholog in humans, FTSJ1, requires the cognate human homolog THADA for Cm<sup>32</sup> activity. However, S. cerevisiae Trm732 is able to functionally complement FTSJ1 in the absence of THADA (Guy and Phizicky, 2015). The Trm732 partner protein contains a conserved domain of unknown function as well as multiple armadillo-like helical domains, a structural fold generally important for protein and nucleic acid interactions. The other Trm7 partner, Trm734, is a WD-repeat protein similar to Trm82, the partner subunit of the Trm8 m7G<sup>46</sup> methyltransferase. More precise descriptions of interactions between subunits of Trm7/Trm732 and Trm7/Trm734 complexes, and of formed hetero-dimer complexes with substrate tRNAs, have yet to be articulated.

### Sequentially Ordered tRNA Methyltransferase Reactions

In most organisms, a tRNA sequence will contain well over a dozen post-transcriptional chemical modifications. An evergreen topic of discussion is whether modification enzymes act in a particular order, where modification at one site is informed by the modification status of other positions. In eukaryotes, compartmentalization of modification enzymes obviously results in the sequential order of some chemical transformations. For example, nascent tRNA transcripts are first modified in the nucleus; after export to the cytoplasm, tRNAs can be further modified. Because tRNAs can also be imported into organelles (chloroplast or mitochondria), these may receive other modifications in addition to those already obtained in the nuclear and cytoplasmic compartments. However, the sequential nature of certain tRNA methylation events cannot be explained by compartmentalization alone and may be an inherent property of enzymes that work in complexes or those that have achieved a heteromeric state. Alternatively, selection for increased specificity may be a leading factor in establishing sequentiality; such may be the case for complex modification pathways such as those for wybutosine and threonylcarbamoyl synthesis.

With few exceptions, tRNA sequences encode for a purine at position 37 that is almost always post-transcriptionally modified. Guanosine at position 37 can be methylated, with more complex conversion to wybutosine (yW) in tRNAPhe (Thiebe and Poralla, 1973; Noma et al., 2006). When adenosine is located at position 37 it may also be modified. In all three domains, threonylcarbamoyl can be found on N-6 of adenosine at position 37 (t6A37), or similarly isopentenyl (i6A37) modification may occur. In eukaryotic tRNAPhe, t<sup>6</sup>A<sup>37</sup> and i<sup>6</sup>A<sup>37</sup> are inhibitory of Trm7/Trm732 and Trm7/Trm734 Cm<sup>32</sup> and Gm<sup>34</sup> formation, and alternatively stimulate m3C<sup>32</sup> formation by Trm140, or related enzymes in certain tRNA substrates (Guy et al., 2012; Arimbasseri et al., 2016; Han et al., 2017). In E. coli, i<sup>6</sup>A<sup>37</sup> similarly blocks the formation of Cm or Um at position 34, a reaction catalyzed by the TrmL family of enzymes (Han et al., 2017; Sokolowski et al., 2018). The extremophile Thermus thermophilus provides evidence for apparent temperature sensitive methylation circuits; at high temperatures the absence of m7G<sup>46</sup> negatively impacts methylation at two other sites, Gm<sup>18</sup> and m1G<sup>37</sup> (Tomikawa et al., 2010). A thematically similar report showed pseudouridylation at position 55 (Ψ55) impacts the formation of Gm18, m1A58, and m<sup>5</sup>s<sup>2</sup>U when cells are grown at lower temperatures (Ishida et al., 2011). Many more so-described sequential tRNA modification circuits exist beyond those covered here, and are reviewed elsewhere (Helm and Alfonzo, 2014; Maraia and Arimbasseri, 2017; Han and Phizicky, 2018; Sokolowski et al., 2018).

### tRNA EDITING BY DEAMINATION

### Adenosine to Inosine (A-to-I) Editing

Chemical deamination of A-to-I in RNA sequences was observed 30 years prior to the discovery of any hydrolytic deaminase responsible for A-to-I activity that could act on polynucleotides (Bass and Weintraub, 1988). Since then, RNA adenosine deaminases have historically been divided into two broad classes based on their substrates: adenosine deaminases acting on mRNAs (ADARs) or adenosine deaminases acting on tRNAs (ADATs). Here we will focus exclusively on the tRNA deaminases (**Figure 2**). Inosine containing tRNAs are present in all domains of life; often tRNA A-to-I editing is essential. Unlike ADARs, which are generally promiscuous in A-to-I deamination of their targets (Bajad et al., 2017), ADATs show a more restricted A-to-I editing specificity, limited to three different tRNA sites: position 34 (wobble-position), 37 (3′ of the anticodon triplet) or 57 (in the TΨC-loop).

Inosine at the first position of the anticodon (I34) is essential in both eukaryotes and bacteria, but it does not occur in archaea (**Figure 2**). In eukaryotes, depending on the organism, roughly seven to eight cytoplasmic tRNAs have I34, while bacteria use I<sup>34</sup> only in tRNAArg (Grosjean et al., 1996; Sprinzl et al., 1998). I<sup>34</sup> increases the decoding capacity of tRNAs, allowing a single tRNA to decode three different codons (ending in U, C or A) and thus minimizing the number of necessary tRNA sequences that need to be genomically encoded. In addition, certain aminoacyl tRNA synthetases recognize and require the presence of I<sup>34</sup> in tRNA substrates for productive aminoacylation (Droogmans and Grosjean, 1991; Senger et al., 1997; Gerber et al., 1998; Sprinzl et al., 1998; Losey et al., 2006). In bacteria, nearly all C-ending codons are read by tRNAs with an encoded G at position 34, except for tRNAArg which contains I34. Thus, despite a significantly more restricted pool of substrate tRNAs, I<sup>34</sup> remains essential in bacteria.
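The expansion in decoding capacity described above can be made concrete with a small Python sketch. It assumes the simplified Crick wobble rules (A34 reads U, G34 reads C or U, U34 reads A or G, C34 reads G, I34 reads U, C or A) and ignores the effects of other modifications; the function name `codons_read` and the anticodon ACG used as input are illustrative choices, not taken from the text.

```python
# Simplified wobble rules (assumption): base at anticodon position 34 mapped
# to the codon third-position bases it can pair with.
WOBBLE = {
    "A": "U",
    "C": "G",
    "G": "CU",
    "U": "AG",
    "I": "UCA",  # inosine reads codons ending in U, C or A
}
# Strict Watson-Crick pairing for anticodon positions 35 and 36.
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}


def codons_read(anticodon):
    """List codons (5'->3') readable by an anticodon given 5'->3' (34-35-36)."""
    n34, n35, n36 = anticodon
    first = WATSON_CRICK[n36]   # codon position 1 pairs with anticodon position 36
    second = WATSON_CRICK[n35]  # codon position 2 pairs with anticodon position 35
    return sorted(first + second + third for third in WOBBLE[n34])


# Illustrative arginine anticodon before and after A34-to-I34 editing.
print(codons_read("ACG"))
print(codons_read("ICG"))
```

Under these assumptions the unedited anticodon reads a single codon while the inosine-containing form reads three, which is the saving in genomically encoded tRNAs that the paragraph refers to.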

A34-to-I<sup>34</sup> deamination of eukaryotic tRNAs is catalyzed by the heterodimeric enzyme ADAT2/ADAT3 (or Tad2/Tad3), which requires association between two paralogous subunits for activity, while bacteria rely on the homodimeric enzyme ADATa (or TadA) (Auxilien et al., 1996; Wolf et al., 2002). In vitro, ADATa can efficiently recognize and deaminate a minimal substrate derived from the tRNAArg anticodon arm, while ADAT2/ADAT3 requires the entire tRNA for activity (Elias and Huang, 2005; Kuratani et al., 2005; Kim et al., 2006). Of special note, plant chloroplasts also contain a single tRNAArg that undergoes A-to-I editing and relies on an ADATa-like enzyme, an observation consistent with the endosymbiotic theory of eukaryotic mitochondrial evolution (Delannoy et al., 2009; Karcher and Bock, 2009), but in general, mitochondria-encoded tRNAs do not contain inosine.

I<sup>37</sup> and I<sup>57</sup> are less widespread within organisms, where I<sup>37</sup> is found in tRNAAla of certain eukaryotes (Gerber et al., 1998; Maas et al., 1999), and I<sup>57</sup> has only been observed in tRNAs from archaea (Yamaizumi et al., 1982; Grosjean et al., 1995, 1996). Generally, inosine at positions 37 and 57 can also be observed as methyl modified (m<sup>1</sup>I) as discussed later in Section "m<sup>1</sup>I Formation in Eukaryotes Versus Archaea." I<sup>37</sup> is formed by ADAT1, a homodimeric enzyme that shares key conserved residues with other deaminases: a conserved histidine and two cysteines that coordinate a catalytic Zn<sup>2+</sup>, as well as a conserved glutamate that participates in the final chemical step of inosine formation (Gerber et al., 1998; Gerber and Keller, 1999; Maas et al., 1999; Losey et al., 2006). I<sup>37</sup> editing does not expand the decoding capacity of target tRNAs and its biological significance remains unclear. However, because of the importance of modifications at position 37 of the anticodon loop for reading-frame maintenance during translation, it is safe to assume a similar role for m<sup>1</sup>I37. The biological role of I<sup>57</sup> in archaea is equally cryptic but its position in the backbone of the tRNA suggests a structural role.

All polynucleotide deaminases belong to the cytidine deaminase superfamily and require Zn<sup>2+</sup> for activity. However, phylogenetic analysis has revealed that ADAT1 closely aligns with mRNA specific ADARs, while ADAT2/ADAT3, responsible for inosine formation at position 34, strikingly resembles nucleic acid cytidine deaminases such as AID (activation-induced deaminases) or APOBEC (the apolipoprotein B mRNA editing enzyme) (Gerber and Keller, 2001). Recent investigation of the fungus Fusarium graminearum provides evidence that ADARs may have evolved from ADAT-like enzymes, as F. graminearum lacks an obvious ADAR homolog, yet still contains A-to-I edited mRNAs (Wang C. et al., 2016; Bajad et al., 2017). These authors propose that ADATs may be responsible for A-to-I editing during the sexual life cycle of F. graminearum, which further suggests the double-stranded RNA binding domain found in ADARs was gained during the evolution from an ADAT-like ancestor, which lacks this domain (Gerber et al., 1998; Gerber and Keller, 1999; Maas et al., 1999; Losey et al., 2006). If true, it is equally possible that certain annotated ADATs may deaminate RNA substrates other than tRNAs, perhaps in complex with additional factors.

### Cytidine to Uridine (C-to-U) tRNA Editing in Archaea and Eukarya

Certain archaea and eukaryotes contain C-to-U edited tRNAs, but in many instances the enzymes responsible remain to be identified (**Figure 3**) (Janke and Paabo, 1993; Marchfelder et al., 1996; Alfonzo et al., 1999; Fey et al., 2002; Randau et al., 2009; Grewe et al., 2011). In the archaeon Methanopyrus kandleri, CDAT8, a cytidine deaminase homolog, catalyzes C-to-U conversion at position 8 of tRNAs (Randau et al., 2009). For many tRNAs an encoded U at position 8 forms a Hoogsteen base pair with A<sup>14</sup> to assist in the proper folding of mature L-shaped tRNAs (Romby et al., 1985). However, 30 out of 34 tRNAs that contain A<sup>14</sup> in M. kandleri are encoded with a cytidine at position 8 that must be deaminated by CDAT8 to form U<sup>8</sup> (Randau et al., 2009) and ensure proper tRNA folding. The reason for having such a large number of tRNAs that require C-to-U editing in M. kandleri, rather than encoding U<sup>8</sup>-containing transcripts, remains obscure. One probable answer might be the extreme environment in which M. kandleri lives, which favors G:C pairing in the DNA for optimal genome stability, yet still requires C-to-U editing for proper tRNA folding (Randau et al., 2009).

C-to-U editing of tRNAs in eukarya was first reported in protists, plants and marsupial mitochondria (Janke and Paabo, 1993; Lonergan and Gray, 1993a,b; Marechal-Drouard et al., 1996). However, the marsupial system provided the first example of C-to-U editing at the anticodon nucleotides. These organisms do not encode a tRNA for decoding mitochondrial aspartate codons. To solve this issue, the anticodon of tRNA<sup>Gly</sup><sub>GCC</sub> is C-to-U edited to tRNA<sup>Asp</sup><sub>GUC</sub>, which is recognized by the mitochondrial aspartyl tRNA synthetase to produce a functional ortholog to tRNAAsp (Janke and Paabo, 1993; Borner et al., 1996). This quirk of marsupial mitochondria paves the way for identification of similar mechanisms in nature, where an encoded tRNA gene can be edited to function as an alternative aminoacyl acceptor.

Leishmania tarentolae and Trypanosoma brucei, representative kinetoplastids, offer the only other example of anticodon C-to-U editing in tRNA (Alfonzo et al., 1999; Charriere et al., 2006). As in other organisms, the mitochondrial genome lacks certain tRNA genes, and the corresponding tRNAs must be actively imported from the cytoplasm (Paris et al., 2009). In some instances, the mitochondrial genetic code is not compatible with nuclear-encoded tRNAs, as is the case when UGA is decoded as tryptophan in mitochondria rather than as a stop codon (Barrell et al., 1979). In L. tarentolae and T. brucei, UGA is used as a tryptophan codon in mitochondria, while the nucleus contains only a single-copy tRNATrp CCA gene to decode the canonical UGG codons. After import of tRNATrp CCA into the mitochondria, position 34 is C-to-U edited to create tRNATrp UCA as part of the mechanism that reassigned UGA codons from stop to tryptophan. The enzyme responsible for this essential editing event is still unknown, but a deamination mechanism is presumed (Alfonzo et al., 1999; Charriere et al., 2006). Interestingly, only approximately 40–50% of the tRNATrp is edited after transport into the trypanosome mitochondria, raising the question of how this balance is maintained. The answer partly rests on the unusual thiolation at U<sup>33</sup> in this tRNA, which negatively impacts C-to-U editing of C<sup>34</sup> (Wohlgamuth-Benedum et al., 2009).
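Why editing of position 34 expands decoding from UGG to UGA can be sketched as follows. The wobble table is a deliberately simplified assumption (classical wobble rules for unmodified bases: C34 reads G, U34 reads A or G) and ignores the effects of base modifications; `codons_read` is a hypothetical helper name.

```python
# Watson-Crick complements for anticodon positions 35 and 36.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Simplified wobble rules for position 34 (assumption: unmodified bases only).
WOBBLE_34 = {"C": "G", "U": "AG", "G": "CU", "A": "U"}


def codons_read(anticodon: str) -> list[str]:
    """List the codons (5'->3') readable by an anticodon written 34-35-36."""
    first = COMPLEMENT[anticodon[2]]    # codon position 1 pairs with position 36
    second = COMPLEMENT[anticodon[1]]   # codon position 2 pairs with position 35
    return sorted(first + second + third for third in WOBBLE_34[anticodon[0]])


print("CCA (unedited):", codons_read("CCA"))
print("UCA (edited):  ", codons_read("UCA"))
```

With these rules the imported CCA anticodon reads only UGG, while the edited UCA anticodon reads both UGA and UGG, consistent with the partial editing described above leaving a pool of tRNA for each codon.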

In the mitochondria of dicotyledon plants, C-to-U editing takes place outside the anticodon region and does not directly influence the decoding capacity of tRNAs (Marechal-Drouard et al., 1996; Fey et al., 2002; Ichinose and Sugita, 2016). In potato mitochondria, C-to-U editing corrects an encoded C<sup>4</sup>:A<sup>69</sup> mismatch in 5′-processed pre-tRNAPhe GAA to U<sup>4</sup>:A<sup>69</sup> (Binder et al., 1994; Marechal-Drouard et al., 1996). After editing, the 5′-processed pre-tRNAPhe GAA can fold properly, resulting in efficient removal of the 3′ trailer by mitochondrial RNase Z (Kunzmann et al., 1998). In a modeling study of the quillwort Isoetes engelmannii, extensive C-to-U editing was predicted at 43 possible tRNA sites (Grewe et al., 2011; Ichinose and Sugita, 2016). The authors obtained cDNA sequence data for 36 such sites, and 29 of them showed C-to-U editing (Grewe et al., 2011). Interestingly, four sites showed U-to-C conversion, which would require an amination step, likely a transamination reaction (Grewe et al., 2011). Other thematically similar C-to-U editing events occur in plant mitochondrial tRNAs, and these are well reviewed elsewhere (Fey et al., 2002; Paris et al., 2012).

FIGURE 3 | Cytidine-to-uridine (C-to-U) editing of tRNAs. (A) Shows the different C-to-U edited nucleotide positions that have been described in different tRNAs of different organisms. The organism and the nucleotide position along with tRNA identity are shown in the right panel. The enzyme identity is presented over the arrow. The gray panel denotes all known C-to-U editing events occurring in the anticodon loop. (B) Shows the C-to-U deamination reaction. (C) Depicts the conserved active-site residues in the tRNA C-to-U deaminase from archaea (CDAT8). (D) Depicts the active-site domain of both the ADAT2/3 and Trm140a of T. brucei; these enzymes act interdependently.

Position 32 of the anticodon stem loop is another site where tRNAs from multiple domains contain C-to-U conversions. It was first described in T. brucei, where all three tRNAThr isoacceptors undergo C-to-U editing at position 32 (Rubio et al., 2006; Gaston et al., 2007). Trypanosoma brucei ADAT2/ADAT3, previously discussed in the context of A-to-I editing at position 34, has been shown to perform C-to-U editing of tRNAThr AGU at position 32 (Rubio et al., 2007, 2017). However, this editing event first requires methylation at this site, discussed in more detail in Section "m3C-to-m3U." Within the ADAT2/ADAT3 heterodimeric complex, the C-terminal region of ADAT2 assists in tRNA binding, while ADAT3 plays a structural role in formation of the catalytic deaminase core (Rubio et al., 2007), an arrangement similar to that of the two-subunit methyltransferases discussed in Section "Interdependent Methyltransferases." The biological significance of this editing is not entirely clear: some evidence indicates that it is important for protein synthesis (Rubio et al., 2017), yet C<sup>32</sup> methylation and editing do not impact aminoacylation efficiency. Given its position in the anticodon loop, any effect on protein synthesis is likely due to a role in translational efficiency or accuracy. Similar editing events have recently been described in Arabidopsis thaliana, where tRNASer AGA and tRNASer GCU are C-to-U edited at position 32 in the nucleocytoplasmic compartment by a presently unknown enzymatic mechanism (Zhou et al., 2014).

### INTERDEPENDENT METHYLATION AND EDITING AND ITS BIOLOGICAL RELEVANCE

#### m<sup>1</sup>I Formation in Eukaryotes Versus Archaea

As discussed in Section "Adenosine to Inosine (A-to-I) Editing," A-to-I editing is found at three tRNA sites: positions 34, 37, and 57. Intriguingly, inosines at positions 37 and 57 can be methylated to form m<sup>1</sup>I<sup>37</sup> and m<sup>1</sup>I<sup>57</sup> by distinct chemical pathways (Yamaizumi et al., 1982; Grosjean et al., 1995, 1996). At position 37, after ADAT1 converts A to I, SAM-dependent Trm5 can act directly on N-1 of inosine (Gerber et al., 1998; Maas et al., 1999; Brule et al., 2004; Macbeth et al., 2005). Because Trm5 is essential in most organisms for catalyzing the generation of m1G37, the biological significance of its involvement in the modification of edited I<sup>37</sup> remains unclear (Paris et al., 2013). The opposite order of events occurs at position 57 in archaea, where a SAM-dependent TrmI-family member must first methylate A<sup>57</sup> before it becomes a substrate for deamination to inosine (Yamaizumi et al., 1982; Grosjean et al., 1995, 1996). The enzyme, or enzymatic complex, responsible for the m1A57-to-m<sup>1</sup>I<sup>57</sup> conversion has not been identified, and the biological significance of this modification remains to be articulated. Methylation followed by editing also occurs in specific m3C-to-m3U conversions.

#### m3C-to-m3U

In trypanosomes, downregulation of ADAT2/ADAT3 expression reduces both A-to-I editing at position 34 and C-to-U editing at position 32 of tRNAThr AGU (Rubio et al., 2007). It was originally hypothesized that ADAT2/ADAT3, which has clear sequence similarity with cytidine deaminases, may be responsible for both reactions; however, initial attempts to reconstitute C-to-U editing with T. brucei ADAT2/ADAT3 were not successful (Rubio et al., 2007). Since C<sup>32</sup> of tRNAThr AGU is methylated to form m3C32, this suggested that C-to-U editing could require methylation prior to deamination (**Figure 4**) (Rubio et al., 2017), as discussed previously for archaeal m1A57-to-m<sup>1</sup>I<sup>57</sup>. The methyltransferase Trm140 was later identified as responsible for formation of m3C<sup>32</sup> (D'Silva et al., 2011; Noma et al., 2011) and, true to expectations, Trm140 and ADAT2/ADAT3 were shown to work interdependently to convert m3C<sup>32</sup> to m3U<sup>32</sup> in vitro (Rubio et al., 2017). Further experimentation also showed that Trm140 and ADAT2/ADAT3 are likely to work sequentially and as a complex (Rubio et al., 2017). How these two enzymes converge on a single RNA substrate and act at the same nucleotide position remains an open question. However, recent studies demonstrated that the two enzymes bind their tRNA substrate synergistically, whereby binding affinity increases significantly if both proteins are present in the reaction (McKenney et al., 2018).

ADAT2/ADAT3 can deaminate DNA in vitro and in vivo, and this mutagenic activity is dampened through association with Trm140 (Rubio et al., 2007, 2017). By extension, a similar mechanism may inform how the cytidine deaminase AID specifically targets genes that encode immunoglobulin receptors while leaving the rest of the genome unaffected during B cell somatic hypermutation (Teng and Papavasiliou, 2007). It is likely that additional protein factors influence the specificity of AID toward its substrate genetic loci.

### CONCLUDING REMARKS

Much has been written about the mechanisms that lead to, and determine the fate of, duplicated genes. It is clear that once gene duplication occurs, one copy is free to mutate via genetic drift, to accumulate mutations perhaps by a "neutral evolutionary ratchet" (Covello and Gray, 1993; Gray et al., 2010; Lukes et al., 2011). These mutations, in turn, can lead to total loss of function of the duplicated gene and the creation of pseudogenes. Alternatively, and more importantly, duplication may provide a powerful route to neofunctionalization or subfunctionalization of genes (Stoltzfus, 1999). The former makes the mutated duplicate gene acquire new functions different from that of the ancestral gene; the latter leads to a partitioning of labor so that each copy now carries a subset of the functions originally performed by the ancestral state. In this review, we have focused on various examples of modification enzymes, where evolution has pushed the system into one of the categories above. For example, in the case of the Trm61/Trm6 methyltransferase, sequence comparisons strongly suggest that these are paralogs. Here, the neutral acquisition of mutations without obvious gains in fitness led to a level of divergence that at some point may have become important in expanding substrate specificity, thus diversifying the number of targets the new enzyme could methylate.

In the case of Trm7/Trm732 and Trm7/Trm734, no evidence exists for gene duplication; neither Trm732 nor Trm734 has significant sequence conservation with known methyltransferases. Instead, they share similarity with other protein families, for example Trm734 with WD-domain family proteins. It may be that independent stochastic mutations accumulated in each gene, leading to the evolution of a dimerization interface that allowed each protein to interact with the Trm7 partner. In doing so, differential complex formation permitted a novel division of labor; one complex now targets position 32 of tRNAs and the other position 34. This is, of course, assuming that the catalytic subunit Trm7 derives from an ancestral gene that at some point was able to efficiently methylate both positions.

Similar arguments can be made for the tRNA A-to-I deaminase of trypanosomes, TbADAT2/TbADAT3, a heterodimeric enzyme composed of two subunits encoded by clear paralogs (Rubio et al., 2007). Accumulation of mutations in one or both copies may have forced each paralog to strictly rely on the other for activity. Such cases echo themes observed in enzyme-prozyme complexes (Nguyen et al., 2013; Volkov et al., 2016), where in polyamine biosynthesis a catalytically dead paralog regulates the activity of an active paralogous enzyme. We (Gaston et al., 2007; Rubio et al., 2007) and others (Elias and Huang, 2005) have argued that, at least in the case of ADAT2/ADAT3 in eukaryotes, gene duplication led to the expansion of the enzyme's specificity toward more substrates. For example, ADATa, the homologous enzyme from bacteria, is active as a homodimer, targets a single tRNAArg in vivo, and accepts a "minimalist" tRNAArg anticodon stem loop as a sufficient substrate in vitro (Wolf et al., 2002). Not surprisingly, the co-crystal structure of ADATa shows that residues near or at the active site are necessary for RNA binding (Losey et al., 2006). The eukaryotic counterpart ADAT2/ADAT3 recognizes seven to eight different tRNAs depending on the organism, but is only active on full-length tRNAs (Auxilien et al., 1996). Years ago, we showed that one of the critical RNA binding domains of the T. brucei enzyme lies at the C-terminus of one subunit and that residues near the active site contribute minimally to substrate binding (Ragone et al., 2011). Thus, movement of critical binding residues away from the active site increased active site flexibility to allow for the observed expansion in substrate specificity. Again, such subtle, yet important, evolutionary changes in eukaryotic tRNA deaminases are only made possible by gene duplication and the previously proposed effects of "constructive neutral evolution" (Covello and Gray, 1993; Stoltzfus, 1999).

Finally, consider m3C/m3U editing and methylation at position 32 of several tRNAs in T. brucei (**Figure 4**), which has also been described in plants. The TbTrm140 m3C methyltransferase does not result from an obvious duplication of a deaminase gene, and vice versa. While both enzymes come together to form a stable, active complex in the nucleus, the TbADAT2/TbADAT3 heterodimer is also active in the cytoplasm as a free enzyme catalyzing essential A-to-I deaminations (Rubio et al., 2017). In this particular case, both enzymes accumulated mutations in a neutral fashion, likely independently of each other. The question is how TbTrm140 could accumulate mutations without causing deleterious effects on the organism. The answer may involve TbMtase37, a paralog of Trm140 within the T. brucei genome that is of currently unknown function (Fleming et al., 2016). This paralog might provide the necessary duplicate of the essential function, which allowed TbTrm140 to drift mutationally and neofunctionalize with a seemingly unrelated enzyme like TbADAT2/TbADAT3 to modify and edit new substrates. What makes this case unusual is that both enzymes have all the conserved residues required for activity and both may be active on other substrates, yet by themselves they are totally inactive for methylation and deamination of position 32 of T. brucei tRNAThr. We thus introduce the concept of "enzyme coactivation," whereby enzymes that are active with some substrates but inactive with others gain new function upon their association and indeed co-activate each other.

Neo- or sub-functionalization includes a combination of nonadaptive and, later, adaptive mutations, as originally suggested by Gray et al. (2010). Regardless of what factors or mechanisms are at play, the question remains: do the examples in this review confer fitness gains on the organisms? At least in trypanosomes, we have long appreciated the interdependent nature of RNA editing and modification, under the hypothesis that these events fine-tune translation to the ever-changing environmental conditions during the life cycle of these parasites (Paris et al., 2012). Interdependent modification and editing may serve to maintain levels of edited and unedited tRNAs in response to changes in environment or life stages. The same could be true of many other modifications and the subsequent use of alternative substrates (Helm and Alfonzo, 2014); enzyme coactivation may provide an additional level of "tunability" to ensure fast responses to ever-changing growth conditions, not only in response to stress, but also in maintenance of general cell homeostasis.

### AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

### FUNDING

This article was funded in part by grants R56AI131248 and RO1GM132254 to JA and a Center of RNA Biology Postdoctoral Fellowship to JH.

### ACKNOWLEDGMENTS

We thank all members of the Alfonzo laboratory for useful discussions and comments.

### REFERENCES

Zhou, W., Karcher, D., and Bock, R. (2014). Identification of enzymes for adenosine-to-inosine editing and discovery of cytidine-to-uridine editing in nucleus-encoded transfer RNAs of Arabidopsis. Plant Physiol. 166, 1985–1997. doi: 10.1104/pp.114.250498

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Dixit, Henderson and Alfonzo. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Codon-Specific Translation by m1G37 Methylation of tRNA

Ya-Ming Hou\*, Isao Masuda and Howard Gamper

Department of Biochemistry and Molecular Biology, Thomas Jefferson University, Philadelphia, PA, United States

Although the genetic code is degenerate, synonymous codons for the same amino acid are not translated equally. Codon-specific translation is important for controlling gene expression and determining the proteome of a cell. At the molecular level, codon-specific translation is regulated by post-transcriptional epigenetic modifications of tRNA, primarily at the wobble position 34 and at position 37 on the 3′-side of the anticodon. Modifications at these positions determine the quality of codon-anticodon pairing and the speed of translation on the ribosome. Different modifications operate in distinct mechanisms of codon-specific translation, generating a diversity of regulation that was previously unanticipated. Here we summarize recent work that demonstrates codon-specific translation mediated by the m1G37 methylation of tRNA at CCC and CCU codons for proline, an amino acid that has unique features in translation.

#### Edited by:

Akio Kanai, Keio University, Japan

#### Reviewed by:

Hiroyuki Hori, Ehime University, Japan

Kozo Tomita, University of Tokyo, Japan

#### \*Correspondence:

Ya-Ming Hou ya-ming.hou@jefferson.edu

#### Specialty section:

This article was submitted to RNA, a section of the journal Frontiers in Genetics

Received: 21 November 2018 Accepted: 20 December 2018 Published: 10 January 2019

#### Citation:

Hou Y-M, Masuda I and Gamper H (2019) Codon-Specific Translation by m1G37 Methylation of tRNA. Front. Genet. 9:713. doi: 10.3389/fgene.2018.00713

Keywords: synonymous codons, methyl transferases TrmD and Trm5, codon-anticodon pairing interaction, ribosomal +1-frameshifts and stalling, protein synthesis

## INTRODUCTION

The redundancy of the genetic code offers an opportunity for fine-tuning of protein synthesis at virtually every codon position. Although the sequence of codons in an mRNA provides the template for translation into the amino acid building blocks of the protein, synonymous codons are not translated equally. Codon bias, which is defined by the frequencies of usage among synonymous codons, is a feature unique to each genome and each gene and can impact the fitness of each organism (Plotkin and Kudla, 2011). The choice of synonymous codons determines how these codons are differentially translated at the molecular level. Translation is controlled by tRNA species with the anticodon that is cognate to each codon. The quality of a codon-anticodon pairing interaction is determined not only by the level of the tRNA available for the codon (Ikemura, 1982; Komar, 2009; Nedialkova and Leidel, 2015), but also, and perhaps more importantly, by post-transcriptional modifications to the tRNA anticodon, either at nucleobases or backbones (Takai and Yokoyama, 2003; Armengod et al., 2014; Grosjean and Westhof, 2016; Agris et al., 2018). These modifications are catalyzed by distinct enzymes that bear high relevance to human health and disease (Sarin and Leidel, 2014). Many of these enzymes are dedicated to modifications of the anticodon at position 34 (the wobble position) and at position 37 on the 3′-side of the anticodon. It is the nature of the post-transcriptional modifications at these anticodon-associated positions that determines differential pairing with synonymous codons, resulting in differential translation among these codons. The mechanism of codon-specific translation is therefore the underlying basis that determines how codon bias impacts the fitness of each organism. It is an emerging concept that directly influences the expression of the proteome at the cell level.

### THE m1G37-METHYLATION IN tRNA

While there are more than 100 post-transcriptional modifications in tRNA databases, the majority by themselves are not required for cell survival or growth. One exception is the N<sup>1</sup>-methylation of G37 on the 3′-side of the tRNA anticodon, generating m1G37 (**Figure 1A**), which as a single methylated nucleobase is not only essential for life but is also conserved in evolution, being present in all three domains of life (Bjork et al., 2001). In the bacterial domain, the biosynthesis of m1G37 is catalyzed by the tRNA methyl transferase TrmD (Hou et al., 2017), whereas in the eukaryotic and archaeal domains, it is catalyzed by Trm5 (Bjork et al., 2001). While both TrmD and Trm5 perform the same methyl transfer reaction, using S-adenosyl methionine (AdoMet) as the methyl donor, they are fundamentally different in structure: TrmD is a member of the SpoU-TrmD family (Anantharaman et al., 2002; Ahn et al., 2003; Elkins et al., 2003; Ito et al., 2015; Hori, 2017), whereas Trm5 is a member of the Rossmann-fold family (Goto-Ito et al., 2008, 2009). TrmD and Trm5 also differ in virtually all aspects of the reaction mechanism (Christian et al., 2004, 2006, 2010a,b, 2013, 2016; Christian and Hou, 2007; Lahoud et al., 2011; Sakaguchi et al., 2012, 2014). Due to the dependence on m1G37 for cell survival, TrmD is required for growth in several bacterial species, including Escherichia coli and Salmonella (Gamper et al., 2015a), and Trm5 is required for growth in the single-cell eukaryote Saccharomyces cerevisiae (Bjork et al., 2001), where it plays the important role of preventing mis-charging of tRNA (Perret et al., 1990). Additionally, the Trm5-dependent synthesis of m1G37 is present in both the cytosolic and mitochondrial compartments (Lee et al., 2007), and it is the initiation event that leads to further modifications to m<sup>1</sup>I37 (Brule et al., 2004) and to wyosine and derivatives in the cytosol (Urbonavicius et al., 2014). The molecular basis for how m1G37-tRNA is essential for cell survival has largely been elucidated in E. coli and Salmonella (Gamper et al., 2015a).

### MAINTENANCE OF PROTEIN SYNTHESIS READING FRAME BY m1G37-tRNA

In bacteria, all three tRNAPro species, which read CCN codons, contain m1G37. Elimination of m1G37 by inactivation of TrmD leads to accumulation of ribosomal +1-frameshifts (+1-shifts), particularly at sites where mRNA sequences contain successive Cs (Bjork et al., 1989). Unlike ribosomal mis-coding errors, which replace a correct amino acid with an incorrect one but maintain the protein synthesis reading frame, +1-shifts move the ribosome to the next nucleotide in the 5′ to 3′ direction of the mRNA, resulting in loss of the reading frame, premature termination of protein synthesis, and ultimately cell death. The ability of m1G37-tRNA to suppress ribosomal +1-shifts is an important finding that has prompted in-depth mechanistic studies. mRNA sequences in which the pyrimidine-ending CCC and CCU codons for proline (Pro) are followed by another pyrimidine, giving CC[C/U]-[C/U] sequence motifs, are particularly prone to +1-shifts (Yourno and Tanemura, 1970) and are often found within the first 15 codons of protein-coding genes (Gamper et al., 2015a), where ribosomal translation is highly sensitive to attenuation (Chen and Inouye, 1990). Additionally, some of the +1-shift-prone sequences are directly adjacent to the start codon (Gamper et al., 2015a), at the second codon position of the reading frame, where ribosomal translation is in the unique stage of transitioning from the initiation to the elongation phase.
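The CC[C/U]-[C/U] motif described above lends itself to a simple in-frame scan of an open reading frame. The following Python sketch is illustrative only: the function name `shift_prone_sites`, the default 15-codon window flag, and the toy ORF are assumptions made for this example and are not taken from the cited studies.

```python
def shift_prone_sites(orf: str, window: int = 15) -> list[tuple[int, str, bool]]:
    """Find in-frame CC[C/U] codons that are followed by a pyrimidine.

    Returns (codon_number, motif, within_window) tuples, where codon_number is
    1-based (the start codon is codon 1) and within_window flags sites that
    fall inside the first `window` codons.
    """
    rna = orf.upper().replace("T", "U")   # accept DNA or RNA input
    sites = []
    # Stop early enough that the nucleotide after each codon exists.
    for i in range(0, len(rna) - 3, 3):
        codon, next_base = rna[i:i + 3], rna[i + 3]
        if codon in ("CCC", "CCU") and next_base in "CU":
            codon_number = i // 3 + 1
            sites.append((codon_number, f"{codon}-{next_base}", codon_number <= window))
    return sites


# Toy ORF (hypothetical): AUG CCC CUG AAA CCU UAA
toy_orf = "AUGCCCCUGAAACCUUAA"
for site in shift_prone_sites(toy_orf):
    print(site)
```

In this toy sequence the scan reports a CCC-C motif at the second codon, the position next to the start codon that the text identifies as especially sensitive, and a CCU-U motif at the fifth codon.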

The maintenance of protein synthesis reading frame in normal cellular conditions is achieved with unexpectedly high fidelity. Despite the dynamics of each ribosome, which successively moves tRNA-mRNA complexes from the A- to P- to E-site, and despite the rapid rate of protein synthesis at 10–20 amino acids per second, +1-shifts typically occur less than once per 30,000 amino acids (Jorgensen and Kurland, 1990), a frequency at least 10-fold lower than that of other types of translational errors. In a genetic reporter system developed in E. coli, m1G37-tRNA was found to have the strongest suppression power when the shift-prone CCC-C sequence was placed next to the start codon (Gamper et al., 2015a). Lack of m1G37 in tRNAPro at this second-codon position increased +1-shifts by almost 10-fold in the reporter system, higher than the increases of +1-shifts at any other position throughout the reporter gene. An in vitro assay was then developed using reconstituted E. coli ribosomes, with the goal of assessing the rate of +1-shifts relative to the rate of peptide bond formation (Gamper et al., 2015a). Both fast and slow mechanisms were uncovered (Gamper et al., 2015a). The fast mechanism occurs on a timescale comparable to that of peptide bond formation. It takes place when tRNAPro, carrying the first peptide bond made at the A-site, is moving into the P-site by a process known as translocation (**Figure 1B**). Because a +1-shift by this mechanism competes with peptide bond formation on a similar timescale, its occurrence would compromise the accuracy of the reading frame. The slow mechanism occurs on a timescale 100-fold slower than peptide bond formation (Gamper et al., 2015a). It takes place when tRNAPro, carrying the first peptide bond, is sitting at the P-site next to an empty A-site (**Figure 1C**), which is usually the case during nutrient starvation that depletes charged aminoacyl-tRNAs. 
The model that has emerged from these mechanistic studies is that the ribosome is most prone to +1-shifts after it completes the first peptide bond formation and is moving tRNAPro from the A- to the P-site. This is also the transitioning point of the ribosome from the initiation phase (where the first peptide bond is made) to the elongation phase (where the second peptide bond will be made when the third codon enters the ribosome A-site). The finding that m1G37 has the strongest suppression of +1-shifts at the second codon position indicates its pivotal role in maintaining the reading frame during the transition point of the ribosome. Should m1G37 be eliminated, leading to ribosomal +1-shifts at the transition point, this would generate a prematurely terminated peptide in a non-productive and energetically costly translational error.

the +1-frameshift is based on similar base pairing stability with cognate codons CCC and CCU. The UGG isoacceptor contains the cmo5U34 modified base at the wobble position, which allows pairing with all four nucleobases.
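To put the quoted frameshift frequency and elongation rate in perspective, the short Python sketch below works through the arithmetic. It rests on loudly simplifying assumptions that are not from the source: the +1-shift frequency is taken at its upper bound of one per 30,000 residues, it is treated as uniform and independent at every codon, and the 300-residue protein length is hypothetical.

```python
import math

# Upper bound quoted in the text: fewer than one +1-shift per 30,000 amino acids.
SHIFT_FREQUENCY = 1 / 30_000

# Elongation rates quoted in the text (amino acids per second).
RATE_SLOW, RATE_FAST = 10, 20


def in_frame_fraction(length: int, per_residue: float = SHIFT_FREQUENCY) -> float:
    """Fraction of chains completed in frame, assuming independent events."""
    return (1 - per_residue) ** length


length = 300  # hypothetical protein length in residues
fraction = in_frame_fraction(length)
elevated = in_frame_fraction(length, 10 * SHIFT_FREQUENCY)

print(f"in-frame fraction at baseline:        {fraction:.4f}")
print(f"in-frame fraction at 10-fold higher:  {elevated:.4f}")
print(f"synthesis time: {length / RATE_FAST:.0f} to {length / RATE_SLOW:.0f} s")
```

Under these assumptions roughly 99% of 300-residue chains finish in frame at the baseline frequency. The 10-fold elevated case is shown only for scale; in the reporter system the increase was observed at a specific codon position, not uniformly across the gene.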

### CODON-SPECIFIC TRANSLATION BY m1G37-tRNA

Of the four CCN codons for Pro, CC[C/U] codons are most dependent on the presence of m1G37 (Jorgensen and Kurland, 1990), and the translation of these codons is noticeably slowed upon deficiency of the methylation (Bjork and Nilsson, 2003). The codon-anticodon interaction for CC[C/U] codons was thought to involve quadruplet base pairing, as implicated in previous studies (Hohsaka et al., 2001; Taki et al., 2002). However, there is no structural evidence of quadruplet pairing at the ribosomal A-site (Maehigashi et al., 2014), and the isolation of suppressor mutations of +1-shifts that are not in the tRNA sequence suggests that quadruplet pairing is unlikely (Qian et al., 1998; Farabaugh, 2000). More recent work favors a model of tRNA slippage by a triplet codon-anticodon pairing interaction (Qian et al., 1998), which is supported by a detailed kinetic analysis (Gamper et al., 2015a). In the triplet slippage model, the +1-shift-prone sequences CC[C/U]-[C/U] are each read by two isoacceptors of tRNAPro: the GGG isoacceptor exploits wobble pairing without additional modifications, whereas the UGG isoacceptor requires the 5-carboxymethoxy (cmo<sup>5</sup>) modification of U34 at the wobble position (Nasvall et al., 2004). In all cases, the codon-anticodon pairing interaction in the unshifted 0-frame and in the shifted +1-frame is similar (**Figures 1D,E**), indicating a minimal energetic penalty for the tRNA to shift. The role of m1G37 in suppressing the shift is to re-organize the structure of the anticodon loop to stabilize the pairing in the 0-frame (Maehigashi et al., 2014). The effect of m1G37 differs between the two isoacceptors: in the GGG isoacceptor, the methylation by itself is insufficient and requires the assistance of the elongation factor EF-P (Gamper et al., 2015a), whereas in the UGG isoacceptor it is dominant (Gamper et al., 2015a,b). In contrast, the dependence on m1G37 for reading-frame accuracy is much reduced with the purine-ending CC[G/A] codons for Pro, which are read by two isoacceptors of tRNAPro: the CGG isoacceptor for the CCG codon and the UGG isoacceptor for both codons. Notably, for these codons the codon-anticodon pairing interaction in the +1-frame is highly unfavorable relative to the 0-frame, indicating that CC[G/A] codons pair more stably with the anticodon in the 0-frame and are therefore less likely to shift and less sensitive to m1G37 deficiency. This comparison illustrates the notion that synonymous codons have different requirements for pairing with the anticodon, and that such differences are the underlying basis of codon-specific translation.
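The contrast between pyrimidine-ending and purine-ending Pro codons can be illustrated by counting the positions that remain paired when the anticodon slips by one nucleotide. The Python sketch below is a crude proxy rather than an energetic model: it assumes that Watson-Crick and G·U pairs count equally at all three positions, it ignores base modifications (so the cmo<sup>5</sup>U34-containing UGG isoacceptor is not modeled), and the helper names are hypothetical.

```python
# Pairs counted as "paired": Watson-Crick plus G-U wobble (simplifying assumption).
ALLOWED_PAIRS = {
    ("G", "C"), ("C", "G"), ("A", "U"), ("U", "A"), ("G", "U"), ("U", "G"),
}


def paired_positions(codon: str, anticodon: str) -> int:
    """Count paired positions between a codon and an anticodon (both 5'->3')."""
    # Pairing is antiparallel: codon position 1 faces anticodon position 36.
    return sum((c, a) in ALLOWED_PAIRS for c, a in zip(codon, reversed(anticodon)))


def frame_comparison(mrna4: str, anticodon: str) -> tuple[int, int]:
    """Return paired positions in the 0-frame and in the +1-frame.

    mrna4 is the codon plus the following nucleotide, e.g. "CCCC" for CCC-C.
    """
    return paired_positions(mrna4[:3], anticodon), paired_positions(mrna4[1:4], anticodon)


for mrna4, anticodon in (("CCCC", "GGG"), ("CCUU", "GGG"), ("CCGA", "CGG")):
    zero, plus_one = frame_comparison(mrna4, anticodon)
    print(f"{mrna4[:3]}-{mrna4[3]} with anticodon {anticodon}: 0-frame {zero}, +1-frame {plus_one}")
```

In this simplified count the GGG anticodon keeps three paired positions in both frames on CCC-C and CCU-U, whereas the CGG anticodon on CCG-A drops from three paired positions in the 0-frame to one in the +1-frame, in line with the reduced shift tendency of CC[G/A] codons described above.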

The discovery that m1G37 has differential control of the translation of Pro codons is intriguing. Pro is a unique amino acid that contains an α-imino, rather than an α-amino, group. It is the slowest substrate relative to all other amino acids for peptide bond formation, both as the acceptor and as the donor in making a peptide bond on the ribosome (Pavlov et al., 2009). The aminoacyl group of Pro when charged to tRNAPro is also the least stable compared to others, and it has the lowest affinity for the elongation factor EF-Tu (Peacock et al., 2014). Additionally, Pro is the only amino acid that can enable peptide backbones to make turns and change direction, so its position and distribution in a protein sequence directly impact folding of the protein, particularly if trans-membrane domains are involved (Yohannan et al., 2004). Also, human disease-associated mutations are frequently found at Pro positions (Partridge et al., 2004). These considerations suggest that m1G37-tRNA has the ability to regulate a diverse range of cellular activities via differential translation of Pro codons at specific positions during gene expression.

Besides the universal association of m1G37 with all isoacceptors of tRNAPro, the CCG isoacceptor of tRNAArg for reading the Arg CGG codon and the GAG isoacceptor of tRNALeu for reading the Leu CU[C/U] codons also contain m1G37. Why these isoacceptors carry m1G37 has not been investigated. Based on analysis of the codon-anticodon pairing interaction, the reason does not appear to involve suppression of +1-shifts, but it may involve release of ribosomes from stalling at the specific codons. Notably, the m1G37-containing CCG tRNAArg is only one of the several isoacceptors of the Arg family, whereas other members of the family do not need m1G37. Similarly, while m1G37 is present in the GAG, CAG, and UAG isoacceptors of the Leu family, it is absent from the UAA isoacceptor. This provides additional evidence for codon-specific translation that is dependent on the presence of post-transcriptional modifications in the anticodon for pairing with a codon.

### CODON-SPECIFIC TRANSLATION IN Mg<sup>2+</sup> HOMEOSTASIS

An example of codon-specific translation mediated by m1G37 is the maintenance of Mg<sup>2+</sup> homeostasis in Gram-negative bacteria. Mg<sup>2+</sup> is the most abundant divalent cation in all living cells and is maintained at mM concentrations. For Salmonella enterica serovar Typhimurium (hereafter Salmonella), the etiologic agent of human gastroenteritis, infection that proceeds from the human gut into the metal-scarce macrophage compartment requires Mg<sup>2+</sup> transport into cells for survival and virulence of the pathogen (Papp-Wallace and Maguire, 2008a; Groisman et al., 2013). This transport is activated upon expression of the major Mg<sup>2+</sup> transporter gene mgtA and is regulated at two levels: the transcriptional activation of the structural gene and the ribosomal translational control of the 5′-leader ORF ahead of the structural gene (**Figure 2A**). First, transcription is initially activated by the membrane-bound two-component PhoPQ system, in which sensing of low external Mg<sup>2+</sup> by PhoQ promotes phosphorylation of PhoP, which activates transcription of mgtA and many virulence genes (Papp-Wallace and Maguire, 2008a; Groisman et al., 2013). Second, the subsequent transcriptional regulation of mgtA is determined by the speed of ribosomal translation of the 5′-leader ORF of the mgtA mRNA. Rapid translation of this ORF exposes the Rho-utilization (rut) sequence, resulting in attenuation of transcription before the mgtA gene (Cromie et al., 2006). In contrast, slow or stalled translation of the ORF induces a structural change in the 5′-leader mRNA that renders the rut sequence inaccessible to Rho-dependent termination, thus allowing transcription through mgtA (Park et al., 2010; Hollands et al., 2014; **Figure 2B**). This translation-dependent attenuation of transcription is common in the regulation of expression of amino acid biosynthesis genes (Merino and Yanofsky, 2005).
In addition, Salmonella has a second inducible Mg<sup>2+</sup> transporter gene, mgtB, expressed from the virulence operon mgtCBR under a similar control of transcriptional attenuation that is determined by the speed of ribosomal translation of its 5′-leader ORF (Papp-Wallace and Maguire, 2008a; Groisman et al., 2013). While Salmonella has a third and constitutively expressed Mg<sup>2+</sup> transporter gene, corA (Papp-Wallace and Maguire, 2008b), it is the inducible expression of mgtA and mgtB that maintains Mg<sup>2+</sup> at virtually constant levels. Although the external Mg<sup>2+</sup> level can change by five orders of magnitude, the internal level varies by less than fivefold (Papp-Wallace and Maguire, 2008b). Without this Mg<sup>2+</sup> homeostasis, Salmonella cannot survive in host cells. Thus, the translation-dependent attenuation of transcription of mgtA is the major determinant of Mg<sup>2+</sup> homeostasis for Salmonella.

FIGURE 2 | Codon-specific translation in Mg<sup>2+</sup> homeostasis. (A) Mg<sup>2+</sup> homeostasis in Salmonella is maintained by the membrane-bound two-component system PhoPQ, which senses low external Mg<sup>2+</sup> and activates transcription of the major transporter gene mgtA. Transcription of mgtA is determined by ribosomal translation of the 5′-leader ORF, which contains several m1G37-dependent Pro codons. (B) Low levels of Mg<sup>2+</sup> slow down ribosomal translation due to stalling at m1G37-dependent codons, resulting in a structure of the 5′-leader ORF that places rut in a stem-loop region inaccessible to Rho, thus allowing transcription through mgtA, whereas high levels of Mg<sup>2+</sup> enable rapid ribosomal translation of the 5′-leader ORF, which exposes the rut sequence ahead of the mgtA structural gene and attenuates transcription. (C) The codon sequence of the 5′-leader ORF is shown, where m1G37-dependent codons are highlighted. Asterisk "\*" indicates a termination codon. (D) The m1G37-dependent codons in the 5′-leader ORF are highly conserved across different species of Gram-negative bacteria. Asterisk "\*" indicates complete conservation among all the sequences, whereas a colon ":" indicates conservation between those with strongly similar properties. (E) Salmonella cells expressing the native trmD show a robust response of activation of transcription of mgtA from high to low Mg<sup>2+</sup> (6.3-fold), whereas cells expressing a mutant trmD show a diminished response (1.3-fold), consistent with codon-specific translation at m1G37-dependent codons in the 5′-leader ORF. Data are obtained from published work (Gall et al., 2016).

Analysis of the 5′-leader ORF of mgtA shows a strong bias for codons that are dependent on m1G37-tRNA for translation (Park et al., 2010). These include Pro codons CC[C/U] at positions 3, 5, and 7, Leu codons CU[C/U] at positions 8 and 15, and Arg codon CGG at position 17 (**Figure 2C**). These m1G37-dependent codons are highly conserved across diverse Gram-negative bacteria (**Figure 2D**), indicating significance under evolutionary pressure. The codon conservation raises the possibility of codon-specific translation, where the speed of ribosomal translation of these codons is controlled by the presence of m1G37. Strong support for this possibility is that TrmD, the bacterial methyltransferase that synthesizes m1G37, is strictly dependent on Mg<sup>2+</sup> for catalytic activity (Sakaguchi et al., 2014). While the requirement for Mg<sup>2+</sup> for several tRNA methyltransferases is known (Hurwitz et al., 1964; Kumagai et al., 1982), the requirement in TrmD is strictly at the transition state of the catalytic mechanism (Sakaguchi et al., 2014), which is an unexpected finding that links the synthesis of m1G37 to cellular concentrations of the metal ion.
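The kind of codon analysis described for the 5′-leader ORF can be illustrated with a short scan for the m1G37-dependent codons named in the text (Pro CC[C/U], Leu CU[C/U], and Arg CGG). This is a minimal sketch only: the function name and the toy sequence are hypothetical, and the sequence is not the mgtA leader.

```python
# Codons whose decoding depends on m1G37-tRNA, as named in the text
# (RNA alphabet): Pro CC[C/U], Leu CU[C/U], Arg CGG.
M1G37_DEPENDENT = {
    "CCC": "Pro", "CCU": "Pro",
    "CUC": "Leu", "CUU": "Leu",
    "CGG": "Arg",
}


def find_m1g37_codons(orf: str) -> list[tuple[int, str, str]]:
    """Return (1-based codon position, codon, amino acid) for every
    m1G37-dependent codon in an in-frame RNA or DNA sequence."""
    rna = orf.upper().replace("T", "U")  # accept DNA input as well
    if len(rna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    hits = []
    for i in range(0, len(rna), 3):
        codon = rna[i:i + 3]
        if codon in M1G37_DEPENDENT:
            hits.append((i // 3 + 1, codon, M1G37_DEPENDENT[codon]))
    return hits


# Hypothetical toy ORF for illustration: AUG CCU GAA CUC CGG UAA
toy_orf = "AUGCCUGAACUCCGGUAA"
for position, codon, residue in find_m1g37_codons(toy_orf):
    print(position, codon, residue)
```

Applied to real leader sequences from several species, the positions returned by such a scan could be compared to assess conservation of the kind shown in Figure 2D.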

In a genetic reporter system, where mgtA was fused to lacZ, the transcription of mgtA was monitored in Salmonella cells grown in media containing high or low Mg<sup>2+</sup> (1.6 vs. 0.016 mM) (Gall et al., 2016). It was found that cells expressing the native trmD showed more than a sixfold activation of transcription upon switching from high to low Mg<sup>2+</sup> media, whereas cells expressing a mutant trmD showed less than a twofold activation (**Figure 2E**). The mutant trmD harbors a mutation (S88L) near the AdoMet binding site (Masuda et al., 2013), which prevents the enzyme from binding to the methyl donor and from performing the Mg<sup>2+</sup>-dependent methyl transfer. The reported observation supports a model of codon-specific translation in the 5′-leader ORF. For cells expressing the native trmD, the level of Mg<sup>2+</sup> modulates the level of TrmD-dependent m1G37-tRNA synthesis, which in turn modulates the speed of ribosomal translation of m1G37-dependent codons in the 5′-leader ORF. At high Mg<sup>2+</sup>, TrmD is active and the abundantly synthesized m1G37-tRNA facilitates ribosomal translation through the 5′-leader ORF, thus attenuating the transcription of mgtA. At low Mg<sup>2+</sup>, by contrast, TrmD is inactive, m1G37-tRNA synthesis is reduced, and ribosomal translation of m1G37-dependent codons is stalled, thus activating transcription through the mgtA gene and producing a robust response. This response to changes in Mg<sup>2+</sup> concentration is reduced in cells expressing the mutant trmD, consistent with the observed lower level of activation of mgtA transcription.
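The logic of this model can be written down as a deliberately coarse, all-or-none sketch. The function and variable names are hypothetical, and the real response is graded (the reporter data show fold changes, not on/off switching), so this is only a qualitative illustration of the reasoning under those simplifying assumptions.

```python
def mgta_readthrough(mg_high: bool, trmd_functional: bool) -> bool:
    """Qualitative sketch of the model in the text.

    Returns True when transcription is expected to continue into mgtA.
    """
    # TrmD needs Mg2+ for methyl transfer, so m1G37-tRNA is abundant
    # only when the enzyme is functional AND Mg2+ is high.
    m1g37_abundant = mg_high and trmd_functional
    # Abundant m1G37-tRNA -> rapid translation of the leader ORF
    # -> rut exposed to Rho -> transcription attenuated.
    leader_translated_rapidly = m1g37_abundant
    return not leader_translated_rapidly


for trmd_ok in (True, False):
    for mg_high in (True, False):
        state = "read-through" if mgta_readthrough(mg_high, trmd_ok) else "attenuated"
        print(f"TrmD functional={trmd_ok!s:5}  Mg2+ high={mg_high!s:5}  ->  {state}")
```

In this simplification only the native enzyme switches state between high and low Mg<sup>2+</sup>, which mirrors the robust response of native trmD and the diminished response of the mutant.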

### PERSPECTIVE

Codon bias has the ability to regulate protein expression by controlling the efficiency or accuracy of protein synthesis. Codon bias can be executed by post-transcriptional modifications of the tRNA anticodons that determine the quality of pairing interaction with the cognate codons at local positions. While it is well documented that altering the codon usage synonymously can alter the expression levels of the manipulated genes (Kudla et al., 2009; Navon and Pilpel, 2011; Goodman et al., 2013; Zhou et al., 2013), much less is known about how the alteration is correlated with post-transcriptional modifications of the tRNA anticodons in response to changes of the codon usage. While we present one example in the regulation of bacterial Mg<sup>2+</sup> homeostasis by the m1G37 modification of tRNA (Gall et al., 2016), an increasing number of studies demonstrate the ability of post-transcriptional modifications in the anticodon region to alter protein expression. Examples include cmo5U34 (Chionh et al., 2016), mcm5U34 (Begley et al., 2007), and Q34 (Tuorto et al., 2018) at the wobble position, and t6A37 on the 3′-side of the anticodon (Lin H. et al., 2018), all of which are induced in response to stress, indicating the ability to reprogram the proteome during cellular adaptation to stress. More broadly, even modifications that are outside of the anticodon but are important for tRNA stability can regulate protein expression by altering the abundance of a specific tRNA, thus affecting the progress of disease. Examples include m1A58 in the T loop (Richter et al., 2018) and m7G46 in the V loop (Lin S. et al., 2018). The diversity of post-transcriptional modifications of tRNA is a key feature of tRNA biology and underscores its importance. We are seeing only the tip of the iceberg of exciting new discoveries.

### AUTHOR CONTRIBUTIONS

Y-MH prepared the manuscript. IM prepared the figures. HG contributed to discussion. All authors made a substantial intellectual contribution to the work and approved its presentation.

### FUNDING

This work was supported by NIH grants GM108972 and GM114343 to Y-MH and a Japanese JSPS overseas postdoctoral fellowship to IM.




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Hou, Masuda and Gamper. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Recent Insights Into the Structure, Function, and Evolution of the RNA-Splicing Endonucleases

### Akira Hirata\*

Department of Materials Science and Biotechnology, Graduate School of Science and Engineering, Ehime University, Matsuyama, Japan

RNA-splicing endonuclease (EndA) cleaves out introns from archaeal and eukaryotic precursor (pre)-tRNA and is essential for tRNA maturation. In archaeal EndA, the molecular mechanisms underlying complex assembly, substrate recognition, and catalysis are well understood. Recent studies have reported novel findings, including the identification of new subunit types in archaeal EndA structures, providing insights into the mechanism underlying broad substrate specificity. Further, metagenomics analyses have enabled the acquisition of numerous DNA sequences of EndAs and intron-containing pre-tRNAs from various species, providing information regarding the co-evolution of substrate specificity of archaeal EndAs and tRNA genetic diversity, and the evolutionary pathway of archaeal and eukaryotic EndAs. Although the complex structure of the heterotetrameric form of eukaryotic EndAs is unknown, previous reports regarding their functions indicated that mutations in human EndA cause neurological disorders including pontocerebellar hypoplasia and progressive microcephaly, and that yeast EndA cleaves mitochondria-localized mRNA encoding cytochrome b mRNA processing 1 (Cbp1) for mRNA maturation. This minireview summarizes the aforementioned results, discusses their implications, and offers my personal opinion regarding future directions for the analysis of the structure and function of EndAs.

Keywords: RNA-splicing endonuclease, intron-containing tRNA, broad substrate specificity, co-evolution of protein and RNA, archaea and eukaryote

#### Edited by:

Tohru Yoshihisa, University of Hyogo, Japan

#### Reviewed by:

Naoki Shigi, National Institute of Advanced Industrial Science and Technology (AIST), Japan

Yohei Kirino, Thomas Jefferson University, United States

> \*Correspondence: Akira Hirata hirata.akira.mg@ehime-u.ac.jp

#### Specialty section:

This article was submitted to RNA, a section of the journal Frontiers in Genetics

Received: 19 November 2018 Accepted: 30 January 2019 Published: 12 February 2019

#### Citation:

Hirata A (2019) Recent Insights Into the Structure, Function, and Evolution of the RNA-Splicing Endonucleases. Front. Genet. 10:103. doi: 10.3389/fgene.2019.00103

## INTRODUCTION

Transfer RNAs (tRNAs) play a fundamental role as adapter molecules for mRNA translation. Maturation events in tRNAs, including removal of the 5′-leader, 3′-trailer, and intron sequences, modification, and addition of the 3′-CCA sequence and amino acids, are essential for protein synthesis. During tRNA maturation, tRNA splicing, which comprises intron removal and ligation of the two exon halves in the precursor (pre)-tRNA, is one of the most significant processes. Pre-tRNA introns are cleaved out either auto-catalytically or enzymatically in the three domains of life. Group I introns found in pre-tRNA in some bacteria and higher eukaryotic plastids are auto-catalytically cleaved out with an external guanosine-5′-triphosphate (GTP) (Xu et al., 1990; Haugen et al., 2005). By contrast, the introns in cytoplasmic eukaryotic and archaeal pre-tRNAs are enzymatically cleaved out by an RNA-splicing endonuclease (EndA) (Abelson et al., 1998) and the two halves of the exon are subsequently ligated by a tRNA ligase (Phizicky et al., 1986; Westaway et al., 1988; Englert et al., 2011; Popow et al., 2011; Tanaka et al., 2011). Eukaryotic EndA has been extensively identified and characterized in yeast, Xenopus, and human. The yeast and human isoforms comprise four distinct subunits, referred to as either Sen2, Sen15, Sen34, and Sen54 or αβγδ (Rauhut et al., 1990; Trotta et al., 1997, 2006; Paushkin et al., 2004), although the complete structure of the heterotetrameric form of eukaryotic EndAs remains unknown. The intron cleavage mechanism of eukaryotic EndAs has been demonstrated owing to early advancements by Dr. John Abelson's and Dr. Glauco Tocchini-Valentini's groups (Reyes and Abelson, 1988; Baldi et al., 1992; Bufardeci et al., 1993). Archaeal EndAs were classified into three types [α<sub>4</sub>, α′<sub>2</sub>, (αβ)<sub>2</sub>] in accordance with their subunit components (Tocchini-Valentini et al., 2005b) until the ε<sub>2</sub> type of archaeal EndA was newly identified and characterized (Fujishima et al., 2011; Hirata et al., 2012). Currently, four types of EndAs are found in archaea. The general mechanism underlying the recognition and cleavage of pre-tRNA by archaeal EndA was previously reported by Dr. John Abelson's and Dr. Hong Li's groups (Li et al., 1998; Li and Abelson, 2000; Xue et al., 2006). Eukaryotic EndA follows a similar mechanism, implicating an evolutionary association between archaeal and eukaryotic EndAs. Furthermore, Calvin and Li (2008) reviewed the molecular mechanisms underlying complex assembly, substrate recognition, and catalysis in archaeal EndA. Their review article still provides robust evidence regarding the mechanisms underlying substrate recognition and intron cleavage by archaeal EndAs. This mini-review focuses on recent advancements regarding the structure, function, and evolution of archaeal and eukaryotic EndAs and additionally provides a perspective for future studies on the structure and function of EndAs.

## STRUCTURE

Information regarding the four types of archaeal EndA structures, i.e., α<sub>4</sub>, α′<sub>2</sub>, (αβ)<sub>2</sub>, and ε<sub>2</sub>, has been obtained from extensive crystallographic studies (**Table 1**), whereas only the structure of one subunit (Sen15) of eukaryotic EndA has been determined by nuclear magnetic resonance (NMR) spectroscopy (Song and Markley, 2007). Initially, Dr. John Abelson's group determined the X-ray structure of the homotetrameric form (α<sub>4</sub>) of archaeal EndA in Methanocaldococcus jannaschii (Li et al., 1998) and of the homodimeric form (α′<sub>2</sub>) in Archaeoglobus fulgidus (Li and Abelson, 2000). The structure of the α′<sub>2</sub> type of EndA has also been determined in Thermoplasma acidophilum by another group (Kim et al., 2007). The overall structures of the two types resemble a rectangular parallelepiped (**Figures 1A,B**). Briefly, the N-terminal domain of one α subunit in the α<sub>4</sub> type of archaeal EndA consists of three α helices and a mixed antiparallel/parallel β sheet, and the C-terminal domain comprises two α helices and a central four-stranded mixed β sheet. Homotetramer formation is achieved by two significant interactions: the interaction between two β–β strands at the domain interface between two α subunits, and the interaction between a negatively charged L10 loop of one α subunit and a positively charged pocket of the opposing α subunit. These interactions are conserved in the four types of archaeal EndAs. The α subunit of the α′<sub>2</sub> type of EndA is considered a fusion protein of two α subunits of the α<sub>4</sub> type of EndA because of the evolutionary association between the α<sub>4</sub> and α′<sub>2</sub> types, based on their sequence similarity, and the two α subunits are connected by a linker from the C-terminal domain of one subunit to the N-terminal domain of the other.
X-ray structures of the (αβ)<sub>2</sub> type of archaeal EndAs have been reported in Nanoarchaeum equitans (Mitchell et al., 2009), Pyrobaculum aerophilum (Yoshinari et al., 2009), Aeropyrum pernix (Hirata et al., 2011; Okuda et al., 2011), and Methanopyrus kandleri (Kaneta et al., 2018). The (αβ)<sub>2</sub> type EndA comprises two α catalytic subunits and two β structural subunits, and the four subunits are assembled into a heterotetramer (αβ)<sub>2</sub> through the aforementioned interactions. The overall structures are very similar to those of the α<sub>4</sub> and α′<sub>2</sub> types of EndAs, although the structure of P. aerophilum EndA is more compact than that of other EndAs because of the absence of the N-terminal domain of the structural β subunit. Furthermore, a new ε<sub>2</sub> type of EndA was identified and characterized in Candidatus Micrarchaeum acidiphilum (ARMAN-2) (Fujishima et al., 2011; Hirata et al., 2012), which is deeply branched within Euryarchaeota. ARMAN-2 EndA forms an ε<sub>2</sub> homodimer through the interactions that are evolutionarily conserved in the other three types of archaeal EndAs. The ε protomer is unique in that it is separated into three units (α<sup>N</sup>, α, and β<sup>C</sup>) fused by two distinct linkers, although the overall shape of ARMAN-2 ε<sub>2</sub> EndA is similar to that of the other three types of archaeal EndAs. Structure-based sequence analysis suggests that all four types of archaeal EndAs evolved from a common ancestor.
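As a compact restatement of the four structural types described above, the following sketch stores each type with its assembly and the organisms for which crystal structures are cited in the text. The class and function names are hypothetical and serve only to organize information already given in this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EndAType:
    name: str                  # subunit notation used in the text
    assembly: str              # oligomeric state as described in the text
    examples: tuple[str, ...]  # organisms with reported crystal structures


ENDA_TYPES = (
    EndAType("α4", "homotetramer", ("Methanocaldococcus jannaschii",)),
    EndAType("α′2", "homodimer",
             ("Archaeoglobus fulgidus", "Thermoplasma acidophilum")),
    EndAType("(αβ)2", "heterotetramer",
             ("Nanoarchaeum equitans", "Pyrobaculum aerophilum",
              "Aeropyrum pernix", "Methanopyrus kandleri")),
    EndAType("ε2", "homodimer",
             ("Candidatus Micrarchaeum acidiphilum (ARMAN-2)",)),
)


def types_with_assembly(assembly: str) -> list[str]:
    """Names of the EndA types that share a given oligomeric assembly."""
    return [t.name for t in ENDA_TYPES if t.assembly == assembly]


for t in ENDA_TYPES:
    print(f"{t.name:6s} {t.assembly:15s} {', '.join(t.examples)}")
```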

Three catalytic residues (tyrosine, histidine, and lysine) are conserved in the four types of EndAs, and each subunit assembly of the archaeal EndAs leads to the formation of two intron cleavage sites at the active site (**Figures 1A–D**, green circle). Similarly, two sets of the two substrate recognition residues [two arginines in the α subunit of the α<sub>4</sub> and α′<sub>2</sub> types, or arginine and tryptophan residues in the α subunit of the (αβ)<sub>2</sub> and ε<sub>2</sub> types] are positioned at a similar location adjacent to the three catalytic residues. Thus, each multimeric conformation of archaeal EndAs is essential for catalysis and substrate tRNA recognition. In eukaryotes, yeast EndA is a heterotetramer (αβγδ) comprising two catalytic (Sen2 and Sen34) and two accessory (Sen15 and Sen54) subunits identified on the basis of homology with their human counterparts (Trotta et al., 1997). Sen2 and Sen34 share homology with the α subunit of archaeal EndAs and employ catalytic residues (histidine, tyrosine, and lysine) identical to those of their archaeal counterparts. Therefore, eukaryotic and archaeal EndAs are presumed to employ a molecular mechanism of cleavage similar to that of ribonuclease A, using the three catalytic residues (Raines, 1998; Calvin and Li, 2008). The structure of the heterotetrameric complex of eukaryotic EndA is unknown, although the NMR structure of human Sen15 is known (**Table 1**). The structural arrangement of human Sen15 is similar to that of the C-terminal domain of the α subunit in M. jannaschii α<sub>4</sub> EndA. Together, these findings implicate an evolutionary relationship between the eukaryotic and archaeal isoforms of EndA.

FIGURE 1 | Structures and characteristics of four types of archaeal RNA-splicing endonucleases (EndAs): (A) α<sub>4</sub> type Methanocaldococcus jannaschii EndA; (B) α′<sub>2</sub> type Archaeoglobus fulgidus EndA; (C) (αβ)<sub>2</sub> type Aeropyrum pernix EndA; (D) ε<sub>2</sub> type ARMAN-2 EndA. Interactions among the subunits are represented by cartoon models on the left side. The β–β interaction responsible for inter/intra-unit formation, and the L10 loop and pocket responsible for dimer/tetramer formation, are highlighted. The catalytic triads are marked by green circles. The right panels show the ribbon models of EndAs. (E) Left, strict BHB motif; Right, relaxed BHB motifs (hBH and HBh′). (F) Comparison of the ARMAN-2-specific loop (ASL) in the ARMAN-2 EndA and the Crenarchaea-specific loop (CSL) in the Aeropyrum pernix EndA. Left: superimposed structures of ARMAN-2 EndA and Aeropyrum pernix EndA. Ribbon diagrams of the ARMAN-2 EndA and Aeropyrum pernix EndA are represented by colors similar to those in (C,D). Right: close-up view of the structure of the ASL region (pink) of ARMAN-2 EndA superimposed on the structure of the CSL region (gray) of Aeropyrum pernix EndA. The catalytic triad, comprising three catalytic residues (Y236, H251, and K282), is shown by a stick model (green). Structure-based sequence alignment is shown at the bottom of the superimposed structures. The conserved K161 in ASL and K44 in CSL are highlighted in red. (G) Gene recombination of three units in the ε protomer of ARMAN-2 EndA. Interactions among units are represented by cartoon models on the left side. The panels on the right side show the ribbon models of EndAs. The β–β interactions responsible for inter/intra-unit formation are altered for gene recombination (red). These figures are illustrated with some modifications using previous figures (Hirata et al., 2012) and reproduced with permission based on the copyright policy from Oxford University Press.

## SUBSTRATE SPECIFICITY

TABLE 1 | Structural and functional characterization of archaeal and eukaryotic EndAs.

Initial studies on the substrate specificity of archaeal EndAs were conducted by Dr. Charles Daniels' and Dr. Roger Garrett's groups (Kjems and Garrett, 1988; Thompson and Daniels, 1988; Palmer et al., 1992; Kleman-Leyer et al., 1997; Lykke-Andersen and Garrett, 1997; Lykke-Andersen et al., 1997a,b). Archaeal EndAs are known to recognize a bulge-helix-bulge (BHB) motif (**Figure 1E**), which comprises two bulges (3 nt) separated by one helix (4 bp) located at the exon–intron boundary of pre-tRNAs (Marck and Grosjean, 2003). The canonical BHB motif is frequently present in the anticodon loop between position 37 and 38 (37/38) of archaeal pre-tRNA; however, in some cases, this motif is present in pre-mRNA and pre-rRNA for their maturation (Kjems and Garrett, 1991; Yoshinari et al., 2006). In contrast with this canonical BHB motif, two types of relaxed BHB motifs, found in non-canonical introns (hBH and HBh′), are present in pre-tRNAs (**Figure 1E**). The relaxed BHB motifs of hBH and HBh′ disrupt either the 5′ or the 3′ bulge in the canonical BHB motif. One of the bulges is often absent to form a relaxed bulge-helix-loop (BHL). Furthermore, the unique features of disrupted tRNA genes include multiple (two or three) intron-containing tRNAs (Sugahara et al., 2008; Tocchini-Valentini et al., 2009), split and tri-split tRNAs, wherein tRNA fragments are encoded by two or three genes (Randau et al., 2005; Fujishima et al., 2009), and permuted tRNAs, wherein the sequences of 5′ and 3′ halves of tRNA genes are inverted (Chan et al., 2011). Remarkably, the canonical and relaxed BHB motifs are located not only at the anticodon loop position 37/38 but also at the D-loop, T-loop, and acceptor-stem of archaeal pre-tRNAs (Marck and Grosjean, 2003; Yoshihisa, 2014). Although introns with canonical and relaxed BHB motifs are distributed at various positions in pre-tRNA, archaeal EndAs recognize and cleave them. However, only two types, i.e., (αβ)<sub>2</sub> and ε<sub>2</sub> EndAs, can efficiently eliminate introns with relaxed BHB motifs, thereby displaying broad substrate specificity. Eukaryotic EndA recognizes and eliminates introns with a canonical BHB motif from archaeal pre-tRNA, although in most eukaryotic pre-tRNAs, the introns are located at the anticodon loop 37/38 and include the BHL motif. To eliminate introns with a BHL motif, eukaryotic EndA requires a mature domain of pre-tRNA, wherein the interaction between the D- and T-loops yields a unique structure, the so-called "elbow" (Reyes and Abelson, 1988; Calvin and Li, 2008). The α′<sub>2</sub> type of archaeal EndA from Archaeoglobus fulgidus can eliminate introns with the BHL motif at position 37/38 in the case of full-length pre-tRNA (Tocchini-Valentini et al., 2005a).
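The motif geometry described in this section (two 3-nt bulges flanking a 4-bp helix, with one bulge disrupted in the relaxed forms) can be expressed as a simple length-based classifier. This is a rough sketch under stated assumptions: the function names are hypothetical, the central helix is assumed to remain 4 bp in relaxed motifs (which the text does not state), and the mapping to EndA types encodes only the coarse rule given above, ignoring exceptions such as context-dependent cleavage of BHL motifs in full-length pre-tRNA.

```python
ALL_TYPES = ("α4", "α′2", "(αβ)2", "ε2")
BROAD_TYPES = ("(αβ)2", "ε2")  # types described as having broad specificity


def classify_bhb(bulge5: int, helix: int, bulge3: int) -> str:
    """Classify a motif from the lengths of its 5' bulge (nt),
    central helix (bp), and 3' bulge (nt)."""
    if helix != 4:  # assumption: relaxed motifs keep the 4-bp helix
        return "not BHB-like"
    if bulge5 == 3 and bulge3 == 3:
        return "canonical BHB"
    if bulge5 == 3 or bulge3 == 3:  # exactly one intact bulge remains
        return "relaxed BHB"
    return "not BHB-like"


def efficient_cleavers(motif_class: str) -> tuple[str, ...]:
    """Archaeal EndA types expected to cleave the motif efficiently,
    following the coarse rule stated in the text."""
    if motif_class == "canonical BHB":
        return ALL_TYPES
    if motif_class == "relaxed BHB":
        return BROAD_TYPES
    return ()


for lengths in [(3, 4, 3), (0, 4, 3), (3, 4, 0)]:
    motif = classify_bhb(*lengths)
    print(lengths, motif, efficient_cleavers(motif))
```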

### BROAD SUBSTRATE SPECIFICITY OF THE ARCHAEAL EndAs

The (αβ)<sub>2</sub> and ε<sub>2</sub> EndAs have broad substrate specificity and can efficiently cleave not only introns with the canonical BHB motif but also those with a relaxed BHB motif. The molecular mechanism underlying the broad substrate specificity of (αβ)<sub>2</sub> EndA was initially unknown. To clarify the mechanism, structural and biochemical analyses of the (αβ)<sub>2</sub> type of EndA from the hyperthermophilic crenarchaeon Aeropyrum pernix were performed (Hirata et al., 2011). At the time, (αβ)<sub>2</sub>-type EndAs were reported exclusively in crenarchaea and nanoarchaea, except for the euryarchaeon Methanopyrus kandleri (Marck and Grosjean, 2003). Our studies on A. pernix EndA reported a Crenarchaea-specific loop (CSL), which was conserved in crenarchaeal EndAs and located adjacent to the active site (**Figure 1F**). Furthermore, insertion of the CSL into A. fulgidus α′<sub>2</sub> EndA, which originally had narrow substrate specificity, conferred broad substrate specificity on that enzyme. In the CSL-containing EndA, an alanine substitution of the conserved Lys residue of the CSL disrupted the broad substrate specificity. Together, these findings suggest that the Lys residue of the CSL plays a significant role as an RNA binding site and is responsible for the broad substrate specificity in the (αβ)<sub>2</sub> type of crenarchaeal EndAs. Similarly, the ε<sub>2</sub> type of ARMAN-2 EndA possesses an ARMAN-2 specific loop (ASL), which confers broad substrate specificity, and the Lys residue of the ASL functions as the RNA recognition site. Although the ASL conformation in ARMAN-2 EndA is markedly similar to that of the CSL in A. pernix EndA, there are no obvious sequence similarities between the ASL and CSL, except for the conserved Lys residue, which functions as the substrate recognition site. Together, these findings indicate that the ASL was acquired by an evolutionary pathway independent of that of the CSL (i.e., "convergent evolution").
Nevertheless, it is still unknown why each Lys residue conserved in the CSL and ASL is required for intron cleavage, despite the presence of three catalytic residues in the EndAs. In contrast, M. kandleri EndA was identified as an (αβ)<sub>2</sub> type lacking specific loops such as the ASL and CSL (Kaneta et al., 2018). While M. kandleri EndA weakly cleaves introns with a relaxed BHB motif in M. kandleri pre-tRNAGlu (UUC), it could not eliminate introns from a mini-helix RNA with a BHL motif. Therefore, the M. kandleri EndA is considered to be of the (αβ)<sub>2</sub> type with constrained substrate specificity.

## EVOLUTION


The α<sub>4</sub> type of archaeal EndA, which consists of a single catalytic α subunit, is proposed to be the prototype of the EndAs (Tocchini-Valentini et al., 2005b), and subfunctionalization following gene duplication and fusion has yielded the other three types [α′<sub>2</sub>, (αβ)<sub>2</sub>, and ε<sub>2</sub>]. Intriguingly, ε<sub>2</sub>-type ARMAN-2 EndA appears to have undergone a genetic recombination of three subunits, the euryarchaeal α subunit, the crenarchaeal α subunit, and the crenarchaeal β subunit (Hirata et al., 2012), which gave rise to the three units (α<sup>N</sup>-α-β<sup>C</sup>) of the ε protomer (**Figure 1G**). Each unit is clearly divided into a domain structure, thus providing a good example of naturally occurring "domain shuffling". Moreover, the C-terminal subdomain of the crenarchaeal β subunit may have been incorporated into the terminus of the crenarchaeal α subunit, which may have primarily led to changes in the structural location of the β–β interaction responsible for subunit assembly.

The sequence of the archaeal α subunit is locally conserved, over approximately 50 amino acid residues, in the two catalytic subunits (Sen2 and Sen34) of the heterotetrameric form (αβγδ) of eukaryotic EndA. Therefore, eukaryotic EndA is considered to have evolved from the archaeal (αβ)<sub>2</sub> EndA with the acquisition of new subunits (γ and δ). Remarkably, the primitive eukaryotic red alga Cyanidioschyzon merolae harbors many disrupted tRNA genes with a relaxed BHB motif as employed in Archaea (Soma et al., 2007, 2013; Soma, 2014). The C. merolae EndA is expected to comprise three subunits [cmSen2p, cmSen34p, and cmSen54p (αβγ)] for processing these pre-tRNAs; however, it does not contain the ASL or CSL. Thus, the heterotrimeric form of C. merolae EndA might be an intermediate in the evolutionary transition from the heterotetramer of archaeal EndA to the heterotetramer of eukaryotic EndA. Furthermore, recent bioinformatics analysis has reported that archaeal species with specific loops such as the ASL and CSL in EndAs clearly show a trend of increased numbers of intron-containing tRNA genes with BHB and relaxed BHB motifs, suggesting co-evolution of tRNA gene diversity and broad substrate specificity (Kaneta et al., 2018). These findings further update the previous concept of co-evolution (Tocchini-Valentini et al., 2005b; Fujishima and Kanai, 2014).

### NEW ASPECTS OF EUKARYOTIC EndA

Vertebrate and Saccharomyces cerevisiae EndAs are localized in the nucleus (Paushkin et al., 2004) and on the mitochondrial outer membrane (Yoshihisa et al., 2003, 2007), respectively. A recent study reported that S. cerevisiae EndA cleaves the mitochondria-localized mRNA encoding Cbp1 (cytochrome b mRNA processing 1) and that this cleavage requires a predicted stem-loop structure of the endonucleolytic cleavage-inducible sequence of Cbp1 with synergistic effects of other factors (Tsuboi et al., 2015). These significant findings provide evidence regarding the biological role of mitochondria-localized S. cerevisiae EndA and suggest that the EndA has broad substrate specificity owing to specific recognition of the predicted stem-loop structure without the BHB motif. Furthermore, the human EndA complex (TSen2, TSen15, TSen34, and TSen54) reportedly cleaves introns from pre-tRNAs, and the TSen2 subunit is involved in pre-mRNA 3′ end formation (Paushkin et al., 2004). These reports further expand the possibility that the substrates of EndA include non-coding RNAs involved in the regulation of gene expression. To test this possibility, crosslinking of RNA-EndA complexes using UV irradiation, combined with immunoprecipitation and RNA sequencing, could be a useful method to identify non-coding RNAs that are substrates of EndA. More importantly, recessive mutations in the genes of three subunits (TSen2, TSen34, and TSen54) cause pontocerebellar hypoplasia (PCH) types 2A-C, 4, and 5 (Budde et al., 2008; Namavar et al., 2011a,b; Bierhals et al., 2013; Maraş-Genç et al., 2015). PCH2 reportedly involves progressive cerebral atrophy and microcephaly, dyskinesia, seizures, and early childhood mortality. Furthermore, a recent study reported that three homozygous TSEN15 mutations cause a milder version of the PCH2-related pathology (Breuss et al., 2016). Hence, appropriate EndA function is required for brain development in humans. However, the mechanism underlying the pathogenesis of PCH caused by human EndA mutations remains unclear because the structure of the complex is not yet known.

### CONCLUSION AND FUTURE PERSPECTIVES

Although the mechanism underlying the recognition and cleavage of RNA introns by EndAs is known, EndAs have gained increasing interest because the evolutionary pathway from archaeal to eukaryotic EndA and the mechanism underlying the broad substrate specificity of archaeal and eukaryotic EndA warrant further investigation. The conserved Lys residue in the CSL and ASL of the (αβ)<sub>2</sub> and ε<sub>2</sub> types of archaeal EndAs might function as the catalytic and RNA recognition residue. Eukaryotic EndAs probably possess broad substrate specificity, similar to the archaeal (αβ)<sub>2</sub>- and ε<sub>2</sub>-type EndAs, although the mechanism underlying the broad substrate specificity may differ between the eukaryotic and archaeal EndAs. Further structural analysis is required to elucidate the detailed mechanism underlying broad substrate specificity by archaeal and eukaryotic EndAs. In particular, structural information on human EndA may be useful for the design of drugs that improve inadequate EndA function, which causes the developmental retardation of the human brain described above.

### AUTHOR CONTRIBUTIONS

The author confirms being the sole contributor of this work and has approved it for publication.

### FUNDING

This work was supported by JSPS KAKENHI grant number JP18K06088 (to AH).

### REFERENCES



**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Hirata. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.


# tRNA Processing and Subcellular Trafficking Proteins Multitask in Pathways for Other RNAs

*Anita K. Hopper\* and Regina T. Nostramo*

*Department of Molecular Genetics, Center for RNA Biology, Ohio State University, Columbus, OH, United States*

This article focuses upon gene products that are involved in tRNA biology, with particular emphasis upon post-transcriptional RNA processing and nuclear-cytoplasmic subcellular trafficking. Rather than analyzing these proteins solely from a tRNA perspective, we explore the many overlapping functions of the processing enzymes and proteins involved in subcellular traffic. Remarkably, there are numerous examples of conserved gene products and RNP complexes involved in tRNA biology that multitask in a similar fashion in the production and/or subcellular trafficking of other RNAs, including small structured RNAs such as snRNA, snoRNA, 5S RNA, telomerase RNA, and SRP RNA as well as larger unstructured RNAs such as mRNAs and RNA-protein complexes such as ribosomes. Here, we provide examples of steps in tRNA biology that are shared with other RNAs including those catalyzed by enzymes functioning in 5′ end-processing, pseudoU nucleoside modification, and intron splicing as well as steps regulated by proteins functioning in subcellular trafficking. Such multitasking highlights the clever mechanisms cells employ for maximizing their genomes.

*Edited by:* 

*Akio Kanai, Keio University, Japan*

#### *Reviewed by:*

*Isidore Rigoutsos, Thomas Jefferson University, United States Naoki Shigi, National Institute of Advanced Industrial Science and Technology (AIST), Japan*

> *\*Correspondence: Anita K. Hopper hopper.64@osu.edu*

#### *Specialty section:*

*This article was submitted to RNA, a section of the journal Frontiers in Genetics*

*Received: 30 October 2018 Accepted: 29 January 2019 Published: 20 February 2019*

#### *Citation:*

*Hopper AK and Nostramo RT (2019) tRNA Processing and Subcellular Trafficking Proteins Multitask in Pathways for Other RNAs. Front. Genet. 10:96. doi: 10.3389/fgene.2019.00096*

#### Keywords: tRNA, nuclear export, nuclear import, RNA processing, tRNA splicing

## INTRODUCTION

Biogenesis of different categories of eukaryotic RNAs has been thought to proceed *via* distinct pathways. Indeed, eukaryotic cells employ separate DNA-dependent RNA polymerases, Pol I, II, and III, to transcribe precursors to ribosomal RNA (pre-rRNA), mRNA (pre-mRNA), and tRNA (pre-tRNA), respectively. Moreover, the cell biology for processing the various categories of pre-RNAs appears to be quite different. For example, mRNA splicing, capping, polyadenylation, and nuclear export all occur co-transcriptionally [Review: (Bentley, 2014)]. In contrast, tRNA biogenesis occurs post-transcriptionally at numerous distinct subcellular locations. In *S. cerevisiae* (budding yeast), pre-tRNA transcription by Pol III and 5′ maturation, catalyzed by RNase P, are located in the nucleolus [Review: (Hopper et al., 2010)], whereas particular tRNA modifications are added in the nucleoplasm, and other modifications are added at the inner nuclear membrane or in the cytoplasm after tRNA nuclear export [Review: (Hopper, 2013)]; moreover, pre-tRNA splicing occurs on the surface of mitochondria (Yoshihisa et al., 2003) (**Figure 1**). Despite the general requirement for separate DNA-dependent RNA polymerases and the different subcellular locations of major processing events, it is now appreciated that particular RNA processing/modification enzymes and pathways for intracellular dynamics can be shared among distinct RNA categories to facilitate related, if not identical, functions. Here, we detail well-established examples (**Figure 1**, example steps in red font) of proteins and RNP complexes involved in tRNA biology that also function in the biogenesis and subcellular trafficking of other categories of RNAs, RNPs, and proteins.

### Shared Subunits for RNPs Involved in 5′ Pre-tRNA Processing, Pre-rRNA Processing, and Telomerase

RNase P is an endonuclease that removes 5′ leaders from pre-tRNAs (**Figure 1**). It is a ribonucleoprotein (RNP) complex in bacteria, many archaea, and in the nucleus of budding yeast and metazoans; in these organisms, RNase P is composed of a single small catalytic RNA and various numbers of proteins [Review: (Gopalan et al., 2018)]. Surprisingly, RNase P is a protein-only enzyme (PRORP) in plants and in the organelles of various organisms (Gutmann et al., 2012). In budding yeast, the RNase P complex that processes nucleus-encoded tRNAs is located in the nucleolus and consists of RPR1, the RNA subunit, and nine essential proteins, Pop1, Pop3, Pop4, Pop5, Pop6, Pop7, Pop8, Rpp1, and Rpr2 [Review: (Xiao et al., 2002)]. Unexpectedly, RNase P shares protein subunits with RNPs that have different functions (**Figure 2**). All RNase P proteins except Rpr2 are shared with MRP, a nucleolar RNP that functions in the processing of pre-rRNA; the RPR1 RNA is not shared (Xiao et al., 2002; Lindahl et al., 2009; Gopalan et al., 2018). Moreover, Pop1, Pop6, and Pop7 are also components of telomerase, the RNP that is required for the replication of chromosome termini (Lemieux et al., 2016). Thus, several proteins of the RNase P RNP multitask in at least three separate processes. Structural studies have delineated how the various proteins of these three independent RNPs interact with their unshared RNA subunit (**Figure 2**), but interesting and important questions remain regarding the evolutionary selection for sharing of subunits among these RNPs with quite different functions.
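The subunit sharing described above can be restated as simple set membership. The following is a minimal Python sketch, not a model of the complexes themselves: the set names and the `complexes_for` helper are hypothetical, and the MRP and telomerase sets list only the proteins that the text says are shared with RNase P (subunits unique to those RNPs, and all RNA subunits, are deliberately omitted).

```python
# Protein subunits of budding yeast nuclear RNase P, as listed in the text.
RNASE_P = {"Pop1", "Pop3", "Pop4", "Pop5", "Pop6", "Pop7", "Pop8", "Rpp1", "Rpr2"}

# Assumption: only proteins shared with RNase P are recorded below.
# Per the text, every RNase P protein except Rpr2 is also found in MRP.
MRP_SHARED = RNASE_P - {"Rpr2"}

# Per the text, Pop1, Pop6, and Pop7 are also telomerase components.
TELOMERASE_SHARED = {"Pop1", "Pop6", "Pop7"}

# Hypothetical lookup table; insertion order fixes the order of the results.
COMPLEXES = {
    "RNase P": RNASE_P,
    "MRP": MRP_SHARED,
    "telomerase": TELOMERASE_SHARED,
}


def complexes_for(protein: str) -> list[str]:
    """Return the names of the RNPs in COMPLEXES that contain the protein."""
    return [name for name, subunits in COMPLEXES.items() if protein in subunits]


if __name__ == "__main__":
    for protein in sorted(RNASE_P):
        print(protein, "->", ", ".join(complexes_for(protein)))
```

Under these assumptions, Pop1, Pop6, and Pop7 map to all three RNPs, whereas Rpr2 maps to RNase P alone.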

### Numerous RNA Substrates for the "tRNA" Pseudouridine Synthetases

Throughout the steps of pre-tRNA maturation, tRNAs acquire nucleoside modifications. tRNAs are highly modified. There are >100 nucleotide modifications known to occur on tRNAs [Reviews: (Phizicky and Hopper, 2010; Jackman and Alfonzo, 2013)]. Budding yeast has 25 different tRNA modifications, and each mature tRNA contains an average of ~12 modified nucleosides. Nearly all the genes encoding proteins required for tRNA modification in budding yeast have been identified and characterized (Phizicky and Hopper, 2010; Hopper, 2013). Pseudouridine (ψ) is one of the most abundant of the tRNA modifications. The enzymes that catalyze pseudouridylation of nucleus-encoded cytoplasmic tRNAs (Pus1, Pus3, Pus4, Pus6, Pus7, and Pus8) are protein enzymes (Phizicky and Hopper, 2010). Ribosomal 5S RNA, small nuclear RNAs (snRNA) that are mRNA splicing components, small nucleolar RNAs (snoRNA) that function in pre-rRNA processing, and rRNAs all also contain pseudouridine modifications. 5S rRNA ψ modification is catalyzed by Pus7 in budding yeast (Decatur and Schnare, 2008). Pseudouridine modifications of snRNA and snoRNA are more complicated because, in addition to being catalyzed by particular Pus proteins, they are also catalyzed by H/ACA pseudouridylases. H/ACA pseudouridylases are RNA-dependent RNP complexes with guide RNAs (yeast H/ACA RNAs). In budding yeast, ψ modifications of snRNAs are catalyzed by Pus1 or Pus7 or by RNA-dependent RNP enzymes, and snoRNA ψ modifications are catalyzed by Pus1–4, Pus6, and Pus7 in addition to H/ACA pseudouridylases. In contrast, rRNA pseudouridylation is catalyzed solely by H/ACA pseudouridylases [Reviews: (Karijolich and Yu, 2010; Rintala-Dempsey and Kothe, 2017)] (**Figure 3**).

New genome-wide technologies have led to the discovery that mRNAs also contain pseudouridine modifications and that the mRNA modifications are generated by both Pus proteins and the H/ACA RNP. Scores of budding yeast mRNAs are modified by the Pus proteins and each of the *PUS* genes that encode nuclear or cytoplasmic enzymes contributes to mRNA modifications at particular sites. Furthermore, some mRNA sites are modified only under particular stress conditions [(Carlile et al., 2014; Schwartz et al., 2014) Reviews: (Gilbert et al., 2016; Rintala-Dempsey and Kothe, 2017)]. Thus, the protein "tRNA" pseudouridylases multitask in the biogenesis of tRNAs, 5S RNA, snRNAs, snoRNAs, and mRNAs (**Figure 3**).

### Multitasking by Pre-tRNA Splicing Enzymes

In addition to 5′ and 3′ end-processing and addition of nucleoside modifications, some pre-tRNAs contain transcribed short introns, located one nucleotide 3′ to the anticodon, that must be removed to generate functional tRNAs. All known eukaryotic genomes encode a subset of intron-containing tRNA genes, although the percentage of such genes varies considerably (Chatterjee et al., 2018). Pre-tRNA splicing is essential in most eukaryotes because generally all reiterated copies of at least one particular isoaccepting tRNA family are encoded by intron-containing genes. The tRNA splicing process consists of endonucleolytic cleavage, which generates an intron and two exons, each about half the size of the mature tRNA, followed by ligation of the two resulting exons.

### Splicing Endonuclease Complex Cleaves mRNAs in Addition to Pre-tRNAs

Introns in yeast and vertebrate pre-tRNAs are removed by the conserved heterotetrameric splicing endonuclease (SEN) (Trotta et al., 1997; Paushkin et al., 2004) (**Figure 4**). SEN is located on the cytoplasmic surface of mitochondria in budding and fission yeast, but it is in the nucleoplasm in *Xenopus* oocytes and HeLa cells (De Robertis et al., 1981; Yoshihisa et al., 2003, 2007; Paushkin et al., 2004; Wan and Hopper, 2018). The SEN complex fails to cleave tRNAs that possess inappropriately structured mature domains, and it will cut pre-tRNAs at inappropriate sites if the length of the anticodon stem is altered [(Reyes and Abelson, 1988) Review: (Yoshihisa, 2014)]. Although, to date, there is no high-resolution co-structure of SEN with its pre-tRNA substrate, the data support the model that SEN interacts with the mature tRNA 3D structure and "measures" the length of the anticodon stem. Therefore, reports documenting that non-tRNAs also serve as SEN substrates were surprising. The SEN complex has been reported to cleave an mRNA that encodes the unessential cytochrome b mRNA processing 1 protein, Cbp1 (Tsuboi et al., 2015) (**Figure 4**). Moreover, other studies showed that mitochondrial-located SEN catalytic activity in budding yeast is essential even when cells generate spliced tRNAs *via* alternate mechanisms (Dhungel and Hopper, 2012; Cherry et al., 2018), implicating the SEN complex in processing of additional essential RNAs. However, the essential non-tRNA substrate(s) remain unknown. Thus, SEN has the capacity to cleave mRNAs and perhaps other unidentified RNAs in addition to its essential and well-studied role in catalyzing intron removal from pre-tRNAs (**Figure 4**).

### "tRNA" Splicing Ligases have mRNA Substrates

There are two evolutionarily distinct mechanisms to ligate tRNA exons. Surprisingly, for both mechanisms, the ligase that joins pre-tRNA exons after cleavage also ligates a non-tRNA substrate. Pre-tRNA intron removal by all SEN complexes generates a 5′ exon bearing a 2′,3′ cyclic phosphate and a 3′ exon with a 5′ hydroxyl (Knapp et al., 1979; Peebles et al., 1983). In yeast and plants, ligation of the two tRNA exons is catalyzed by Rlg1/Trl1, which has 2′,3′ cyclic phosphodiesterase, 5′ RNA kinase, and RNA ligase activities (Phizicky et al., 1986; Englert and Beier, 2005; Wang and Shuman, 2005). Ligation by Rlg1/Trl1 generates a splice junction with two phosphates; the extra 2′ phosphate is removed *via* a transferase activity catalyzed by Tpt1 (Culver et al., 1997).

Although the mechanism for tRNA exon ligation is similar in budding yeast and plants (Englert and Beier, 2005), the mechanisms for tRNA ligation are different in vertebrates and archaea. In vertebrates and archaea, tRNA ligation after removal of tRNA introns is catalyzed by an enzyme complex consisting of a RtcB-like ligase (Tanaka et al., 2011), HSPC117, that catalyzes direct joining of the cleaved 5′ exon bearing a 3′ phosphate (*via* 2′,3′ phosphodiesterase activity of the ligase) with the 5′ hydroxyl on the 3′ exon [(Popow et al., 2011) Review: (Popow et al., 2012)] (**Figure 4**).

Both the Rlg1/Trl1 and the RtcB-like ligases also catalyze exon joining of a particular mRNA. Rlg1/Trl1 catalyzes ligation of HAC1 exons generated *via* non-conventional cleavage by an endonuclease, Ire1, rather than by the spliceosomal mechanism (Sidrauski and Walter, 1997). Cleavage of HAC1 mRNA by Ire1 generates a 5′ exon possessing a 2′,3′ cyclic phosphate and a 3′ exon possessing a 5′ hydroxyl. Subsequent ligation of these exons by Rlg1/Trl1 is mechanistically identical to ligation of tRNA exons (**Figure 4**). Ligation of the exons thereby generates mature HAC1 mRNA that functions in the unfolded protein response (Sidrauski et al., 1996; Gonzalez et al., 1999). So, the tRNA ligase, Rlg1/Trl1, is, in fact, also a ligase for HAC1 mRNA. Similarly, plant bZIP60, which also functions in the unfolded protein response, is generated by nonconventional mRNA splicing. Plant IRE endonuclease cuts pre-bZIP60, generating two exons that are subsequently ligated by tRNA ligase, RLG1 (Nagashima et al., 2016). Thus, both SEN and ligase of budding yeast and plants participate in maturation of mRNA substrates in addition to tRNA substrates.

In *Xenopus* oocytes and HeLa cells, pre-tRNA splicing occurs in the nucleoplasm (De Robertis et al., 1981; Lund and Dahlberg, 1998; Paushkin et al., 2004). Therefore, vertebrate SEN and HSPC117 ligase must also be located in the nucleoplasm. However, there is also a cytoplasmic pool of HSPC117/RtcB that generates a spliced mature mRNA that encodes a protein which functions in the unfolded protein response. HSPC117/RtcB ligates XBP1 mRNA exons whose intron, like for HAC1, is removed in the cytoplasm by nuclease activity, rather than by the spliceosomal mechanism (Kosmaczewski et al., 2014). So, as for budding yeast and plants, vertebrate tRNA splicing ligase multitasks to ligate an mRNA in addition to its essential role in ligation of tRNA exons, even though the budding yeast and plant ligases are mechanistically different from the vertebrate ligase.

### Multitasking Nuclear Exporters/Importers for Nuclear-Cytoplasmic Dynamics of tRNAs As Well As Other RNAs, RNPs, and Proteins

In all eukaryotes, RNAs and proteins traffic between the nucleus and the cytoplasm. There are two distinct mechanisms by which macromolecules move between the nucleus and the cytoplasm. However, with the notable exception of most mRNAs, movement of macromolecules between the nucleus and cytoplasm generally employs the mechanism that depends upon the small GTPase, Ran, and the β-importin family of proteins [Reviews: (Gorlich and Kutay, 1999; Cook et al., 2007)]. Those β-importin family members involved in import of macromolecules into the nucleus are termed importins and those functioning in export of macromolecules from the nucleus to the cytoplasm are termed exportins, although a few of the family members function both in nuclear import and export (Yoshida and Blobel, 2001; Aksu et al., 2018). The importins and exportins bind appropriate substrates, FG nuclear pore proteins, and Ran-GTP. Ran is primarily in the GTP-bound state in the nucleus and the GDP-bound state in the cytoplasm, thereby creating a Ran-GTP gradient between the nucleus and the cytoplasm. This Ran-GTP nuclear/cytoplasmic gradient determines the directionality of the movement of macromolecules between the nucleus and cytoplasm.

tRNAs move dynamically between the nucleus and the cytoplasm in yeast, protozoa, and vertebrate cells (Shaheen and Hopper, 2005; Takano et al., 2005; Shaheen et al., 2007; Huynh et al., 2010; Barhoom et al., 2011; Ohira and Suzuki, 2011; Watanabe et al., 2013; Dhakal et al., 2018; Kessler et al., 2018). Even though only a subset of eukaryotic tRNA-encoding genes contains introns, we focus on this category of tRNAs. This is because removal of introns from pre-tRNAs serves as a useful reporter for the steps of tRNA subcellular dynamics.

### Primary tRNA Nuclear Export

Nuclear-cytoplasmic tRNA dynamics consists of three steps: primary nuclear export, retrograde tRNA nuclear import, and tRNA nuclear re-export (Chatterjee et al., 2018). Here, we first describe the proteins that function in primary tRNA nuclear export followed by a description of the proteins functioning in the remaining two trafficking steps (**Figure 1**). In budding and fission yeast, it is possible to distinguish between primary nuclear export and tRNA nuclear re-export because pre-tRNA splicing takes place on the surface of mitochondria (Yoshihisa et al., 2003, 2007; Wan and Hopper, 2018) and therefore, defects in primary nuclear export cause nuclear accumulation of unspliced pre-tRNAs. In contrast, tRNAs that were exported to the cytoplasm *via* primary nuclear export and spliced before import back into the nucleus will accumulate in the nucleus as spliced tRNAs if the cells are defective in the re-export step. As detailed below, primary nuclear export and tRNA nuclear re-export can also be distinguished by assessing the status of the m<sup>1</sup>G37 and queuosine (Q34) nucleoside modifications in budding yeast and *T. brucei*, respectively.
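The logic used to tell the two export steps apart in yeast can be written as a small decision rule. This is an illustrative Python sketch only: the function name, its arguments, and the returned labels are hypothetical, and the rule assumes an organism in which splicing occurs in the cytoplasm (as in budding and fission yeast), so that splicing status reports whether a tRNA has already left the nucleus once.

```python
def classify_nuclear_trna_pool(accumulates_in_nucleus: bool, is_spliced: bool) -> str:
    """Interpret nuclear accumulation of a tRNA encoded by an intron-containing gene.

    Assumption: splicing happens only after primary export (cytoplasmic SEN),
    so a spliced tRNA found in the nucleus must have been exported and re-imported.
    """
    if not accumulates_in_nucleus:
        return "no export defect indicated"
    if is_spliced:
        # Spliced tRNA in the nucleus: it left, was spliced, and returned.
        return "consistent with a re-export defect"
    # Unspliced pre-tRNA in the nucleus: it never reached the cytoplasm.
    return "consistent with a primary export defect"


if __name__ == "__main__":
    print(classify_nuclear_trna_pool(True, False))
    print(classify_nuclear_trna_pool(True, True))
```

The hedged labels are intentional: spliced tRNAs also accumulate in the nucleus under nutrient stress in wild-type cells, so the rule is a first-pass interpretation rather than a diagnosis.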

### Los1/Exportin-t/Xpot/PSD

One member of the β-importin family, Los1 (budding yeast)/Xpot (fission yeast)/Exportin-t (vertebrates)/PSD (plants), is dedicated to tRNA nuclear export (Hellmuth et al., 1998; Kutay et al., 1998; Sarkar and Hopper, 1998; Arts et al., 1998a). Los1/Exportin-t binds both intron-containing and intron-lacking tRNAs in a Ran-GTP-dependent fashion, and it does not require protein adaptors (Arts et al., 1998b; Lipowsky et al., 1999; Cook et al., 2009; Huang and Hopper, 2015). No other cellular RNAs have been reported to interact with Los1 or its various homologues. So, unlike the other proteins described in this review, it appears that Los1 and its homologues do not multitask in nuclear export of other RNAs or proteins (**Figure 5**).

Los1/Exportin-t is unessential in every tested organism, including budding yeast, fission yeast, plants, and haploid human cancer cells (Hurt et al., 1987; Hunter et al., 2003; Cherkasova et al., 2012; Blomen et al., 2015; Hart et al., 2015; Wang et al., 2015). Moreover, insects lack an Exportin-t homologue (Lippai et al., 2000). Since tRNAs must be efficiently delivered to the cytoplasm for their essential role in protein synthesis, other export pathways also function in tRNA nuclear export. Genome-wide studies with budding yeast identified candidate proteins that function in primary tRNA nuclear export (Wu et al., 2015): Crm1 (yeast)/Exportin 1 or Xpo1 (vertebrates) (Wu et al., 2015) and Mex67-Mtr2 (yeast)/NXF1-NXT1 or TAP-p15 (metazoans) (Wu et al., 2015; Chatterjee et al., 2017, 2018).

### Crm1/Exportin-1 – A β-Importin That Functions in Nuclear Export of a Variety of Macromolecules

The β-importin family member Crm1/Exportin-1 functions in nuclear export of proteins possessing leucine-rich nuclear export sequences (NES) (Fischer et al., 1995; Wen et al., 1995). Crm1 also functions in nuclear export of several types of RNA *via* interactions with adaptor proteins possessing the leucine-rich NES. In budding yeast, Crm1 mediates nuclear export of the large and small precursor ribosomal subunits [(Sengupta et al., 2010) Reviews: (Okamura et al., 2015; Chaker-Margot, 2018)], the RNA subunit of signal recognition particle (SRP) (but not in vertebrate cells) (Grosshans et al., 2001; Takeiwa et al., 2015), and TLC1, the RNA subunit of telomerase [(Gallardo et al., 2008) Review: (Vasianovich and Wellinger, 2017)], *via* interactions of the cargo RNA/RNPs with adaptor proteins containing the leucine-rich motif (**Figure 5**). In vertebrate cells, Exportin-1 also functions in nuclear export of snRNAs, particular mRNAs involved in stress, and particular viral RNAs [Reviews: (Kohler and Hurt, 2007; Delaleau and Borden, 2015)].

Crm1 is also implicated in primary tRNA nuclear export because yeast cells possessing a temperature-sensitive (ts) mutation of *CRM1* accumulate unspliced tRNA at the nuclear rim at the non-permissive temperature (Wu et al., 2015). Furthermore, Crm1 and Los1 genetically interact as *crm1–1 los1Δ* double mutants have synthetic growth defects (Wu et al., 2015). However, to date, there are no publications that document direct interactions between Crm1 and intron-containing tRNA; thus, Crm1 could mediate primary tRNA nuclear export indirectly. In summary, the exportin Crm1 is involved in nuclear export of numerous RNAs and RNPs, and perhaps also pre-tRNA nuclear export (**Figure 5**).

### Mex67-Mtr2/NXF1-NXT1—A Ran-GTP Independent Mechanism for Nuclear Export of Numerous RNAs

Although it is well established that nuclear export of small structured RNAs like tRNAs, snRNAs, SRP RNA, and TLC1 as well as ribosomal subunits employs one or more members of the β-importin family, nuclear export of mRNAs is largely independent of the Ran-GTP mechanism [Reviews: (Kelly and Corbett, 2009; Okamura et al., 2015)]. Instead, mRNA nuclear export occurs *via* sequential rearrangements of multiprotein RNA-binding complexes. In metazoan cells, a transcription-dependent protein complex, TREX, is recruited near mRNA 5′ caps *via* interaction with the cap-binding complex and then a heterodimer, NXF1-NXT1 (TAP-p15), is recruited before nuclear export proceeds. The yeast heterodimeric homologue, Mex67-Mtr2, appears to interact with mRNAs *via* a transcription-dependent 3′ end processing mechanism using several protein adaptors (Kelly and Corbett, 2009). The Mex67-Mtr2 heterodimer and the metazoan homologues also function in nuclear export of other RNAs (**Figure 5**). They function in nuclear export of 5S rRNA (Yao et al., 2007), the pre-60S ribosome (Yao et al., 2007), the pre-40S ribosome (Faza et al., 2012), TLC1 RNA (Wu et al., 2014), and, in vertebrates, type D unspliced retroviral RNAs (Ernst et al., 1997; Pasquinelli et al., 1997).

Recently, Mex67-Mtr2 in budding yeast has been shown to export tRNAs to the cytoplasm (Chatterjee et al., 2017). Incubation of yeast harboring temperature-sensitive mutations of the essential *MEX67* and *MTR2* genes at a non-permissive temperature results in nuclear accumulation of end-processed, intron-containing tRNAs, similar to the phenotype of yeast lacking *LOS1* (Wu et al., 2015; Chatterjee et al., 2017). Moreover, providing yeast cells lacking Los1 (β-importin) with a mere fivefold excess of ectopic Mex67-Mtr2 results in efficient suppression of both phenotypes of *los1Δ*: accumulation of unspliced tRNAs and accumulation of tRNA in nuclei (Chatterjee et al., 2017). So, Mex67-Mtr2 is able to substitute for Los1 if cells are provided with sufficient quantities of the heterodimer. *In vivo* biochemical studies documented that protein A-tagged Mex67 co-purifies with intron-containing pre-tRNA as well as spliced tRNA (Chatterjee et al., 2017). The data support the model that Mex67-Mtr2 functions directly in tRNA nuclear export in both the primary and the re-export steps. It is unknown whether vertebrate cells employ the Mex67-Mtr2 homologue, NXF1-NXT1, for tRNA nuclear export. However, NXT1 has been reported to stimulate nuclear tRNA export in permeabilized HeLa cells (Ossareh-Nazari et al., 2000) (**Figure 5**).

How Mex67-Mtr2 interacts with tRNAs remains unknown as none of the previously described Mex67-Mtr2 adaptors were uncovered in the genome-wide screen for tRNA splicing defects (Kelly and Corbett, 2009; Wu et al., 2015). Therefore, if Mex67-Mtr2 interaction with tRNA occurs *via* an adaptor, the adaptor may be encoded by redundant genes or it may be novel. It is also feasible that Mex67-Mtr2 interacts with tRNAs without employing an adaptor, similar to reported interactions of NXF1-NXT1 with particular heat shock mRNAs (Zander et al., 2016). It is also unknown what tRNA structures are important for interaction with Mex67-Mtr2.

In summary, there are at least two (i.e., Los1 and Mex67-Mtr2) and perhaps three (i.e., Crm1) or more, pathways in budding yeast for primary tRNA nuclear export. Both Mex67-Mtr2 and Crm1 are involved in the nuclear export of numerous other RNAs; so, unlike Los1, they are not dedicated to tRNA nuclear export (**Figure 5**). It will be interesting to learn whether they have the same fidelity for exporting tRNAs that are appropriately structured and processed to the cytoplasm as does Los1.

#### Retrograde tRNA Nuclear Import

The quandary that the budding yeast SEN complex is located on the surface of mitochondria but spliced tRNAs accumulate in the nucleus under particular stress conditions (Sarkar and Hopper, 1998; Grosshans et al., 2000; Azad et al., 2001; Feng and Hopper, 2002; Takano et al., 2005) led the Hopper and Yoshihisa labs to consider the unorthodox possibility that tRNAs in the cytoplasm could travel in a retrograde direction back to the nucleus. Employing RNA fluorescence *in situ* hybridization (FISH), these labs demonstrated that ectopic "foreign" tRNAs encoded by one nucleus of a heterokaryon could travel to and accumulate in the nucleus that did not encode the tRNA (Shaheen and Hopper, 2005; Takano et al., 2005), providing strong evidence for tRNA retrograde nuclear import (**Figure 1**). Other studies of haploid yeast and rat hepatoma cells in culture showed that tRNAs accumulate in nuclei upon nutrient deprivation even when transcription of new tRNAs is inhibited by thiolutin or actinomycin D, respectively (Takano et al., 2005; Shaheen et al., 2007; Whitney et al., 2007). Employing nuclear import assays with permeabilized HeLa cells, the Fassati group demonstrated that tRNA nuclear import occurs in vertebrate cells and their studies also showed that tRNA retrograde traffic provides one mechanism by which the retrotranscribed HIV genome can access the nuclear interior in nondividing cells (Zaitseva et al., 2006). Subsequent RNA FISH studies in protozoa, brine shrimp, and vertebrate cells in culture, and, most recently, studies of tagged tRNAs injected into live vertebrate cells, demonstrate widespread conservation of nuclear import of cytoplasmic tRNAs, especially in response to nutrient and/or heat stress [(Huynh et al., 2010; Barhoom et al., 2011; Miyagawa et al., 2012; Watanabe et al., 2013; Chen et al., 2016; Dhakal et al., 2018) Review: (Huang and Hopper, 2016)].

The mechanism of tRNA retrograde nuclear import remains poorly understood. However, it is thought that tRNA retrograde nuclear import occurs constitutively as well as being upregulated in response to stress (Chatterjee et al., 2018). As detailed below, since m<sup>1</sup>G37 modification of several budding yeast tRNAs that are encoded by intron-containing genes requires tRNA retrograde nuclear import, retrograde tRNA nuclear import must occur constitutively to generate functional tRNAs for protein synthesis (**Figure 1**). However, as accumulation of nuclear pools of tRNA from the cytoplasm also occurs upon stress conditions, the import process may also be inducible (Shaheen and Hopper, 2005; Takano et al., 2005, 2015; Hurto et al., 2007; Shaheen et al., 2007; Whitney et al., 2007). Two proteins from budding yeast have been identified as possible tRNA nuclear importers, Mtr10 (Shaheen and Hopper, 2005; Murthi et al., 2010) and Ssa2 (Takano et al., 2015); the latter is more likely to participate in regulated retrograde tRNA nuclear import (**Figure 6**).

### Mtr10—A Ran-GTP Dependent Importer May Function in tRNA Nuclear Import

Budding yeast Mtr10 is a β-importin family member best characterized for its role in nuclear import of the protein Npl3 that is required for mRNA nuclear export (Senger et al., 1998). Mtr10 is also implicated in nuclear import of the TLC1 RNA subunit of telomerase that shuttles between the nucleus and the cytoplasm; cells lacking *MTR10* (*mtr10Δ*) have short telomeres, the levels of TLC1 are reduced and TLC1 does not normally accumulate in the nucleus. However, it is unknown whether Mtr10 is directly involved in TLC1 nuclear import (Ferrezuelo et al., 2002; Gallardo et al., 2008).

Three lines of evidence suggested that Mtr10 functions in tRNA retrograde nuclear import. First, tRNA nuclear import was reported to be dependent upon the Ran-GTP gradient, although this conclusion has been controversial (Shaheen and Hopper, 2005; Takano et al., 2005). Second, *mtr10Δ* cells fail to accumulate tRNA in the nucleus upon amino acid (aa) deprivation, in contrast to wild-type cells (Shaheen and Hopper, 2005). Third, *mtr10Δ los1Δ* mutants do not accumulate large nuclear pools of tRNAs (Murthi et al., 2010); as the level of nuclear accumulation in *los1Δ* cells is the result of defects in both tRNA primary nuclear export and re-export, reduced tRNA nuclear pools in the *mtr10Δ los1Δ* double mutant is most likely due to decreased import of tRNA from the cytoplasm. Despite these three lines of evidence, Mtr10 did not co-purify with tRNA under conditions in which Npl3 and Mtr10 interaction was readily detected (Huang and Hopper, 2015). Therefore, Mtr10 may affect tRNA nuclear pools indirectly. In vertebrate cells, the putative Mtr10 orthologue, TNPO3 (Transportin 3) serves to import serine-arginine-rich splicing factors into the nucleus (Maertens et al., 2014). No apparent role for TNPO3 in tRNA nuclear import has been detected; instead, it has been proposed that TNPO3 functions in disassembly of tRNA–capsid complexes from HIV pre-integration complexes after nuclear import (Zhou et al., 2011).

### Ssa2—A Protein Chaperone Implicated in Ran-GTP Independent tRNA Nuclear Import

Takano et al. reported that retrograde tRNA nuclear import is ATP-dependent (Takano et al., 2005). Thus, to identify proteins that may participate in tRNA retrograde nuclear import, the Yoshihisa group searched for proteins from budding yeast able to bind tRNA in an ATP-sensitive fashion and thereby identified Ssa2. Ssa2 is a constitutively expressed chaperone and member of the heat shock protein 70 (Hsp70) family with known functions in protein folding. However, members of the Hsp70 family also bind RNA AU-rich elements, sequences characteristic of unstable mRNAs (Henics et al., 1999; Takano et al., 2015) (**Figure 6**). Cells deleted for *SSA2* (*ssa2Δ*) fail to accumulate elevated tRNA nuclear pools when deprived of aa (Takano et al., 2015). The results support a role for Ssa2 in nutrient-dependent regulated tRNA nuclear import. Ssa2 prefers unmodified to fully modified tRNA and it can bind a nuclear pore protein. Thus, Ssa2 may serve as a tRNA chaperone delivering defective tRNAs to the nucleus for quality control and as a cellular response to nutrient deprivation.

*ssa2Δ* cells that are also depleted for Mtr10 have significantly smaller nuclear pools of tRNA upon aa deprivation than either individual mutant, indicating that Ssa2 and Mtr10 independently affect tRNA retrograde nuclear import. The results provide evidence for both Ran-dependent and Ran-independent pathways for retrograde tRNA nuclear import and a role for each in interacting with multiple RNA and protein substrates (**Figure 6**).

### tRNA Re-export From the Nucleus to the Cytoplasm

Data from budding yeast document that tRNAs imported from the cytoplasm to the nucleus *via* retrograde tRNA nuclear import are re-exported to the cytoplasm, provided that cells are supplied with ample nutrients. First, cells acutely deprived of aa, glucose, or phosphate accumulate tRNAs in the nucleus; tRNA nuclear accumulation is not due to defects in primary tRNA nuclear export because the tRNAs are efficiently spliced under these conditions (Shaheen and Hopper, 2005; Hurto et al., 2007; Whitney et al., 2007; Huang and Hopper, 2014). Upon re-feeding with the appropriate nutrient, the tRNA nuclear pools rapidly dissipate (Whitney et al., 2007). Similarly, tRNA nuclear accumulation occurs in rat hepatoma cells upon aa deprivation, and upon re-feeding, there is rapid movement of the tRNAs to the cytoplasm (Shaheen et al., 2007).

Second, wybutosine (yW) modification of G37 of budding yeast tRNAPhe requires both the primary tRNA nuclear export and re-export steps (Ohira and Suzuki, 2011). This is because the first step of yW modification is catalyzed by the Trm5 methyltransferase that acts only on spliced tRNAs but is located in the nucleus; so, tRNAPhe must first be exported to the cytoplasm by the primary tRNA nuclear export process to be spliced on the surface of mitochondria and then imported back into the nucleus to gain m1G37, catalyzed by Trm5. The subsequent steps of yW modification are catalyzed by cytoplasmic enzymes and therefore to complete yW modification, m1G37-modified tRNAPhe must be re-exported to the cytoplasm (Ohira and Suzuki, 2011) (**Figure 1**). Similarly, queuosine (Q) modification of tRNATyr in *T. brucei* requires that the pre-tRNATyr be exported to the cytoplasm where its intron is removed; following import to the nucleus, the spliced tRNA is modified by nucleus-located tRNA-guanine transglycosylase, which has specificity for spliced tRNATyr. Q-modified tRNATyr is then re-exported to the cytoplasm to fulfill its function in protein synthesis (Kessler et al., 2018).

Which exporters function in the tRNA re-export step? Evidence supports the model that Los1 and Mex67-Mtr2, which function in primary tRNA nuclear export, also function in tRNA re-export to the cytoplasm. As detailed above, the vertebrate and *S. pombe* Los1 homologues, Exportin-t and Xpot, are able to bind both intron-containing and mature tRNA; similarly in budding yeast, both unspliced and spliced tRNAs co-purify with Los1 (Huang and Hopper, 2015). Therefore, Los1 also serves in the re-export step to deliver spliced tRNA to the cytoplasm. Likewise, Mex67 co-purifies with spliced tRNA (Chatterjee et al., 2017), and therefore, it is likely to also function in tRNA nuclear re-export in budding yeast (**Figure 5**). In addition to Los1/Exportin-t and Mex67-Mtr2/NXF1-NXT1, another protein, Msn5/Exportin-5, functions in tRNA nuclear re-export.

### Msn5/Exportin-5 - A β-importin That Participates in tRNA Nuclear Export in Yeast and Vertebrates

Yeast Msn5 and its vertebrate homologue, Exportin-5, have been shown to function in tRNA nuclear export. However, this β-importin family member serves additional roles in nuclear-cytoplasmic dynamics. Vertebrate and plant Msn5 homologues, Exportin-5 and Hasty (HST), respectively, serve to export pre-micro RNA (miRNA) from the nucleus to the cytoplasm (Yi et al., 2003; Bohnsack et al., 2004; Lund et al., 2004; Park et al., 2005; Kim et al., 2016). This is accomplished by direct binding of Exportin-5/HST to the hairpin structure of miRNAs and other RNAs (Shibata et al., 2006). In addition, vertebrate Exportin-5 functions in SRP RNA (Takeiwa et al., 2015) and adenoviral VA1 RNA (Gwizdek et al., 2003) nuclear export. Budding yeast Msn5 functions to export numerous proteins to the cytoplasm. These include the HO protein, which is required for mating type switching, and phosphorylated forms of the transcription factors Pho4, Mig1, Crz1, and Maf1 that travel to the cytoplasm in response to various environmental conditions [Reviews: (Hopper, 1999; Ciesla and Boguta, 2008)]. Furthermore, the interactions of Msn5 with phosphorylated Pho4 and with a peptide from HO are direct because binding occurs *in vitro* with purified components in the presence of Ran-GTP (Kaffman et al., 1998; Bakhrat et al., 2008) (**Figure 5**). There is no evidence showing binding of vertebrate Exportin-5 with proteins.

tRNAs have been shown to be aminoacylated in the nucleus in both vertebrate cells and budding yeast (Lund and Dahlberg, 1998; Sarkar et al., 1999; Grosshans et al., 2000; Azad et al., 2001), even though the vast majority of the aminoacyl tRNA synthetases reside and function in the cytoplasm for protein synthesis. Both in cultured vertebrate cells and in *Xenopus* oocytes, Exportin-5 was shown to promote nuclear export of aminoacylated tRNA (aa-tRNA) *via* a complex with translation elongation factor 1A (eEF1A) in a Ran-GTP-dependent mechanism (Bohnsack et al., 2002; Calado et al., 2002). The authors proposed that a major role for Exportin-5 in vertebrate cells is to prevent translation factors from accumulating in the nucleus. More recent studies document that SNAG-containing transcription factors can piggyback on the Exportin-5-aa-tRNA-eEF1A complex *via* eEF1A interaction, thereby providing a mechanism for nuclear export of this category of proteins (Mingot et al., 2013). Plant HST has no apparent role in tRNA nuclear export (Park et al., 2005).

The role of Msn5 in tRNA nuclear export in budding yeast was first demonstrated by FISH studies documenting tRNA nuclear accumulation in *msn5Δ* cells deprived for aa (Shaheen and Hopper, 2005; Murthi et al., 2010). However, as there is no defect in pre-tRNA splicing in *msn5Δ* cells, it was concluded that Msn5 does not participate in primary nuclear export for the 10 families of tRNAs that are encoded by intron-containing genes in budding yeast, but rather that Msn5 is dedicated to their tRNA nuclear re-export (Murthi et al., 2010). Studies to capture *in vivo* Msn5 with its tRNA cargo provide strong evidence that in a Ran-GTP-dependent mechanism, Msn5 interacts efficiently with spliced tRNA, but interacts very inefficiently with intron-containing tRNA (Huang and Hopper, 2015), supporting the selective role for Msn5 in the re-export of tRNAs encoded by intron-containing genes. Msn5 preferentially binds aa-tRNA in a quaternary complex with eEF1A and Ran-GTP (**Figure 5**).

The manner in which Msn5 assembles into a tRNA nuclear export complex provides explanations for earlier studies demonstrating roles for eEF1A, nuclear pools of aminoacyl synthetases, and the enzyme that adds CCA to tRNA 3′ termini (Cca1) in tRNA nuclear export (Sarkar et al., 1999; Grosshans et al., 2000; Azad et al., 2001; Feng and Hopper, 2002). Msn5 provides one mechanism for cells to regulate tRNA re-export in response to aa deprivation; it also serves to regulate the nuclear-cytoplasmic distribution of several transcription factors in response to environmental stress.

In summary, re-export of tRNAs in budding yeast employs at least three nuclear exporters: Los1, Mex67-Mtr2, and Msn5 (**Figure 5**). As Msn5 does not interact with intron-containing tRNA, it functions solely in the re-export step for the category of tRNAs encoded by intron-containing genes. As for the other transporters discussed, Msn5 has multiple different non-tRNA substrates and multitasks in their nuclear export.

### Function of RNA Subcellular Dynamics Between the Nucleus, Cytoplasm, and Mitochondria

Although discovery of the tRNA retrograde pathway was surprising and initially counterintuitive, subsequent studies document multiple functions for this complicated subcellular tRNA traffic. In budding yeast, tRNA retrograde dynamics function in translation for a subset of proteins (Chu and Hopper, 2013), in quality control of tRNA 5′ leader processing and modification [(Kramer and Hopper, 2013) Review: (Huang and Hopper, 2016)], and in addition of a particular tRNA nucleoside modification (Ohira and Suzuki, 2011). Other small RNAs, in particular snRNA and TLC1 RNA, also shuttle between the nucleus and the cytoplasm and have been shown to have their processing steps occur both in the nucleus and the cytoplasm. So, RNA trafficking between the nucleus and the cytoplasm is not unique to tRNA. Furthermore, RNA processing on the surface of mitochondria is not restricted to pre-tRNA splicing (Chatterjee et al., 2018). In budding yeast, cyclization of t6A37 to ct6A37 is catalyzed by Tcd1 and Tcd2, which are also located on the mitochondrial surface (Huh et al., 2003; Miyauchi et al., 2013). Likewise, Q34 tRNA modification catalyzed by the heteromeric transglycosylase in mouse occurs on the mitochondrial surface (Boland et al., 2009). Nor is RNA processing on mitochondria restricted to tRNA, as the nucleases that catalyze processing of piRNA (PIWI-interacting RNA), 5′ Zucchini (MitoPLD) and/or 3′ trimmer (PARN-1), in mouse (Watanabe et al., 2011), flies (Huang et al., 2011), silkworm (Izumi et al., 2016) and *C. elegans* (Tang et al., 2016) reside on the mitochondrial cytoplasmic surface. Studies in budding and fission yeast documented a role for Tom70, a protein subunit of the TOM mitochondrial import complex, in localizing the SEN subunits to mitochondria (Wan and Hopper, 2018). It will be interesting to learn if Tom70 homologues in metazoans function to localize the other yeast and metazoan RNA processing enzymes to mitochondria.

### EPILOGUE

In stark contrast to the commonly held notion that separate gene products and mechanisms are employed to process different categories of RNAs, there are numerous examples of gene products or complexes involved in tRNA biology that multitask in the production and/or subcellular trafficking of other RNAs. Such multitasking likely arose as mechanisms for cells to streamline their genomes by having given gene products serve multiple tasks. There are other mechanisms in eukaryotic RNA biology that also have been selected to streamline genomes. For example, due to alternative starts of transcription and/or alternate translation starts, single genes encoding tRNA modification enzymes and aminoacyl synthetases generate multiple proteins with distinct subcellular locations so as to deliver the same catalytic activities to separate destinations; for example, *TRM1* encodes two proteins, one targeted to mitochondria and the other to the nucleus, and *CCA1* encodes three proteins that are targeted to the mitochondria, nucleus, or the cytoplasm [Reviews: (Martin and Hopper, 1994; Danpure, 1995)]. Moreover, some proteins in tRNA biology serve additional unrelated roles. For example, budding yeast Mod5, which catalyzes modification of A37 to i6A37 of particular tRNAs, also functions in regulation of sterol biogenesis in the cytoplasm (Benko et al., 2000) and transcription silencing in the nucleus (Pratt-Hyatt et al., 2013), and numerous budding yeast and vertebrate tRNA aminoacyl synthetases possess functions independent of translation [Reviews: (Guo and Schimmel, 2013; Yakobov et al., 2018)]. Further exploration will undoubtedly uncover new examples of multitasking in the eukaryotic world of tRNA biology.

### AUTHOR CONTRIBUTIONS

AH and RN each participated in the preparation of this review.

### FUNDING

The work from the AH lab is supported by a grant from the National Institutes of Health #122884.

### ACKNOWLEDGMENTS

The authors thank Dr. Kunal Chatterjee for insightful comments on drafts of this article.




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Hopper and Nostramo. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# RNA Polymerase II-Dependent Transcription Initiated by Selectivity Factor 1: A Central Mechanism Used by MLL Fusion Proteins in Leukemic Transformation

#### Akihiko Yokoyama\*

*Tsuruoka Metabolomics Laboratory, National Cancer Center, Yamagata, Japan*

#### Edited by:

*Akio Kanai, Keio University, Japan*

#### Reviewed by:

*Yutaka Hirose, University of Toyama, Japan Daisuke Kaida, University of Toyama, Japan Florian Grebien, Ludwig Boltzmann Institute for Cancer Research (LBI-CR), Austria*

> \*Correspondence: *Akihiko Yokoyama ayokoyam@ncc-tmc.jp*

#### Specialty section:

*This article was submitted to RNA, a section of the journal Frontiers in Genetics*

Received: *29 October 2018* Accepted: *21 December 2018* Published: *14 January 2019*

#### Citation:

*Yokoyama A (2019) RNA Polymerase II-Dependent Transcription Initiated by Selectivity Factor 1: A Central Mechanism Used by MLL Fusion Proteins in Leukemic Transformation. Front. Genet. 9:722. doi: 10.3389/fgene.2018.00722*

Cancer cells transcribe RNAs in a characteristic manner in order to maintain their oncogenic potentials. In eukaryotes, RNA is polymerized by three distinct RNA polymerases, RNA polymerase I, II, and III (RNAP1, RNAP2, and RNAP3, respectively). The transcriptional machinery that initiates each transcription reaction has been purified and characterized. Selectivity factor 1 (SL1) is the complex responsible for RNAP1 pre-initiation complex formation. However, whether it plays any role in RNAP2-dependent transcription remains unclear. Our group previously found that SL1 specifically associates with AF4 family proteins. AF4 family proteins form the AEP complex with ENL family proteins and the P-TEFb elongation factor. Similar complexes have been independently characterized by several different laboratories and are often referred to as super elongation complex. The involvement of AEP in RNAP2-dependent transcription indicates that SL1 must play an important role in RNAP2-dependent transcription. To date, this role of SL1 has not been appreciated. In leukemia, AF4 and ENL family genes are frequently rearranged to form chimeric fusion genes with *MLL*. The resultant *MLL* fusion genes produce chimeric MLL fusion proteins comprising MLL and AEP components. The MLL portion functions as a targeting module, which specifically binds chromatin containing di-/tri-methylated histone H3 lysine 36 and non-methylated CpGs. This type of chromatin is enriched at the promoters of transcriptionally active genes which allows MLL fusion proteins to selectively bind to transcriptionally-active/CpG-rich gene promoters. The fusion partner portion, which recruits other AEP components and SL1, is responsible for activation of RNAP2-dependent transcription. Consequently, MLL fusion proteins constitutively activate the transcription of previously-transcribed MLL target genes. Structure/function analysis has shown that the ability of MLL fusion proteins to transform hematopoietic progenitors depends on the recruitment of AEP and SL1. Thus, the AEP/SL1-mediated gene activation pathway appears to be the central mechanism of MLL fusion-mediated transcriptional activation. However, the molecular mechanism by which SL1 activates RNAP2-dependent transcription remains largely unclear. This review aims to cover recent discoveries of the mechanism of transcriptional activation by MLL fusion proteins and to introduce novel roles of SL1 in RNAP2-dependent transcription by discussing how the RNAP1 machinery may be involved in RNAP2-dependent gene regulation.

Keywords: RNA polymerase, SL1, transcription, leukemia, MLL, AEP, DOT1L

### EUKARYOTES HAVE THREE MAJOR RNA POLYMERASES

In prokaryotes, one RNA polymerase transcribes all genes. Eukaryotic cells contain multiple RNA polymerases, which transcribe different classes of genes (Roeder and Rutter, 1969; Thomas and Chiang, 2006; White, 2008; Vannini and Cramer, 2012; Turowski and Tollervey, 2016; Khatter et al., 2017; Zhang et al., 2017). Most genes are transcribed by three major RNA polymerases, RNA Polymerase I, II, and III (RNAP1, RNAP2, and RNAP3, respectively). RNAP1 transcribes pre-rRNA, which is later processed into three large rRNA species, 28S, 18S, and 5.8S. RNAP2 transcribes protein-coding genes to yield mRNAs. RNAP3 transcribes 5S rRNA and tRNAs. Small nuclear RNAs and small cytoplasmic RNAs are transcribed by either RNAP2 or RNAP3 (**Figure 1**).

### RNA POLYMERASE I

RNAP1 synthesizes the 47S pre-rRNA, which is processed into mature 28S, 18S, and 5.8S rRNAs (Goodfellow and Zomerdijk, 2013; Khatter et al., 2017; Zhang et al., 2017). The human genome contains ∼400 rDNA repeats, about 50% of which are transcribed, accounting for up to 60% of the entire transcriptional activity (Birch and Zomerdijk, 2008; Schlesinger et al., 2009). Pre-initiation complex (PIC) formation of RNAP1 at the rDNA promoter is triggered upon association of selectivity factor 1 (SL1) with the core promoter, in the presence of UBF (Learned et al., 1985; Eberhard et al., 1993; Comai et al., 1994; Zomerdijk et al., 1994; Heix et al., 1997; Gorski et al., 2007) (**Figure 1A**). SL1 comprises TBP, TAF1A (also known as TAFI48), TAF1B (also known as TAFI63), TAF1C (also known as TAFI110), and TAF1D (also known as TAFI41) and recruits the RNAP1 complex to induce PIC formation. It is thought that all three polymerases employ similar mechanisms to start transcription (Naidu et al., 2011; Vannini and Cramer, 2012; Khatter et al., 2017), in which the universal role of TBP is to bend the template DNA, while the TAF1B, TFIIB, and BRF1 proteins recruit their respective polymerases to the promoter.

### RNA POLYMERASE II

RNAP2 has been most rigorously studied and shown to collaborate with various associated factors in a step-wise manner to transcribe mRNAs (Roeder, 1996; He et al., 2013). Initiation of in vitro basal transcription on a model promoter starts with loading of the TATA binding protein (TBP) to the TATA box (Basehoar et al., 2004), which is positioned approximately 25 nucleotides upstream of the transcription start site (Roeder, 1996; He et al., 2013). TBP binding induces a bend in the double helix (Kim J. L. et al., 1993; Kim Y. et al., 1993) and recruits TFIIB to stabilize the DNA/protein complex (**Figure 1B**). TFIIB then recruits RNAP2 and TFIIF to form a PIC (Roeder, 1996; He et al., 2013). The initiation of transcription requires the recruitment of TFIIE and TFIIH. TFIIH unwinds DNA at the initiation site and phosphorylates the Ser 5 residue of the RNAP2 C-terminal domain heptapeptide repeat to release the polymerase from the PIC. In vivo RNAP2-dependent transcription is much more complicated. TBP binds to various TBP-associated factors (TAFs) to form a large complex called TFIID, which facilitates promoter recognition, especially at promoters lacking an obvious TATA box (Dynlacht et al., 1991; Pugh and Tjian, 1991). Gene promoters with a TATA box tend to be bound by the SAGA complex which includes TBP, SUPT3H, and GCN5 (Basehoar et al., 2004; Rodríguez-Navarro, 2009). Therefore, it was thought that TATA-containing genes were mainly regulated by the SAGA complex, while TATA-less genes were independently regulated by TFIID (Pugh and Tjian, 1991; Basehoar et al., 2004). However, recent studies in yeast indicate that most genes utilize both TFIID and SAGA, and that the relative contribution of each complex likely depends on the individual context (Baptista et al., 2017; Warfield et al., 2017). The Mediator co-activator complex is also involved in transcription initiation for the expression of nearly all genes (Malik and Roeder, 2010; Warfield et al., 2017). Mediator disruption caused more severe defects than did the disruption of TFIID subunits, suggesting that there may be a low level of TFIID-independent transcription at many genes that is derived from PICs assembled with TBP and lacking TAFs. Nearly all RNAP2-regulated genes, with or without a TATA box in the promoter, are thought to use TBP for transcriptional activation.

### RNA POLYMERASE III

RNAP3 transcribes 5S rRNA, tRNAs, and various small noncoding RNAs (White, 2008; Vannini and Cramer, 2012; Turowski and Tollervey, 2016; Khatter et al., 2017). The clearest feature of RNAP3 transcripts is that they are all untranslated and less than 300 base pairs in length. tRNA gene transcription requires TFIIIB and TFIIIC (**Figure 1C**). TFIIIC binds to intragenic elements and positions TFIIIB onto the tRNA promoter. TFIIIB, which contains TBP and BRF1, then induces PIC formation to start RNAP3 transcription.

### RNA POLYMERASE I/SL1-DEPENDENT RIBOSOMAL RNA TRANSCRIPTION IN CANCER

In cancer cells, rRNA transcription is upregulated. This increases the cell's ability to produce proteins to meet the metabolic demands of quickly proliferating cancer cells (White, 2008). High level pre-rRNA expression is observed in cancer cells and is correlated with tumor stage (Williamson et al., 2006). MYC is highly expressed in most cancer cells and upregulates cell cycle-related genes and metabolism-related genes to promote cell division and anabolism (Dang, 2012). MYC also activates rRNA transcription directly (Arabi et al., 2005; Grandori et al., 2005; Shiue et al., 2009) and indirectly by activating UBF expression (Poortinga et al., 2004). PTEN, which is often inactivated in cancer, represses RNAP1-dependent transcription by disrupting the SL1 complex (Zhang et al., 2005). Thus, loss of PTEN facilitates cancer specific metabolism in part by enhancing SL1 mediated rRNA transcription.

### AF4 FAMILY/ENL FAMILY/P-TEFB COMPLEX

Unexpectedly, our group identified SL1 as a specific interactor of AF4 (also known as AFF1), which is involved in RNAP2-dependent transcriptional activation (Okuda et al., 2015). This result indicated that SL1 is involved in both RNAP1- and RNAP2-dependent transcription. AF4 is a member of the AF4 protein family that is composed of AF4, AF5Q31 (also known as AFF4), LAF4 (also known as AFF3), and FMR2 (also known as AFF2). Previously, we purified a protein complex nucleated by AF4 and identified AF5Q31, ENL (also known as MLLT1), CDK9, and CyclinT1 (also known as CCNT1) as its components (Yokoyama et al., 2010) (**Figure 2A**). CDK9 and CyclinT1 form a complex called P-TEFb, which activates transcription elongation by phosphorylation of the RNAP2 complex paused by negative elongation factors such as DSIF and NELF (Wada et al., 1998a,b; Yamaguchi et al., 1999; Peterlin and Price, 2006). We named this complex the AF4 family/ENL family/P-TEFb complex (AEP) (Yokoyama et al., 2010). ELL, which retains transcription elongation activity (Shilatifard et al., 1996), was also shown to associate with the AF4 family protein and other AEP components, and this complex is often referred to as the super elongation complex (Lin et al., 2010). Similar complexes were independently identified and characterized in several laboratories and have been shown to activate the RNAP2-dependent transcription elongation step for a subset of genes including heat shock protein genes and the HIV viral genome (He et al., 2010; Lin et al., 2010; Sobhian et al., 2010). Therefore, it is thought that AEP activates transcription by activating transcription elongation (Luo et al., 2012).

### AF4 FAMILY PROTEINS ACTIVATE TRANSCRIPTION VIA SL1

AF4 family proteins have been shown to activate RNAP2-dependent transcription (Prasad et al., 1995; Ma and Staudt, 1996; Morrissey et al., 1997; Hillman and Gecz, 2001). GAL4-dependent transactivation assays use reporter gene expression to indicate activation of transcription. Reporter gene transcription, such as that of the firefly luciferase gene, is driven by a minimum promoter tethered to multiple GAL4 responsive elements and is measured in the presence of the GAL4 DNA binding domain fused with the domain being tested. Using this approach, the serine-rich pSER domain of the AF4 protein family (Nilson et al., 1997) was shown to activate transcription (Okuda et al., 2015) (**Figure 2B**). In contrast, modules presumed to activate transcription elongation by recruiting P-TEFb or ELL did not exhibit transactivation activity. These results indicate that, in addition to recruiting elongation factors, AF4 family proteins possess transcriptional activation functions. Assuming that the pSER domain associates with transcriptional coactivators, our group purified proteins associated with the GAL4-pSER fusion protein and identified SL1 as a specific pSER domain binding factor. Chromatin immunoprecipitation coupled with deep sequencing (ChIP-seq) analysis in HEK293T cells showed that TAF1C of SL1 co-localizes with AEP components in the vicinity of transcription start sites of AEP target genes that are transcribed by RNAP2 (Okuda et al., 2017). Knockdown of TAF1C resulted in decreased expression of AEP target genes (defined as the genes whose expression is reduced by ENL knockdown) (Okuda et al., 2015). Deletion of the TATA box in the luciferase reporter cassette resulted in loss of pSER domain-mediated transactivation. Taken together, these results suggest that the pSER domain first recruits SL1 and then loads TBP to the TATA box to activate RNAP2-dependent transcription initiation.

FIGURE 2 | AEP-dependent transcriptional activation. (A) Composition of the AEP complex (AF4 family/ENL family/P-TEFb complex). AF4 and AF5Q31 can form a heterodimer and associate with ENL, CyclinT1, and CDK9. (B) Schematic representation of the AF4 and AF5Q31 protein structures. Protein-protein interaction is shown by a dotted line. NHD, N-terminal homology domain; ALF, AF4/LAF4/FMR2 homology domain; pSER, poly-serine; A9ID, AF9 interaction domain; CHD, C-terminal homology domain. (C) A hypothetical model of AEP-dependent transcriptional activation. First, AEP recruits SL1 and loads TBP onto the promoter through the SDE and NKW motifs. TBP loading induces a bend. SL1 is then dismantled while TFIIB replaces TAF1B. AEP recruits Mediator through MED26 to facilitate transcription initiation. P-TEFb activates transcription elongation of the RNAP2 complex paused by negative elongation factors such as DSIF and NELF. DSIF, DRB sensitivity inducing factor; NELF, negative elongation factor.

### MED26 POTENTIATES AF4-DEPENDENT TRANSCRIPTION INITIATION BY SL1

RNAP2 PIC formation is facilitated by Mediator, a large protein complex composed of ∼30 subunits (Malik and Roeder, 2010). Mediator exists in a variety of subunit compositions, most of which are conserved from yeast to metazoans. MED26, one of a few metazoan-specific Mediator subunits, was shown to associate with AEP in addition to associating with other Mediator complex components (Takahashi et al., 2011). The pSER domain can be divided into three subdomains, each of which contains one evolutionarily conserved motif: DLXLS, SDE, or NKW (Okuda et al., 2016) (**Figure 2B**). The DLXLS motif is a binding platform for MED26 while the SDE motif is responsible for binding to SL1. Although the function of the NKW motif remains unclear, it is required for transactivation and is therefore postulated to play a role in RNAP2-dependent transcription processes (Okuda et al., 2015). The SDE and NKW motifs are necessary and sufficient to activate transcription in the GAL4-dependent transactivation assay. The DLXLS motif is dispensable but can enhance SL1-mediated transcription, presumably by recruiting Mediator (Okuda et al., 2016). These results indicate that AEP activates transcription initiation primarily via SL1, and that this activation can be further potentiated by Mediator. This contradicts the current view that AEP is specialized for transcription elongation. Hence, I propose that AEP is a multi-functional transcriptional machinery that can activate both initiation and elongation of transcription (**Figure 2C**).

### AEP-DEPENDENT TRANSACTIVATION IS THE CENTRAL MECHANISM USED BY MLL FUSION PROTEINS IN LEUKEMOGENESIS

AEP components are frequent targets for chromosomal translocation with the MLL gene (also known as KMT2A, HRX, MLL1, HTRX, and ALL1) (Ziemin-van der Poel et al., 1991; Djabali et al., 1992; Tkachuk et al., 1992; Nakamura et al., 2002; Li and Ernst, 2014; Winters and Bernt, 2017; Yokoyama, 2017). MLL is a transcriptional regulator that retains transcriptional activation activity and histone methyltransferase (HMT) activity and is involved in transcriptional maintenance of Homeobox (HOX) genes (Zeleznik-Le et al., 1994; Yu et al., 1995, 1998; Ernst et al., 2001; Milne et al., 2002; Nakamura et al., 2002). The resultant MLL-AEP fusion proteins cause aggressive acute leukemia (**Figure 3A**) (Krivtsov and Armstrong, 2007; Li and Ernst, 2014; Winters and Bernt, 2017). Leukemia involving MLL gene rearrangements (MLL-r leukemia) accounts for 5–10% of all acute leukemia cases (Meyer et al., 2018) and is generally associated with poor prognosis (Rowley, 2008). MLL-r leukemia cells express a subset of genes including HOXA9 and MEIS1 whose expression is normally confined to immature hematopoietic cells such as hematopoietic stem cells (HSCs) (hereafter referred to as HSC program genes) (Armstrong et al., 2002; Yeoh et al., 2002; Krivtsov et al., 2006; Somervaille and Cleary, 2006) (**Figure 3B**). Sustained expression of HOXA9 and MEIS1 in hematopoietic progenitors causes leukemia in mouse models (Kroon et al., 1998), suggesting that these two genes are strong drivers of leukemogenesis. MLL fusion proteins directly bind the promoters of these HSC program genes and constitutively activate their transcription (Ayton and Cleary, 2003; Somervaille and Cleary, 2006; Garcia-Cuellar et al., 2016; Kerry et al., 2017; Okuda et al., 2017). Thus, the MLL fusion protein is a constitutively-active transcriptional machinery that transforms hematopoietic progenitors by aberrantly activating HSC program genes (Krivtsov et al., 2006; Yokoyama, 2017). Although MLL fuses with more than 100 different fusion partners (Meyer et al., 2018), AEP components account for two-thirds of MLL-r leukemia cases (**Figure 3C**), indicating that merging the functions of MLL and AEP is the most efficient way to generate powerful leukemic oncogenes. Among the AEP components, AF4 is the most frequent fusion partner for MLL, while AF5Q31 and LAF4 also fuse with MLL in rare cases of leukemia (Ma and Staudt, 1996; Taki et al., 1999; Meyer et al., 2018). The second most frequent fusion partner is AF9 (also known as MLLT3) (Meyer et al., 2018), which is a homolog of ENL. ENL and AF9 constitute the ENL protein family and form a fusion with MLL in one-third of MLL-r leukemia cases. ELL, which also binds to the AF4 family protein, fuses with MLL in leukemia (DiMartino et al., 2000; Luo et al., 2001; Lin et al., 2010). These results strongly indicate that the transcriptional activation function of AEP is the central mechanism utilized by MLL fusion proteins in leukemogenesis.

protein. The PWWP domain of LEDGF recognizes di-/tri-methylated histone H3 lysine 36. The CXXC domain of MLL binds to unmethylated CpGs. The PHD finger 3 binds to di-/tri-methylated histone H3 lysine 4 (H3K4me2/3). The minimum targeting module necessary for target recognition (MTM) comprises the PWWP and CXXC domains. HBM, HCF binding motif; PS, processing site; AD, activation domain; FYRN, FY-rich domain N-terminal; FYRC, FY-rich domain C-terminal. (B) Constitutive activation of HSC program genes in leukemogenesis. Expression of HSC program genes progressively decreases during normal differentiation. However, MLL fusion proteins constitutively activate HSC program genes to cause leukemia. (C) Ratio of fusion partners in MLL-r leukemia cases. Relative frequency of each fusion partner is shown in a pie chart (adapted from the report by Meyer et al., 2018). AEP components such as AF4, AF9, and ENL account for two-thirds of MLL-r leukemia. (D) A model of target recognition by MLL fusion proteins. First, the MLL fusion protein associates with MENIN, while LEDGF binds to nucleosomes containing H3K36me2/3 marks. Next, the MLL fusion/MENIN complex forms a stable complex with LEDGF on the promoter of HSC program genes. The MTM-ENL fusion protein binds the same promoters as MLL fusion proteins.

### TARGET CHROMATIN OF MLL AND MLL FUSION PROTEINS

The role of the MLL portion of MLL fusion proteins is mainly target recognition. Genome-wide ChIP-seq analysis showed that MLL fusion proteins bound to the target chromatin of wild-type MLL near transcription start sites (Guenther et al., 2005, 2008; Wang et al., 2011; Okuda et al., 2017). Most of the target genes bound by MLL fusion proteins are included within the list of MLL target genes (Milne et al., 2005; Wang et al., 2011), and some genes are exclusively regulated by wild-type MLL (Artinger and Ernst, 2013; Li et al., 2013). Wild-type MLL retains various chromatin reader modules, including plant homeodomain (PHD) fingers and a bromodomain, which are lost in MLL fusion proteins (**Figure 3A**). Thus, the mode of target recognition is expected to be somewhat different between wild-type MLL and MLL fusion proteins. For example, PHD finger 3 has been shown to bind to di-/tri-methylated histone H3 lysine 4 (H3K4me2/3) (Wang et al., 2010), which plays a significant role in target recognition by wild-type MLL (Milne et al., 2010). These observations indicate that the MLL portion retained by MLL fusion proteins confers the ability to recognize a subset of, but not all of, wild-type MLL target genes. Whether the presence of wild-type MLL is required for MLL fusion-dependent leukemic transformation has been a topic of discussion. One study showed that the remaining wild-type allele is required for leukemogenesis (Thiel et al., 2010). More recently, however, another study showed that MLL is dispensable, while MLL2 (also known as KMT2B), the closest homolog of MLL, plays a major role in sustaining leukemogenesis (Chen et al., 2017), indicating complex redundancy and independence among the target genes of MLL fusion proteins, MLL, and MLL2.

### MECHANISM OF TARGET RECOGNITION BY MLL FUSION PROTEINS

The structural requirements of MLL fusion-dependent leukemic transformation can be evaluated using the ex vivo myeloid progenitor transformation assay (Lavau et al., 1997). In this assay, a retrovirus carrying an MLL fusion gene is transduced into murine bone marrow-derived immature hematopoietic progenitors and the cells are cultured ex vivo in semi-solid media containing the required cytokines. Transduction of a functional MLL fusion gene results in the cells expressing high Hoxa9 levels and continuing to proliferate after rounds of replating, while non-transduced cells stop proliferating during early passages (Ayton and Cleary, 2003). Transformed cells can be cultured for more than 5 months and are considered "immortalized." Immortalization is an important feature of leukemic transformation and reflects the aberrant self-renewal of cancer cells. Using this assay, domains within the MLL-ENL fusion that are required for transformation were identified (Lavau et al., 1997; Slany et al., 1998; Ayton et al., 2004). The N-terminal region upstream of the AT hooks was shown to be required for transformation (**Figure 3A**). This region contains a motif responsible for the strong association with MENIN (hMBM: the high affinity MENIN binding motif) (Yokoyama et al., 2004, 2005). MLL-MENIN association triggers further association with LEDGF through the LEDGF binding domain (LBD) (Yokoyama and Cleary, 2008; Huang et al., 2012). LEDGF contains the PWWP domain, which specifically binds nucleosomes with di-/tri-methylated histone H3 lysine 36 (H3K36me2/3) (Eidahl et al., 2013; Okuda et al., 2014; Zhu et al., 2016) (**Figure 3D**). Mutant constructs of MLL-ENL lacking the hMBM or the LBD failed to transform myeloid progenitors because sequential association of MENIN and LEDGF is critical for leukemic transformation. However, an artificial construct tethering the PWWP domain to the MLL-ENL mutant lacking the hMBM transformed myeloid progenitors, indicating that MENIN's primary role is to incorporate the PWWP domain into the MLL-ENL complex and that other structures of MENIN and LEDGF are dispensable (Yokoyama and Cleary, 2008). Further structure/function analysis demonstrated that only three domains of the MLL-ENL complex are required for leukemic transformation: the PWWP domain of LEDGF, the CXXC domain of MLL, and the ENL portion (Okuda et al., 2014) (**Figures 3A,D**). The CXXC domain specifically binds to unmethylated CpGs (Birke et al., 2002; Allen et al., 2006; Cierpicki et al., 2010). Because an artificial construct composed of the PWWP and CXXC domains can target the promoters of HSC program genes, the combination of these two domains is defined as the minimum targeting module (MTM) (**Figure 3A**).
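
The domain requirements described above can be summarized as a small Boolean sketch. The Python below is illustrative only: the class and field names (`Construct`, `hmbm`, `tethered_pwwp`, and so on) are hypothetical, and the rule encodes nothing beyond the three requirements named in the text (a PWWP domain, the CXXC domain, and the ENL portion), with the PWWP domain acquired either by direct tethering or through the sequential MENIN and LEDGF association. It is not a quantitative or predictive model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Construct:
    """Toy description of an MLL-ENL-derived construct (hypothetical names)."""
    name: str
    hmbm: bool = False           # high affinity MENIN binding motif
    lbd: bool = False            # LEDGF binding domain
    tethered_pwwp: bool = False  # PWWP domain fused directly to the construct
    cxxc: bool = False           # CXXC domain (binds unmethylated CpGs)
    enl_portion: bool = False    # fusion partner portion


def has_pwwp(c: Construct) -> bool:
    # The PWWP domain is supplied either by direct tethering or by the
    # sequential MENIN (via hMBM) and LEDGF (via LBD) association.
    return c.tethered_pwwp or (c.hmbm and c.lbd)


def transforms(c: Construct) -> bool:
    # Three requirements named in the text: PWWP, CXXC, and the ENL portion.
    return has_pwwp(c) and c.cxxc and c.enl_portion


constructs = [
    Construct("MLL-ENL", hmbm=True, lbd=True, cxxc=True, enl_portion=True),
    Construct("MLL-ENL lacking hMBM", lbd=True, cxxc=True, enl_portion=True),
    Construct("MLL-ENL lacking LBD", hmbm=True, cxxc=True, enl_portion=True),
    Construct("PWWP tethered, lacking hMBM", tethered_pwwp=True, lbd=True,
              cxxc=True, enl_portion=True),
    Construct("MTM-ENL", tethered_pwwp=True, cxxc=True, enl_portion=True),
]

for c in constructs:
    print(f"{c.name:30s} transforms={transforms(c)}")
```

Modeling the PWWP requirement as a separate helper keeps the two routes to LEDGF function (direct tethering versus MENIN-dependent recruitment) explicit, which is the point the tethering experiment makes.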

### DI-/TRI-METHYLATED HISTONE H3 LYSINE 36

Many epigenetic modifiers possess PWWP domains. For example, BRPF1 has a PWWP domain that also binds to H3K36me2 and H3K36me3 and can functionally substitute for that of LEDGF, indicating that the PWWP domain is a chromatin reader module for H3K36me2/3 in general (Vezzoli et al., 2010; Okuda et al., 2014). H3K36me2 marks are found in active gene promoters and are introduced by histone methyltransferases (HMTs) such as ASH1L and NSD2 (Kuo et al., 2011; Zhu et al., 2016). When transcription is ongoing, H3K36me2 marks in the gene body region are further methylated to H3K36me3 by another HMT, SETD2, which is complexed with elongating RNAP2 (Wagner and Carpenter, 2012). This H3K36me3 modification highlights transcribed regions and is required for an efficient DNA damage response (Mar et al., 2017). Accordingly, heterozygous loss of SETD2, which blunts the DNA damage response to chemotherapy, is often found in relapsed leukemia (Mullighan et al., 2011; Mar et al., 2014; Xiao et al., 2016). SETD2 was recently reported to physically interact with MLL fusion proteins and may also be implicated in the efficient targeting of MLL fusion proteins to their target promoters (Skucha et al., 2018).

### UNMETHYLATED CG DNA SEQUENCE

Unmethylated CG DNA sequences, which are specifically recognized by the CXXC domain, are enriched in gene promoters and are linked to transcription initiation (Cedar and Bergman, 2009; Bird, 2011). If the cytosine of a CpG is methylated, the CXXC domain no longer binds to the CG sequence (Allen et al., 2006). Unmethylated CpGs are an epigenetic mark of non-silenced promoters because methylation of CpGs in the promoter is associated with transcriptional silencing. Through the PWWP and CXXC domains, the MLL fusion complex targets transcriptionally active CpG-rich promoters (**Figure 3D**). During embryogenesis, MLL maintains segment-specific expression of HOX genes (Yu et al., 1995). HOX genes are called "cellular memory" genes as their position-specific expression patterns are maintained during development (Deschamps and van Nes, 2005; Wang et al., 2009). MLL is not required for initial activation of HOX gene expression, but is required for the maintenance of HOX gene expression (Yu et al., 1998), indicating that MLL is involved in maintaining an established expression pattern, rather than in determining the expression pattern itself. Therefore, it is likely that MLL targets CpG-rich promoters that were previously transcriptionally active in the mother cell and re-activates transcription in daughter cells to maintain the HOX gene expression patterns.

### ADDITIONAL MECHANISMS OF TARGET RECOGNITION BY MLL FUSION PROTEINS

Recent research showed that the target chromatin of MLL fusion proteins is not confined to the promoter region. Localization of MLL fusion proteins spreads into the gene body at some MLL target genes, which are often hypomethylated and highly transcribed (Kerry et al., 2017). It has been suggested that this aberrant localization is implicated in disease progression. Moreover, in some genes MLL-ENL localizes near the transcription end site and activates gene expression predominantly at the level of transcription elongation (Garcia-Cuellar et al., 2016). These results suggest that MLL fusion proteins can be involved in multiple facets of gene activation through binding outside of the promoter region, even though their primary targets are CpG-rich promoter regions. Moreover, the structure juxtaposed to the CXXC domain of MLL has been shown to associate with the PAF1 complex, which is thought to bind elongating RNAP2 (Milne et al., 2010; Muntean et al., 2010). PAF1 association may be involved in the fine-tuning of target recognition and/or the noncanonical targeting mechanisms mentioned above.

### SL1-MEDIATED TRANSCRIPTIONAL ACTIVATION BY MLL FUSION PROTEINS

While MLL fusion proteins target previously active CpG-rich promoters through their MLL portions, the fusion partner portion confers the ability to activate transcription. The minimum domains within the fusion partner portion required for transformation have been identified for MLL-ENL and MLL-AF5Q31 (Slany et al., 1998; Yokoyama et al., 2010; Okuda et al., 2015). The ANC1 homology domain (AHD) of ENL and the C-terminal homology domain (CHD) of AF5Q31 are responsible for Hoxa9 transcriptional activation and for the transformation of myeloid progenitors (**Figure 4A**). These domains are the binding platforms for AF4, suggesting that aberrant recruitment of AF4 to MLL target promoters is essential for MLL fusion-dependent transformation. Therefore, functional modules retained within the AF4 portion were thought to be responsible for transformation. An artificial construct in which the pSER domain of AF4 is tethered to the MTM activated Hoxa9 expression and immortalized myeloid progenitors (**Figure 4B**), indicating that recruitment of the SL1 complex to MLL target promoters is necessary and sufficient for transformation. The minimum structure required for transformation was the region encompassing the SDE motif and the NKW motif, which recruit SL1 and activate transcription (Okuda et al., 2015, 2016). These results suggest that the MLL fusion proteins transform myeloid progenitors via SL1-mediated transcriptional activation through AF4. Surprisingly, other functional modules such as the N-terminal homology domain (NHD), which recruits P-TEFb, and the AF4/LAF4/FMR2 homology (ALF) domain (Nilson et al., 1997), which recruits ELL, were dispensable despite the much anticipated significance of these elongation factors. Taken together, these observations suggest that the major role of MLL fusion proteins in leukemic transformation is not to activate transcription elongation but to activate transcription initiation via SL1. Similarly, the DLXLS motif, which recruits Mediator, was also dispensable for transformation (Okuda et al., 2016), suggesting that direct recruitment of Mediator by the MLL fusion complex is not critical either. Hence, SL1-mediated transcriptional activation by RNAP2 is the rate-limiting step for MLL fusion-dependent gene activation (**Figures 5A,B**).

### MAINTENANCE OF MLL FUSION-DEPENDENT TRANSCRIPTION BY DOT1L

Transcriptional maintenance is also required for MLL fusion-mediated leukemogenesis. MLL target genes are prone to gene silencing by transcriptional repressors such as SIRT1 histone deacetylase (Chen C. W. et al., 2015). To maintain gene expression, MLL fusion proteins utilize the DOT1L HMT. DOT1L is an epigenetic modifier that produces mono-, di-, and tri-methylated histone H3 lysine 79 marks (H3K79me1/2/3) (Feng et al., 2002; Jones et al., 2008). DOT1L forms a complex with AF10 family proteins (AF10/AF17) and ENL family proteins (ENL/AF9) (Okada et al., 2005; Mueller et al., 2007; Mohan et al., 2010) (**Figure 4A**). The association of DOT1L with AF10 increases DOT1L HMT activity (Deshpande et al., 2014). The AF10 family genes also form MLL fusion genes that cause leukemia (DiMartino et al., 2002). MLL-ENL and MLL-AF10 directly recruit DOT1L to the target promoters, suggesting that aberrant DOT1L recruitment contributes to leukemogenesis (Okuda et al., 2017). Moreover, MLL fusion-transformed myeloid progenitors lose their clonogenicity upon acute loss of the Dot1l gene (Chang et al., 2010; Bernt et al., 2011; Jo et al., 2011; Nguyen et al., 2011; Chen et al., 2013). These observations indicated that the continuous presence of DOT1L is required for leukemic transformation and led to the development of a DOT1L HMT inhibitor for the treatment of MLL-r leukemia (Daigle et al., 2011). EPZ-5676 (Pinometostat), a potent DOT1L inhibitor, showed remarkable efficacy in rodent xenograft models (Daigle et al., 2013), confirming that DOT1L-dependent transcriptional maintenance is required for MLL fusion proteins. However, the mode of DOT1L recruitment is somewhat controversial. Some studies suggest that AF4 proteins form a stable complex with DOT1L (Bitoun et al., 2007; Lin et al., 2016). In contrast, our biochemical data suggest that AF4 family proteins do not directly associate with DOT1L (Yokoyama et al., 2010). AF4 family proteins and DOT1L bind to the AHD of ENL family proteins in a mutually exclusive manner (Mueller et al., 2007; Yokoyama et al., 2010; Okuda et al., 2017). Structural studies showed that similar hydrophobic motifs in DOT1L and AF4 bind to the same groove-like structure in AHD (Leach et al., 2013; Shen et al., 2013; Kuntimaddi et al., 2015). Therefore, AF4-ENL association and DOT1L-ENL association via AHD should be mutually exclusive. I postulate that AF4 proteins cannot form a complex with DOT1L because of this structural restraint, but do not exclude the possibility that AF4 and DOT1L can be tethered in alternative indirect manners.

FIGURE 4 | Structural requirements of MLL fusion proteins for leukemic transformation. (A) Schematic representation of the structures of various MLL fusion constructs. MLL fusion proteins recruit AF4 and ENL through different mechanisms to immortalize hematopoietic progenitors. The minimal construct of MLL-ENL which immortalizes hematopoietic progenitors (MTM-ENL') is composed of MTM and the AHD of ENL, while that of MLL-AF5Q31 (MTM-AF5-4) is composed of MTM and the CHD of AF5Q31, indicating that recruitment of AF4 to the MLL target chromatin confers transforming activity. By contrast, the minimal transforming construct of MLL-AF10 (MTM-TRX2-AF10') is composed of MTM, the TRX2 domain, and the OMLZ domain of AF10, indicating that MLL-AF10 requires the TRX2 domain for AF4 recruitment and the OMLZ domain for ENL recruitment. In concordance with this notion, the MTM-TRX2-DOT1L construct immortalizes hematopoietic progenitors whereas deletion of the ENL binding domain (MISD) results in loss of transformation. Dotted lines indicate protein-protein interaction. Associated properties of each construct, such as the ability to immortalize myeloid progenitors, and the binding abilities to AF4, DOT1L, and ENL are shown on the right. Immortalizing ability and AF4 binding ability through the TRX2 domain are highlighted in blue and red, respectively. MISD: minimum interaction site for DOT1L. One MISD, located at residues 628–653, was omitted because of its very weak affinity (Kuntimaddi et al., 2015). N.A., not applicable. (B) Schematic representation of the structures of various MTM-AF4 fusion constructs. MTM-AF4 fusion proteins recruit SL1 through the SDE motif and activate transcription through the NKW motif to immortalize hematopoietic progenitors. Binding modules for P-TEFb or ELL did not confer transforming ability. Associated properties of each construct, including the ability to immortalize myeloid progenitors and the binding abilities to SL1, MED26, P-TEFb, and ELL, are shown on the right.

### MECHANISMS OF GENE ACTIVATION BY MLL-ENL AND MLL-AF10

The minimum domain structure of the MLL portion required for transformation by MLL-AF10 differs from that required by MLL-ENL (**Figure 4A**). As for MLL-ENL, the MTM and the AHD, which are sufficient to recruit both AF4 and DOT1L to the MLL target genes, confer transforming ability (Okuda et al., 2014, 2015). In contrast, the MTM fused with the OMLZ domain of AF10 (MTM-AF10′), which recruits DOT1L but not AF4, exhibited relatively high Hoxa9 expression in first round colonies but could not maintain its expression for a longer period and was unable to immortalize myeloid progenitors (Okuda et al., 2017). This indicated that DOT1L recruitment to the MLL-target promoter upregulates target gene expression insufficiently for immortalization. However, incorporation of the TRX2 domain of MLL into this MTM-AF10′ fusion construct (MTM-TRX2-AF10′) resulted in full transformation, indicating that the TRX2 domain confers additional functions to achieve full leukemic transformation. Given that one function retained by the AHD, but missing from the OMLZ domain, is the ability to recruit AF4 family proteins, we examined whether the TRX2 domain associates with AF4 family proteins on chromatin. To this end, we used the fractionation-assisted chromatin immunoprecipitation (fanChIP) method (Okuda et al., 2014), which enables us to capture protein complexes bound to chromatin. Indeed, the TRX2 domain associated with AF4 proteins (Okuda et al., 2017). Thus, MLL-AF10 recruits AF4 and DOT1L through the TRX2 domain and the OMLZ domain, respectively, to immortalize hematopoietic progenitors (**Figure 5C**).

Moreover, artificial MLL-DOT1L constructs exhibited similar structural requirements (**Figure 4A**). An artificial construct in which the MTM is fused to the entire DOT1L coding sequence (MTM-DOT1L) exhibited the same compromised transforming property as the MTM-AF10′ construct, while incorporation of the TRX2 domain (MTM-TRX2-DOT1L) conferred full transforming ability. These results confirmed that recruitment of both AF4 and DOT1L is required for MLL-AF10-dependent transformation. Deletion of the ENL binding domains (MISD: the minimum interaction site of DOT1L) from the MTM-TRX2-DOT1L construct (MTM-TRX2-DOT1L dMISD) resulted in loss of transformation, indicating that the ENL-DOT1L association is required for MLL-AF10-dependent transformation. This result contradicts a previous report which showed that an artificial fusion of MLL and the HMT domain of DOT1L (MLL-DOT1L HMT), lacking the ENL binding domains, transformed myeloid progenitors in a similar setting (Okada et al., 2005). However, we consistently obtain no transformation readout using this MLL-DOT1L HMT construct (Yokoyama et al., 2010), supporting our conclusion that simply recruiting DOT1L HMT activity to the MLL target chromatin is insufficient for leukemic transformation. Based on these results, I propose a model in which MLL-AF10 promotes AEP formation on nearby chromatin through AF4 recruitment by the TRX2 domain and ENL recruitment by DOT1L (**Figure 5C**). Thus, both MLL-ENL and MLL-AF10 appear to activate transcription in an AEP/SL1-dependent manner.
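
The construct outcomes reported above can be checked against a simple rule, sketched here in Python. This is a toy summary and not a published algorithm: the function name `immortalizes`, the set names, and the `DOT1L_dMISD` label are hypothetical, and the rule is only one reading of the proposed model, namely that a construct transforms if it carries the MTM and either a direct AF4 binding platform (AHD or CHD) or both the TRX2 domain and a module that brings in DOT1L with an intact ENL binding site.

```python
# Toy rule summarizing the transformation outcomes reported in the text for
# MTM-based constructs. Module names follow the text; the rule itself is an
# illustrative assumption.

# Modules described as direct binding platforms for AF4 family proteins.
DIRECT_AF4_PLATFORMS = {"AHD", "CHD"}
# Modules that bring in DOT1L with an intact ENL binding site (MISD).
DOT1L_WITH_ENL_BINDING = {"OMLZ", "DOT1L"}


def immortalizes(modules):
    """Return True if the rule predicts full transformation for the construct."""
    modules = set(modules)
    if "MTM" not in modules:
        return False  # no targeting module, so no recruitment to target promoters
    if modules & DIRECT_AF4_PLATFORMS:
        return True   # AF4 is recruited directly through the AHD or CHD
    # Otherwise AF4 must come through TRX2 and ENL through DOT1L.
    return "TRX2" in modules and bool(modules & DOT1L_WITH_ENL_BINDING)


# Outcomes as reported in the text (True = full transformation).
REPORTED = {
    "MTM-ENL'": ({"MTM", "AHD"}, True),
    "MTM-AF5-4": ({"MTM", "CHD"}, True),
    "MTM-AF10'": ({"MTM", "OMLZ"}, False),
    "MTM-TRX2-AF10'": ({"MTM", "TRX2", "OMLZ"}, True),
    "MTM-DOT1L": ({"MTM", "DOT1L"}, False),
    "MTM-TRX2-DOT1L": ({"MTM", "TRX2", "DOT1L"}, True),
    "MTM-TRX2-DOT1L dMISD": ({"MTM", "TRX2", "DOT1L_dMISD"}, False),
}

for name, (modules, reported) in REPORTED.items():
    predicted = immortalizes(modules)
    status = "agrees" if predicted == reported else "DISAGREES"
    print(f"{name:22s} predicted={predicted!s:5s} reported={reported!s:5s} {status}")
```

Treating the MISD-deleted DOT1L as a separate module label is a design shortcut: it lets the rule express loss of ENL binding without modeling DOT1L's internal domains.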

### THE ROLE OF TRX2 DOMAIN OF MLL

It is unclear how AF4 proteins associate with the TRX2 domain of MLL. Interaction between MLL and AF4 was not detected in conventional IP analysis (Yokoyama et al., 2010). Thus far, this association has only been detected in the chromatin context (Okuda et al., 2017). It is possible that some other chromatin-bound factors mediate the interaction between MLL and AF4 through the TRX2 domain. It has been reported that the region containing the TRX2 domain also associates with SHARP1, which may be involved in the interaction between MLL and AF4 (Numata et al., 2018). Although the MLL 5′ portion retains the TRX2 domain, it cannot activate transcription of Hoxa9 or transform myeloid progenitors without its fusion partner portion (Lavau et al., 1997; Slany et al., 1998; Okuda et al., 2017), indicating that AF4 bound to the TRX2 domain is transcriptionally inactive and needs to form an AEP complex with ENL on nearby chromatin to become functional (**Figure 5C**). Supporting this hypothesis, ENL knockdown in MLL-AF10-transformed cells resulted in loss of colony forming ability (Okuda et al., 2017).

### MECHANISM OF TARGET RECOGNITION BY THE DOT1L COMPLEX

Whether DOT1L must be directly recruited by MLL fusion proteins also remains unclear. MLL-ENL directly associates with DOT1L and AF4 through the AHD (Mueller et al., 2007; Yokoyama et al., 2010; Leach et al., 2013; Kuntimaddi et al., 2015). ChIP-seq analysis of HB1119 cells (which express MLL-ENL) showed remarkable overlap of the ChIP signals of MLL-ENL, AF4, and DOT1L (Okuda et al., 2017). However, it is unclear whether MLL-AF4 directly recruits DOT1L to the target chromatin. Several reports have demonstrated interaction between DOT1L and MLL-AF4 or AF4 by immunoprecipitation (Bitoun et al., 2007; Lin et al., 2016). Moreover, H3K79me2 marks produced by DOT1L are associated with MLL-AF4 target genes (Krivtsov et al., 2008; Kerry et al., 2017), suggesting a mechanism in which MLL-AF4 directly recruits DOT1L to its target chromatin. However, our biochemical data demonstrated that AF4 proteins do not form a stable complex with DOT1L (Yokoyama et al., 2010; Okuda et al., 2017). This indicates that DOT1L may autonomously target chromatin similar to that targeted by MLL and AEP without the help of MLL fusion proteins. The DOT1L complex retains its own chromatin binding modules. AF10 family proteins specifically bind to unmodified histone H3 lysine 27 (H3K27me0) through their PHD finger-Zn knuckle-PHD finger (PZP) domain (Chen S. et al., 2015) (**Figure 4A**). The YEATS domain of ENL binds to acetylated histone H3 lysine 9/18/27 (Li et al., 2014; Erb et al., 2017; Wan et al., 2017). With those chromatin reader modules, the DOT1L complex possibly binds to its target chromatin by itself. Because AEP and the DOT1L complex both share the ENL family proteins as a component, they should target the same chromatin containing acetylated histone H3 K9/18/27. Consistent with this hypothesis, AEP and the DOT1L complex colocalized at the promoter-proximal regions of ENL target genes in HEK293T cells (Okuda et al., 2017). The binding affinity of ENL family proteins for AF4 family proteins is stronger than that for DOT1L in vitro (Leach et al., 2013). Therefore, the DOT1L complex likely provides ENL family proteins to AF4 family proteins nearby. As such, the DOT1L complex promotes the chromatin association of AEP. AEP/SL1-mediated transcription may in turn stimulate recruitment of the DOT1L complex in a feedback loop mechanism, as mono-ubiquitination of histone H2B, which is coupled with transcription, stimulates DOT1L-dependent methylation of histone H3 (McGinty et al., 2008). Such interplay between AEP and the DOT1L complex appears to be present and needs to be investigated in more detail in the future.
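
The competition between AF4 and DOT1L for the same site on ENL family proteins can be illustrated with the standard single-site competitive binding expression. This is a textbook equilibrium sketch, not an analysis from the cited studies: the concentrations and dissociation constants below are hypothetical placeholders (the text gives no values), the function name `ahd_occupancy` is an assumption, and free ligand concentrations are assumed to be known.

```python
def ahd_occupancy(af4_conc, af4_kd, dot1l_conc, dot1l_kd):
    """Fractional occupancy of one site by two mutually exclusive ligands.

    Concentrations are free-ligand concentrations in the same units as the
    dissociation constants. All inputs used below are hypothetical.
    """
    a = af4_conc / af4_kd        # binding term for AF4
    d = dot1l_conc / dot1l_kd    # binding term for DOT1L
    total = 1.0 + a + d          # free site + AF4-bound + DOT1L-bound
    return {"free": 1.0 / total, "AF4": a / total, "DOT1L": d / total}


# Hypothetical numbers: equal ligand concentrations, tenfold tighter AF4 binding.
fractions = ahd_occupancy(af4_conc=100.0, af4_kd=10.0,
                          dot1l_conc=100.0, dot1l_kd=100.0)
for state, value in fractions.items():
    print(f"{state:6s} {value:.3f}")
```

Under these assumed inputs most of the site is occupied by AF4, which is the qualitative behavior the handover of ENL from the DOT1L complex to AF4 relies on; real affinities would have to be taken from the binding studies themselves.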

### DIRECT RECRUITMENT OF AEP NOT DOT1L IS REQUIRED FOR LEUKEMIC TRANSFORMATION BY MLL FUSION PROTEINS

Structure/function analysis data using the myeloid progenitor transformation assay indicate that modules that recruit AEP, but not DOT1L, are necessary and sufficient for transformation by MLL fusion proteins (**Figure 4A**). For instance, MLL fused with the CHD of AF5Q31, which binds to AF4 but not DOT1L, transformed myeloid progenitors (Okuda et al., 2015, 2017). The MLL-Af4 fusion gene, in which the human MLL gene is fused to the murine Af4 gene, was shown to transform hematopoietic progenitors to develop leukemia in vivo (Lin et al., 2016). An artificial construct in which MLL is fused to the CHD portion of murine AF4 successfully induced leukemia in vivo (Lin et al., 2017). These observations favor the model that direct recruitment of AF4 family proteins, but not DOT1L, is the critical step for MLL fusion-dependent gene activation, and that the DOT1L complex targets similar chromatin autonomously to maintain transcription (**Figure 5B**). Formation of the MLL fusion/MENIN complex can be inhibited by small-molecule compounds (MI-2, MI-2-2, MI-463, MI-503) (Grembecka et al., 2012; Shi et al., 2012). Simultaneous inhibition of MENIN-MLL interaction and DOT1L HMT activity synergistically induces differentiation of MLL fusion-associated leukemia cells (Dafflon et al., 2017; Okuda et al., 2017), supporting the notion that the MLL fusion complex and the DOT1L complex collaborate to induce leukemia.

### HOW DOES SL1 ACTIVATE RNAP2-DEPENDENT TRANSCRIPTION?

Multiple lines of evidence support the notion that SL1 is involved in AEP-dependent gene activation. For example, genome-wide ChIP-seq analysis demonstrated that TAF1C colocalizes with AF4 and RNAP2 at transcription start sites (Okuda et al., 2015, 2017). Taf1c knockdown causes downregulation of AEP-target genes in mouse embryonic fibroblasts (Okuda et al., 2015). Moreover, the pSER domain, which is the binding platform for SL1, can be functionally replaced by a transcriptional activation domain for RNAP2. However, the precise mechanism by which SL1 activates RNAP2-dependent transcription is still unknown. A GAL4-pSER fusion protein activates RNAP2-dependent transcription on an artificial GAL4-responsive promoter containing a TATA box. This activation is abolished if the TATA box is removed, indicating that the pSER domain promotes loading of TBP onto the TATA box in the form of SL1. After loading of TBP, SL1 must be dismantled and TAF1B then needs to be replaced by TFIIB, with which it shares structural and functional similarities (Naidu et al., 2011). This leads RNAP2 to form an RNAP2-PIC (**Figure 2C**). The NKW motif of the pSER domain is required for transactivation and transformation and therefore is expected to play an important role. I speculate that the NKW motif binds to SL1 to induce conformational changes that facilitate TBP loading and/or TFIIB exchange.

### WHY DOES MLL FUSION PREFER AEP COMPONENTS AS ITS FUSION PARTNER?

It is odd that MLL predominantly prefers AEP components as its fusion partner. AEP/SL1-dependent gene activation seems much less potent than that of other activation domains that recruit TFIID (Okuda et al., 2015). Yet, AEP components are preferentially chosen by MLL. This suggests that AEP/SL1-dependent transcriptional activation has some advantage over TFIID-dependent transcriptional activation. Transcription oscillates during the cell cycle (Gottesfeld and Forbes, 1997; Liu et al., 2017). The metaphase Cyclin/CDK complex phosphorylates SL1, which hinders the interaction between SL1 and UBF, to shut off rRNA transcription during metaphase (Heix et al., 1998). UBF is also inactivated by phosphorylation during metaphase (Klein and Grummt, 1999). Interestingly, SL1 is re-activated in early G1 while UBF remains inactive, suggesting a role for SL1 in addition to rRNA transcription. Therefore, it is possible that AEP/SL1-dependent RNAP2 activation starts in early G1 phase. Unlike many sequence-specific transcription factors, MLL is tethered to chromatin during mitosis, setting the stage for efficient transcription in the next early G1 phase (Blobel et al., 2009). Perhaps transcription of those MLL target genes starts as soon as the next G1 begins, potentially explaining the preference for AEP components as MLL fusion partners. Such a biological property may be suited for persistent transcriptional activation of HSC program genes by the MLL/AEP axis. Given the important roles of SL1 in RNAP1-dependent transcription, gene knockout technologies are not applicable to address these questions. Emerging rapid degradation technology used in the studies of factors essential for viability (Baptista et al., 2017; Warfield et al., 2017) may need to be applied to SL1 components to provide further insights. Because MLL fusion proteins heavily rely on AEP/SL1-dependent gene activation, compounds that inhibit this gene activation process could be used as drugs to treat MLL-r leukemia patients.

### CONCLUDING REMARKS

In conclusion, accumulating evidence indicates that RNAP2-dependent transcription mediated by SL1 is a central mechanism used by MLL fusion proteins. However, much of its molecular mechanism remains undocumented and needs to be investigated. Hopefully, precise mechanisms of transcriptional activation by AEP and SL1 will be revealed in greater detail over the next decade.

### AUTHOR CONTRIBUTIONS

The author confirms being the sole contributor of this work and has approved it for publication.

### ACKNOWLEDGMENTS

This study was supported by a Japan Society for the Promotion of Science (JSPS) KAKENHI grant (16H05337) to AY.

### REFERENCES

Allen, M. D., Grummitt, C. G., Hilcenko, C., Min, S. Y., Tonkin, L. M., Johnson, C. M., et al. (2006). Solution structure of the nonmethyl-CpG-binding CXXC domain of the leukaemia-associated MLL histone methyltransferase. Embo J. 25, 4503–4512. doi: 10.1038/sj.emboj.7601340

Arabi, A., Wu, S., Ridderstråle, K., Bierhoff, H., Shiue, C., Fatyol, K., et al. (2005). c-Myc associates with ribosomal DNA and activates RNA polymerase I transcription. Nat. Cell Biol. 7, 303–310. doi: 10.1038/ncb1225


stimulates transcription of rRNA genes by RNA polymerase I. Nat. Cell Biol. 7, 311–318. doi: 10.1038/ncb1224


specific collaboration with Meis1a but not Pbx1b. Embo J. 17, 3714–3725. doi: 10.1093/emboj/17.13.3714


in MLL fusion-dependent transcription. Cell Cycle 15, 2712–2722. doi: 10.1080/15384101.2016.1222337


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Yokoyama. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Emerging Evidence of Translational Control by AU-Rich Element-Binding Proteins

*Hiroshi Otsuka1, Akira Fukao2, Yoshinori Funakami2, Kent E. Duncan3\* and Toshinobu Fujiwara2\**

*1 Graduate School of Frontier Sciences, University of Tokyo, Kashiwa, Japan, 2 Kindai University, Higashi-osaka, Japan, 3 University Medical Center Hamburg-Eppendorf, Hamburg, Germany*

#### *Edited by:*

*Tohru Yoshihisa, University of Hyogo, Japan*

#### *Reviewed by:*

*Hyouta Himeno, Hirosaki University, Japan; Naoyuki Kataoka, Departments of Applied Animal Sciences and Applied Biological Chemistry, The University of Tokyo, Japan*

#### *\*Correspondence:*

*Kent E. Duncan, kent.duncan@zmnh.uni-hamburg.de; Toshinobu Fujiwara, tosinobu@phar.kindai.ac.jp*

#### *Specialty section:*

*This article was submitted to RNA, a section of the journal Frontiers in Genetics*

*Received: 18 December 2018; Accepted: 28 March 2019; Published: 02 May 2019*

#### *Citation:*

*Otsuka H, Fukao A, Funakami Y, Duncan KE and Fujiwara T (2019) Emerging Evidence of Translational Control by AU-Rich Element-Binding Proteins. Front. Genet. 10:332. doi: 10.3389/fgene.2019.00332*

RNA-binding proteins (RBPs) are key regulators of posttranscriptional gene expression and control many important biological processes including cell proliferation, development, and differentiation. RBPs bind specific motifs in their target mRNAs and regulate mRNA fate at many steps. The AU-rich element (ARE) is one of the major cis-regulatory elements in the 3′ untranslated region (UTR) of labile mRNAs. Many of these encode factors requiring very tight regulation, such as inflammatory cytokines and growth factors. Disruption in the control of these factors' expression can cause autoimmune diseases, developmental disorders, or cancers. Therefore, these mRNAs are strictly regulated by various RBPs, particularly ARE-binding proteins (ARE-BPs). To regulate mRNA metabolism, ARE-BPs bind target mRNAs and affect some factors on mRNAs directly, or recruit effectors, such as mRNA decay machinery and protein kinases to target mRNAs. Importantly, some ARE-BPs have stabilizing roles, whereas others are destabilizing, and ARE-BPs appear to compete with each other when binding to target mRNAs. The function of specific ARE-BPs is modulated by posttranslational modifications (PTMs) including methylation and phosphorylation, thereby providing a means for cellular signaling pathways to regulate stability of specific target mRNAs. In this review, we summarize recent studies which have revealed detailed molecular mechanisms of ARE-BP-mediated regulation of gene expression and also report on the importance of ARE-BP function in specific physiological contexts and how this relates to disease. We also propose an mRNP regulatory network based on competition between stabilizing ARE-BPs and destabilizing ARE-BPs.

Keywords: RNA-binding proteins, AU-rich element, ARE-binding proteins, translational control, mRNA decay

## INTRODUCTION

Transcribed pre-mRNAs are subject to RNA processing in the nucleus, such as capping, polyadenylation, and splicing. Subsequently, processed mRNAs are exported to the cytoplasm (Bjork and Wieslander, 2017). In some cases, mRNAs are immediately translated, but they can also be transported to various subcellular compartments prior to translation. mRNAs are also turned over in the cytoplasm through regulated decay (Garneau et al., 2007). All of these posttranscriptional regulatory steps are important for proper gene expression and are themselves highly regulated. Interaction between RNA-binding proteins (RBPs) and specific cis-regulatory elements in target transcripts is the basis for most posttranscriptional regulation of gene expression (**Figure 1**; Moore, 2005).

Although there are a variety of cis-regulatory elements, for example, the cytoplasmic polyadenylation element (CPE) and the iron-responsive element (IRE) (Charlesworth et al., 2013; Theil, 2015), we focus here on the AU-rich element (ARE), an important cis-element for RNA regulation that is typically found in the mRNA 3′ untranslated region (UTR). AREs are present in 5–8% of human mRNAs, which encode factors involved in various biological functions such as proliferation, differentiation, signal transduction, apoptosis, and metabolism (Barreau et al., 2005; Bakheet et al., 2006). Originally identified as a sequence inducing mRNA decay (Chen and Shyu, 1995), the ARE was subsequently found to be more broadly involved in RNA processing, transport, and translation (Garcia-Maurino et al., 2017).
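As an illustration of how candidate AREs can be located in a 3′UTR sequence, the following minimal sketch scans for two motifs. The motifs are assumptions that this review does not define: the AUUUA pentamer and the UUAUUUA(U/A)(U/A) nonamer, which are commonly used as ARE signatures in the literature. The function name `find_are_motifs` and the example sequence are hypothetical, and curated ARE databases apply additional context rules that are not modeled here.

```python
import re

# Assumed ARE signatures (not defined in this review):
# the AUUUA pentamer and the UUAUUUA(U/A)(U/A) nonamer.
# Lookaheads are used so that overlapping occurrences are all reported.
PENTAMER = re.compile(r"(?=AUUUA)")
NONAMER = re.compile(r"(?=UUAUUUA[UA][UA])")


def find_are_motifs(utr: str) -> dict:
    """Return 0-based start positions of candidate ARE motifs in a 3'UTR.

    Accepts RNA or DNA letters; T is converted to U before scanning.
    """
    rna = utr.upper().replace("T", "U")
    return {
        "pentamer_starts": [m.start() for m in PENTAMER.finditer(rna)],
        "nonamer_starts": [m.start() for m in NONAMER.finditer(rna)],
    }


if __name__ == "__main__":
    # Hypothetical 3'UTR fragment with three overlapping pentamers.
    example_utr = "GCCUUAUUUAUUUAUUUAGGC"
    print(find_are_motifs(example_utr))
```

A motif hit only marks a candidate site. Whether a given ARE-BP actually binds there depends on factors discussed later in this review, such as surrounding RNA structure and competing proteins.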

Many ARE-binding proteins (ARE-BPs) have been identified that bind to this element and mediate its function in posttranscriptional control (**Table 1**). Most ARE-BPs characterized to date recognize specific AREs in target mRNAs *via* canonical RNA-binding domains (RBDs), for example, the RNA recognition motif (RRM), CCCH tandem zinc finger domain, and KH domain (Hall, 2005; Clery et al., 2008; Nicastro et al., 2015). However, recently developed techniques have identified many new *bona fide* RBPs and revealed the surprising finding that about half of them do not have a conventional RBD (Castello et al., 2012; Beckmann et al., 2015, 2016). Intriguingly, these noncanonical RBPs include several ARE-BPs, and analysis of their potential contributions to ARE function is underway (Garcin, 2018).

FIGURE 1 | Posttranscriptional regulation of gene expression by RBPs. After transcription, RBPs bind pre-mRNA and regulate RNA processing in the nucleus. Mature mRNA is transported to the cytoplasm by other RBPs. In the cytoplasm, various RBPs control the different mRNA fates, which include localization, translation, and degradation. Collectively, these effects achieve proper gene expression within specific cell types and in response to specific biological regulatory signals. They can also lead to pathological conditions when regulation is compromised, for example, due to mutations in the gene encoding a specific RBP.

A common characteristic of many ARE-BPs is that they shuttle between nucleus and cytoplasm, but they exhibit different functions depending on their localization to control gene expression (Gama-Carvalho and Carmo-Fonseca, 2001). ARE-BP localization and function are both tightly regulated by posttranslational modifications (PTMs) and interactions with other factors (Chen and Shyu, 2014; Shen and Malter, 2015).

In this review, we summarize (1) the function of ARE-BPs to control mRNA stability or translation in the cytoplasm and RNA processing in the nucleus, (2) the biological and pathological importance of gene regulation by ARE-BPs, and (3) the regulation of ARE-BP function, particularly through PTM.

### mRNA STABILITY AND TRANSLATIONAL CONTROL BY ARE-BPs IN THE CYTOPLASM

AUF1, also known as heterogeneous nuclear ribonucleoprotein D (hnRNP D), was the first identified ARE-BP that can destabilize mRNA (Brewer, 1991). AUF1-KO mice exhibit symptoms of severe endotoxic shock due to excessive production of tumor necrosis factor-α (TNF-α) and interleukin-1β (IL-1β), which results from failure to degrade these mRNAs (Lu et al., 2006b). AUF1 was also found to destabilize mRNAs encoding c-fos and c-myc (Brewer, 1991; Loflin et al., 1999), although another study reported that AUF1 stabilizes these mRNAs (Xu et al., 2001). These apparently conflicting results suggest that the function of AUF1 is not fixed, but can be differentially regulated depending on the cell type and specific conditions (Gouble et al., 2002). AUF1 forms the AUF1- and signal transduction-regulated complex (ASTRC) with several factors [eIF4G, poly(A)-binding protein (PABP) C1, Hsp27, and Hsp70] (Laroia et al., 1999; Lu et al., 2006a; Sinsimer et al., 2008). This complex is required for AUF1-mediated mRNA decay, but its molecular mechanism of action is still unknown.

TTP is a destabilizing ARE-BP with a well-characterized molecular mechanism. This protein has a tandem zinc finger RBD, binds the 3′UTR of mRNAs encoding TNF-α and granulocyte macrophage colony-stimulating factor (GM-CSF), and induces mRNA decay (Lai et al., 1999; Lai and Blackshear, 2001). The mRNA coding for TTP also contains AREs in its 3′UTR, and thus, TTP regulates its own expression level by negative feedback (Brooks et al., 2004; Tchen et al., 2004). TTP recruits the CCR4-NOT complex to target mRNAs *via* direct binding to its subunits, CNOT1 and CNOT9 (Fabian et al., 2013; Bulbrook et al., 2018). TTP also interacts with the Dcp1a/Dcp2 complex involved in decapping and a component of the exosome, Rrp4, to degrade mRNA (Lykke-Andersen and Wagner, 2005). Furthermore, TTP represses translation by recruitment of 4EHP to target mRNAs through interaction between its PPPPG motif and GYF2 (**Figure 2A**; Tao and Gao, 2015; Fu et al., 2016). 4EHP has affinity for the 5′-end cap structure like eIF4E, but does not bind eIF4G. Therefore, 4EHP represses translation by competing with eIF4E for the cap (Morita et al., 2012). The TIS11 family, to which TTP belongs, also contains two other members, ZFP36L1 and ZFP36L2. Although these factors differ from each other in their tissue distribution and target mRNAs, they have about 70% homology, including the CNOT1 binding site, and both induce mRNA decay (Sanduja et al., 2011).

*Blue indicates canonical ARE-BPs, and red indicates noncanonical ARE-BPs.*

K-homology splicing regulatory protein (KSRP) was initially identified as a nuclear factor involved in transcription and splicing (Davis-Smyth et al., 1996; Min et al., 1997). Subsequently, it was reported that KSRP binds the ARE using two of four KH domains, KH3 and KH4 (Gherzi et al., 2004), and destabilizes target mRNAs by recruitment of poly(A)-specific ribonuclease (PARN) and exosome to mRNAs (Chen et al., 2001; Chou et al., 2006). Furthermore, it was shown that KSRP interacts with the enterovirus 71 internal ribosomal entry site (IRES) and behaves as an IRES trans-acting factor (ITAF) to negatively regulate viral translation (Lin et al., 2009).

Unlike the ARE-BPs introduced so far, Hu proteins are ARE-BPs that stabilize their target mRNAs. The Hu protein family consists of four members. HuR is ubiquitously expressed, whereas HuB, HuC, and HuD are mainly expressed in neurons. All Hu proteins have three RRMs. RRM1 and RRM2 recognize the ARE, and RRM3 binds the poly(A) tail (Ma et al., 1997). HuR binds to the ARE in the mRNAs encoding c-fos, Cox-2, and TNF-α in competition with TTP or KSRP and stabilizes these mRNAs (Fan and Steitz, 1998b; Katsanou et al., 2005). Furthermore, HuR associates with eIF2 alpha kinase 4 and may temporally define translation in the developing neocortex (Kraushar et al., 2014). Neuronal Hu proteins are thought to regulate and induce neuronal differentiation through stabilizing target mRNAs (Okano and Darnell, 1997; Mobarak et al., 2000; Akamatsu et al., 2005). Fukao et al. previously showed that HuD stimulates translation initiation *via* direct binding to the poly(A) tail and eIF4A (Fukao et al., 2009). Furthermore, Fujiwara et al. demonstrated that physical interaction between HuD and the active form of Akt/PKB is required for morphological alterations such as neurite outgrowth in PC12 cells undergoing a neuronal differentiation program (Fujiwara et al., 2012). Akt/PKB directly phosphorylates eIF4B, whose phosphorylation stimulates the RNA helicase activity of eIF4A (Rozen et al., 1990; Altmann et al., 1993; van Gorp et al., 2009). Thus, it is possible that HuD recruits Akt/PKB to the translation initiation complex to stimulate eIF4A activity on its ARE-containing mRNAs (**Figure 2B**).

### NUCLEAR FUNCTION OF ARE-BPs

Regulation of mRNA stability, localization, and translation is a cytoplasmic function of ARE-BPs, yet most ARE-BPs shuttle between nucleus and cytoplasm, thereby suggesting that these proteins also have nuclear functions. Indeed, several nuclear functions for ARE-BPs have been identified. For example, in recent years, it was shown that KSRP has a novel nuclear function involved in maturation of a subset of microRNAs (Ruggiero et al., 2009; Trabucchi et al., 2009). KSRP binds to a terminal loop of miRNA precursors and promotes both steps of biogenesis: conversion of pri-miRNAs to pre-miRNAs in the nucleus by Drosha and pre-miRNA processing to mature miRNAs in the cytoplasm by Dicer (Trabucchi et al., 2009).

Hu proteins have a domain regulating nuclear-cytoplasmic shuttling located in a linker region between RRM2 and RRM3 (Fan and Steitz, 1998a; Kasashima et al., 1999). A recent study showed that AREs are abundant in introns of human genes and that HuR regulates expression of genes containing these intronic AREs (Bakheet et al., 2018). The pre-mRNAs coding for HuR undergo alternative polyadenylation leading to transcript variants with different lengths of 3′UTR and stability (Al-Ahmadi et al., 2009). Because HuR impairs neuronal differentiation by promoting cell proliferation, neuronal Hu proteins decrease HuR expression by binding to the pre-mRNA of HuR at the polyadenylation site to produce a less stable mRNA bearing the long 3′UTR (Mansfield and Keene, 2012). Neuronal Hu proteins are also involved in neuron-specific alternative splicing by utilizing AUF1 as a co-factor (Fragkouli et al., 2017).

TIS11 family proteins have a potential nuclear localization signal within the zinc finger domain (Murata et al., 2002; Phillips et al., 2002; Twyffels et al., 2013). In the nucleus, TTP in association with poly(A)-binding protein nuclear 1 (PABPN1) inhibits poly(A) tail synthesis on mRNAs which contain AREs, such as TNF-α, GM-CSF, and IL-10, thereby promoting degradation of these transcripts (Su et al., 2012). Under hypoxia, ZFP36L1 has been reported to reduce the expression level of Delta-like 4 (Dll4), which is involved in cell fate determination in angiogenesis, by inhibiting cleavage at the polyadenylation site of the *Dll4* mRNA (Kume, 2009; Desroches-Castan et al., 2011).

## NONCANONICAL ARE-BPs

Recently, systematic investigation of RBPs has been performed in various cell types (yeast and cultured cells) by interactome capture assays (Castello et al., 2012; Beckmann et al., 2015, 2016). Protein-RNA interactions are fixed either by conventional UV crosslinking (cCL) with 254 nm UV irradiation or by photoactivatable ribonucleoside-enhanced crosslinking (PAR-CL) with 365 nm UV irradiation of cells that have taken up photoactivatable 4-thiouridine (4SU). Then, mRNA-RBP complexes are captured by oligo(dT) beads, and the proteins are analyzed by mass spectrometry after digestion of mRNAs. As a result, many novel RBPs were detected. Surprisingly, about half of these have no conventional RBD (Castello et al., 2012; Beckmann et al., 2015, 2016). Many well-known metabolic enzymes are among these noncanonical RBPs. A typical example of a metabolic enzyme that has been identified as a noncanonical RBP is ACO1/IRP1. When iron levels are in the normal physiological range, ACO1/IRP1 functions as a cytoplasmic aconitase in the TCA cycle. However, in iron-deficient conditions, ACO1/IRP1 behaves as a sequence-specific RBP that recognizes a certain stem-loop structure, the iron-responsive element (IRE) (Constable et al., 1992). ACO1/IRP1 binds the 3′UTR of the mRNA encoding the transferrin receptor, which is involved in iron uptake, and stabilizes this mRNA (Casey et al., 1988; Mullner and Kuhn, 1988). It also binds to an IRE in the 5′UTR of the mRNA encoding ferritin, a protein involved in iron storage. In this case, it inhibits translation (Hentze et al., 1987), thereby regulating the intracellular iron level. This classic example of a metabolic enzyme moonlighting as an RBP illustrates how cellular metabolic states can be intimately connected with posttranscriptional regulation of gene expression (Castello et al., 2015).

Further evidence to support this principle is found in the glycolytic enzyme, glyceraldehyde-3-phosphate dehydrogenase (GAPDH), which is a noncanonical ARE-BP (Nagy and Rigby, 1995). GAPDH binds the ARE *via* a Rossmann fold which binds NAD+/NADH, and thus, NAD+ abundance affects binding activity of GAPDH to the ARE (Nagy and Rigby, 1995; Rodriguez-Pascual et al., 2008; Ikeda et al., 2012). Indeed, a switch from oxidative phosphorylation to aerobic glycolysis when T lymphocytes are activated promotes dissociation of GAPDH from the ARE in the mRNA coding for interferon-γ (IFN-γ) and increases expression of IFN-γ (Chang et al., 2013). GAPDH also binds to mRNAs containing AREs, such as those encoding colony stimulating factor-1 (CSF-1), cyclooxygenase-2 (Cox-2), and endothelin-1 (ET-1), and regulates stability or translation of these mRNAs (Rodriguez-Pascual et al., 2008; Zhou et al., 2008; Ikeda et al., 2012). Likewise, lactate dehydrogenase (LDH) M, which is a glycolytic enzyme, also binds an ARE in the mRNA coding for GM-CSF by a Rossmann fold in an NAD+ concentration-dependent manner (Pioli et al., 2002). Moreover, LDHM directly interacts with AUF1. This interaction is thought to complement low binding specificity of AUF1, which also binds various RNAs even without AREs (Kiledjian et al., 1997; Eversole and Maizels, 2000), and to be utilized for recruitment of AUF1 to target mRNAs (Pioli et al., 2002).

### BIOLOGICAL FUNCTIONS OF ARE-BPs IN HEALTH AND DISEASE

The fact that AREs are found mainly in mRNAs coding for inflammatory cytokines and growth factors suggests the potential for coordinated regulation of specific biological processes by ARE-BPs (Barreau et al., 2005; Khabar, 2017; Turner and Diaz-Munoz, 2018).

ZFP36L1 and ZFP36L2 have redundant functions in T-cell and B-cell maturation (Hodson et al., 2010; Galloway et al., 2016). During T-cell maturation, ZFP36L1 and ZFP36L2 limit the cell cycle and repress the DNA damage response induced by double-strand DNA breaks (Vogel et al., 2016). Moreover, ZFP36L1 promotes monocyte/macrophage differentiation by controlling mRNA stability of CDK6 (Chen et al., 2015). It was reported that mice that lack the N-terminal 29 amino acids of ZFP36L2 are infertile (Ramos et al., 2004; Ramos, 2012), due to failure to control expression of luteinizing hormone receptor (LHR) by ZFP36L2 (Ball et al., 2014). More recently, oocyte-specific KO of ZFP36L2 in mice showed that this protein controls expression of histone demethylases targeting H3K4 and H3K9 and induces global transcriptional silencing in the oocyte, which is important for the oocyte-to-embryo transition (Dumdie et al., 2018).

Neuronal Hu proteins are involved in alternative splicing of amyloid precursor protein (APP) (Fragkouli et al., 2017). The APP gene contains 18 exons and 3 isoforms: APP770 contains all exons, APP751 lacks exon 8, and APP695 lacks exons 7 and 8. In the brain of Alzheimer's disease patients, APP695 is decreased, whereas APP770 is increased (Moir et al., 1998; Matsui et al., 2007). Neuronal Hu proteins promote expression of APP695 instead of APP770 (Fragkouli et al., 2017). On the other hand, HuD stabilizes the mRNAs for APP, as well as β-site APP-cleaving enzyme 1 (BACE1), which induces processing from APP to amyloid-β (Kang et al., 2014).
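The exon composition of the three APP isoforms described above can be written down as a small data structure, shown here as a minimal sketch. The exon-skipping pattern is taken from the text. The helper names are hypothetical, and the residue arithmetic assumes that the number in each isoform name is its length in amino acids.

```python
# Exons skipped in each isoform, as stated in the text (18 exons in total).
SKIPPED_EXONS = {
    "APP770": set(),
    "APP751": {8},
    "APP695": {7, 8},
}


def isoform_exons(name: str, n_exons: int = 18) -> list:
    """List the exon numbers retained in the named APP isoform."""
    skipped = SKIPPED_EXONS[name]
    return [exon for exon in range(1, n_exons + 1) if exon not in skipped]


def residues_removed(longer: str, shorter: str) -> int:
    """Difference in protein length between two isoforms.

    Assumption: the numeric suffix of an isoform name (e.g. 770 in APP770)
    is its length in amino acids.
    """
    return int(longer[3:]) - int(shorter[3:])


if __name__ == "__main__":
    for name in SKIPPED_EXONS:
        print(name, len(isoform_exons(name)), "exons")
    # Skipping exon 8 alone, then exon 7 in addition.
    print(residues_removed("APP770", "APP751"))
    print(residues_removed("APP751", "APP695"))
```

Under that naming assumption, the length differences between isoforms give the number of residues contributed by exon 8 alone and by exon 7 alone, which is a quick consistency check on the splicing pattern.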

ARE-BPs are also implicated in other neurological disorders. A human genetics study identified TIA1 mutations in amyotrophic lateral sclerosis (ALS) and frontotemporal dementia (FTD) patients (Mackenzie et al., 2017). Interestingly, this same study showed that these mutations promote phase separation of TIA1 protein and affect the dynamics of stress granules, which are themselves suggested to be important in ALS pathology (Li et al., 2013; Zhang et al., 2018).

HuD may also be involved in ALS and another neurological disorder, spinal muscular atrophy (SMA). Direct evidence for a contribution of HuD to ALS is lacking, but it was found to form insoluble aggregates in the cytoplasm with TDP-43, an RBP heavily implicated in ALS and FTD (Fallini et al., 2011), thereby raising the possibility of pathological interactions between these two RBPs. In addition to nuclear functions in pre-mRNA processing, TDP-43 also represses translation of specific mRNAs in *Drosophila* ALS models and cultured mammalian cells (Majumder et al., 2012, 2016; Coyne et al., 2014, 2017), although the exact connection between these mRNAs and ALS remains unclear. More recently, TDP-43 was also shown to function as an mRNA-specific translational enhancer for the mRNAs encoding CAMTA1 and DENND4A, both of which are directly linked to ALS and neurodegenerative disease (Neelagandan et al., 2019). Whether HuD contributes to this regulation remains to be determined. However, the *Camta1* and *Dennd4a* mRNAs both contain many AREs based on *in silico* analyses (Fallmann et al., 2016). This observation, taken together with HuD's ability to function as an mRNA-specific translational enhancer *via* AREs (Fukao et al., 2009), raises the possibility that HuD might function as a co-factor in TDP-43-driven translational enhancement of *Camta1* and *Dennd4a* mRNAs.

SMA is caused by lack or mutation of survival of motor neuron protein (SMN) (Burghes and Beattie, 2009). SMN interacts with HuD on mRNAs such as the one coding for candidate plasticity-related gene 15 (cpg15), and forms an RNA granule (Akten et al., 2011; Fallini et al., 2011; Hubers et al., 2011). The Tudor domain of SMN is important for interaction between SMN and HuD, and an SMN mutant from severe SMA patients bearing a mutation in the Tudor domain cannot interact with HuD (Buhler et al., 1999; Fallini et al., 2011; Hubers et al., 2011). What does this interaction between HuD and SMN mean? We previously reported translation stimulation by HuD (Fukao et al., 2009). Another group showed that SMN represses translation of certain mRNAs, and the Tudor domain mutant of SMN is not able to repress translation (Sanchez et al., 2013). Repression of ectopic translation and induction of translation initiation in response to local stimulatory cues are important components of local translation in neuronal compartments. Moreover, SMN is closely involved in axonal translation (Bernabo et al., 2017). Therefore, an interesting possibility is that SMN and HuD could have opposite, but complementary, roles in the context of neuronal mRNP transport and translation. According to this view, SMN could function as a brake to suppress ectopic translation while mRNPs are transported to sites where local protein synthesis would occur. Conversely, HuD's role would be to promote translation initiation at these sites in response to local neuronal cues. Future studies in primary neuronal cultures could examine this possibility.

### REGULATION OF ARE-BP FUNCTION

As can be seen from the examples of SMN and HuD, functional regulation of ARE-BPs is strongly related to biological functions and diseases. Thus, function of ARE-BPs is controlled by several factors such as long noncoding (lnc) RNA, other proteins, and PTMs.

H19 is an lncRNA expressed in embryo and skeletal muscle (Bartolomei et al., 1991). A recent study showed that H19 directly binds KSRP and promotes destabilization of the mRNA for myogenin by KSRP, thus favoring myogenic differentiation (Giovarelli et al., 2014). Overexpressed in colon carcinoma-1 (OCC-1), an lncRNA, binds HuR and enhances binding of the β-TrCP1 E3-ubiquitin ligase, thereby promoting destabilization of the HuR protein (Pibouin et al., 2002; Lan et al., 2018).

Arginine methylation is a common feature of a large population of RGG box proteins, which are involved in many aspects of mRNA metabolism (Rajyaguru and Parker, 2012). In some cases, arginine methylation has been shown to regulate the function of ARE-BPs containing RGG boxes. For example, HuR is methylated at R217 by coactivator-associated arginine methyltransferase 1 (CARM1), and methylated HuR binds the mRNA encoding the histone deacetylase Sirtuin 1 (SIRT1) to stabilize it (Calvanese et al., 2010). Many hnRNP proteins that contain RGG boxes are also subject to arginine methylation, thereby potentially affecting their localization and RNA-binding activity (Yu, 2011). However, while it was reported that hnRNP D/AUF1 is methylated, this did not seem to affect either localization or RNA-binding activity of AUF1 (DeMaria et al., 1997; Sarkar et al., 2003). Recently, it was shown that arginine methylation of AUF1 is involved in translational repression of the mRNA coding for vascular endothelial growth factor (VEGF) (Fellows et al., 2013). Furthermore, this PTM also affects AUF1's role as a host factor during the replication of the West Nile virus genome (Friedrich et al., 2014). In this case, arginine methylation of AUF1 by protein arginine N-methyltransferase 1 (PRMT1) promotes AUF1 function as an RNA chaperone (Friedrich et al., 2016).

Many ARE-BPs are phosphorylated. In some cases, the regulatory effects of phosphorylation, as well as the signaling pathways and kinases responsible, have been determined. For example, KSRP has two independent phosphorylation sites in its C-terminal and KH1 domains (Briata et al., 2005; Diaz-Moreno et al., 2009). The C-terminal Thr692 of KSRP is phosphorylated by p38/MAPK to promote destabilization of target mRNAs (Briata et al., 2005). On the other hand, Ser193 in the KH1 domain is phosphorylated by Akt/PKB to localize KSRP in the nucleus *via* binding of 14-3-3 proteins, thereby inhibiting mRNA decay in the cytoplasm (Diaz-Moreno et al., 2009). The TIS11 family proteins TTP, ZFP36L1, and ZFP36L2 are also phosphorylated by p38/MAPK or Akt/PKB and recognized by 14-3-3 proteins (Chrestensen et al., 2004; Schmidlin et al., 2004; Benjamin et al., 2006). Phosphorylation of TTP at Ser52 and Ser178 reduces interaction with the CCR4-NOT complex and thereby upregulates mRNA stability (Marchese et al., 2010). Conversely, phosphorylation at Ser334 of ZFP36L1 also decreases interaction with the CCR4-NOT complex, but increases affinity for Dcp1a to promote mRNA decay (Rataj et al., 2016).

HuD is subject to phosphorylation by PKC, which promotes its mRNA-stabilizing activity (Lim and Alkon, 2012). On the other hand, we previously demonstrated that HuD interacts with Akt/PKB, although Akt/PKB does not phosphorylate HuD (Fujiwara et al., 2012). This interaction might recruit Akt/PKB, which phosphorylates and inactivates destabilizing ARE-BPs such as KSRP and TIS11 family proteins, to ARE-containing mRNAs. This suggests that HuD can not only compete with destabilizing ARE-BPs but also potentially inactivate them on the same mRNA through phosphorylation by Akt/PKB to stabilize mRNA. We also showed that HuD attenuates translational repression by the miRNA-induced silencing complex (miRISC), which, like destabilizing ARE-BPs, leads to mRNA decay (Fukao et al., 2014). These observations support a central role for HuD in stabilizing mRNA and promoting translation (**Figure 2B**).
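The idea of competition between a stabilizing and a destabilizing ARE-BP for the same ARE can be made concrete with a simple equilibrium calculation. The sketch below is an illustration under stated assumptions, not a model proposed in this review: it assumes a single binding site, mutually exclusive binding, and free protein concentrations approximately equal to total concentrations. The function name, concentrations, and dissociation constants are hypothetical.

```python
def are_occupancy(conc_stab, kd_stab, conc_destab, kd_destab):
    """Equilibrium occupancy of one ARE by two mutually exclusive binders.

    Assumptions (illustrative only): a single site, simple 1:1 binding,
    and protein in excess so that free concentration ~ total concentration.
    Concentrations and Kd values must share the same unit.
    """
    if kd_stab <= 0 or kd_destab <= 0:
        raise ValueError("dissociation constants must be positive")
    a = conc_stab / kd_stab        # statistical weight of stabilizer-bound state
    b = conc_destab / kd_destab    # statistical weight of destabilizer-bound state
    z = 1.0 + a + b                # partition sum including the unbound state
    return {"free": 1.0 / z, "stabilizer": a / z, "destabilizer": b / z}


if __name__ == "__main__":
    # Hypothetical numbers: equal affinities, stabilizer in 4-fold excess.
    print(are_occupancy(conc_stab=40.0, kd_stab=10.0,
                        conc_destab=10.0, kd_destab=10.0))
```

In this picture, a PTM that lowers the affinity of a destabilizing ARE-BP corresponds to raising `kd_destab`, which shifts occupancy toward the stabilizer without any change in protein abundance.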

### CONCLUSION AND PERSPECTIVES

The ARE has been studied for a long time, and about 20 ARE-BPs have been identified since the discovery of the first ARE-BP, AUF1 (Brewer, 1991; Garcia-Maurino et al., 2017). The specific target mRNAs for different ARE-BPs, as well as their molecular functions on these mRNAs and the contribution of this regulation to specific biological processes, are gradually being uncovered. However, with a few exceptions, the molecular mechanisms used by ARE-BPs to regulate their targets are still unknown. In particular, the mechanism to recognize and control specific targets from the large number of transcripts that have AREs is an open question. Recently, Ball et al. revealed that ZFP36L2, but not ZFP36L1, recognizes one of three AREs in the 3′UTR of the mRNA encoding LHR, and this ARE is located within a hairpin structure (Ball et al., 2014, 2017). This indicates that not only the ARE sequence but also proximal RNA secondary structure affects the binding specificity of ARE-BPs. Future experimental and *in silico* analyses aimed at understanding the determinants of ARE recognition by specific ARE-BPs will thus need to incorporate RNA structure as well as sequence. Moreover, as shown in the example of LDHM and AUF1 (Pioli et al., 2002), it will also be necessary to study the influence of the interaction between ARE-BPs on specific ARE recognition and molecular regulatory mechanisms on the same transcripts. Finally, systematic studies have shown that the relative spacing of 3′UTR cis-elements and associated regulatory proteins can have strong contextual effects on regulation (Pique et al., 2008). Thus, to understand fully ARE-BP function and mechanism, it will be important to examine interplay between AREs, ARE-BPs, and other neighboring cis-elements within specific 3′UTRs.

### AUTHOR CONTRIBUTIONS

HO, AF, YF, KD, and TF wrote and discussed the article.

### REFERENCES

dependent nucleocytoplasmic shuttling. *J. Biol. Chem.* 277, 11606–11613. doi: 10.1074/jbc.M111457200

AUF1 protein complexes, functions in AU-rich element-mediated mRNA decay. *Mol. Cell. Biol.* 28, 5223–5237. doi: 10.1128/MCB.00431-08


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Otsuka, Fukao, Funakami, Duncan and Fujiwara. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Recent Progress on the Molecular Mechanism of Quality Controls Induced by Ribosome Stalling

Ken Ikeuchi, Toshiaki Izawa and Toshifumi Inada\*

Gene Regulation Laboratory, Graduate School of Pharmaceutical Sciences, Tohoku University, Sendai, Japan

Accurate gene expression is a prerequisite for all cellular processes. Cells actively promote correct protein folding, which prevents the accumulation of abnormal and nonfunctional proteins. Translation elongation is a fundamental step in gene expression that ensures cellular functions, and abnormal translation arrest is recognized and resolved by quality control systems. Recent studies have demonstrated that the ribosome plays crucial roles as a hub for gene regulation and quality control. Ribosome-interacting factors are critical for the quality control mechanisms that respond to abnormal translation arrest by targeting its products for degradation. Aberrant mRNAs are produced by errors in mRNA maturation steps, cause aberrant translation, and are eliminated by the quality control system. In this review, we focus on recent progress on two quality controls for abnormal translational elongation, Ribosome-associated Quality Control (RQC) and No-Go Decay (NGD). These quality controls recognize aberrant ribosome stalling and induce rapid degradation of aberrant polypeptides and mRNAs, thereby maintaining protein homeostasis and preventing protein aggregation.

#### Edited by:

Akio Kanai, Keio University, Japan

#### Reviewed by:

Hyouta Himeno, Hirosaki University, Japan; Yoshihiro Shimizu, Riken, Japan

\*Correspondence: Toshifumi Inada tinada@m.tohoku.ac.jp; tinada@mail.pharm.tohoku.ac.jp

#### Specialty section:

This article was submitted to RNA, a section of the journal Frontiers in Genetics

Received: 11 October 2018; Accepted: 22 December 2018; Published: 17 January 2019

#### Citation:

Ikeuchi K, Izawa T and Inada T (2019) Recent Progress on the Molecular Mechanism of Quality Controls Induced by Ribosome Stalling. Front. Genet. 9:743. doi: 10.3389/fgene.2018.00743

Keywords: ribosome, ribosome stalling, no-go mRNA decay, ribosome-associated quality control, ribosome ubiquitination

### INTRODUCTION

Protein synthesis is a fundamental step of gene expression in all organisms. Translation elongation is perturbed by particular sequences, for instance poly-adenosine tracts (Dimitrova et al., 2009), tandem rare codons such as the yeast arginine CGA rare codon (Doma and Parker, 2006; Chen et al., 2010; Letzring et al., 2013), inhibitory di-codon pairs (Gamble et al., 2016), oxidized RNA (Simms et al., 2014) and robust higher-order mRNA structures (Doma and Parker, 2006; Tsuboi et al., 2012), or by severe cellular conditions including amino acid starvation (Guydosh and Green, 2014), tRNA deficiency (Ishimura et al., 2014), oxidative stress (Simms et al., 2014) and genetic mutations (LaRiviere et al., 2006). Ribosome stalling at such sites perturbs ribosome recycling and leads to the production of aberrant truncated proteins. Cells have quality control systems that recognize ribosome stalling and eliminate the aberrant mRNAs and proteins. A ribosome stalled by tandem CGA codons or a KKK codon cluster is subjected to Ribosome-associated Quality Control (RQC), which induces co-translational degradation of the arrest products (Dimitrova et al., 2009; Bengtson and Joazeiro, 2010; Brandman et al., 2012). The RQC machinery is well conserved from yeast to human cells and is involved not only in cytosolic protein quality control but also in mitochondrial function, protein aggregation, and neurodegeneration (Wilson et al., 2007; Chu et al., 2009; Bengtson and Joazeiro, 2010; Brandman et al., 2012; Brandman and Hegde, 2016; Shao and Hegde, 2016; Garzia et al., 2017; Matsuo et al., 2017; Sitron et al., 2017; Sundaramoorthy et al., 2017; Juszkiewicz et al., 2018; Kuroha et al., 2018; Szadeczky-Kardoss et al., 2018; Verma et al., 2018; Zurita Rendon et al., 2018). No-go mRNA decay (NGD) is a eukaryotic mRNA quality control system that is triggered by endonucleolytic cleavage in the vicinity of the stall site, followed by exoribonucleolytic decay (van Hoof et al., 2002; Doma and Parker, 2006). In this review, we mainly describe the mechanisms of RQC and NGD in yeast, where they have been most extensively investigated.
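As a rough illustration of two of the sequence features listed above (tandem CGA codons and poly-adenosine tracts), the following Python sketch scans a coding sequence for candidate stall sites. The function name `find_stall_motifs` and both thresholds (at least 2 consecutive in-frame CGA codons, at least 12 consecutive adenosines) are assumptions chosen for illustration only; they are not values taken from the studies cited here, and a motif hit does not by itself imply stalling in vivo.

```python
import re


def find_stall_motifs(cds, min_cga_repeats=2, min_poly_a=12):
    """Return sorted (kind, start, end) tuples for candidate stall motifs.

    cds is assumed to start in frame (position 0 is the first base of a codon).
    Thresholds are illustrative assumptions, not experimentally derived cutoffs.
    Coordinates are 0-based, end-exclusive.
    """
    cds = cds.upper().replace("U", "T")  # accept RNA or DNA alphabets
    hits = []

    # Tandem CGA codons: walk codon by codon so that matches stay in frame.
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    run_start = None
    for idx, codon in enumerate(codons + [""]):  # sentinel closes a trailing run
        if codon == "CGA":
            if run_start is None:
                run_start = idx
        else:
            if run_start is not None and idx - run_start >= min_cga_repeats:
                hits.append(("tandem_CGA", run_start * 3, idx * 3))
            run_start = None

    # Poly(A) tracts: frame-independent runs of adenosines.
    for match in re.finditer(r"A{%d,}" % min_poly_a, cds):
        hits.append(("polyA", match.start(), match.end()))

    return sorted(hits, key=lambda hit: hit[1])
```

For example, calling `find_stall_motifs("ATG" + "CGA" * 3 + "GGC" + "A" * 12 + "TAA")` reports one tandem CGA run and one poly(A) tract.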

### RIBOSOME-ASSOCIATED QUALITY CONTROL

### Ribosome Ubiquitination Is Required for the Subunit Dissociation in RQC

Dissociation of stalled ribosomes into 40S and 60S subunits is an essential step to initiate RQC. Recent studies have demonstrated that ubiquitination of stalled ribosomes triggers the subunit dissociation. The E3 ubiquitin ligase Hel2 in yeast and its mammalian homolog ZNF598 play a crucial role in this process (Brandman et al., 2012; Letzring et al., 2013; Saito et al., 2015; Garzia et al., 2017; Matsuo et al., 2017; Sitron et al., 2017; Sundaramoorthy et al., 2017; Juszkiewicz et al., 2018). A current model of the RQC-trigger pathway induced by ribosome stalling is shown in **Figure 1**. Hel2 was initially identified as Histone E3 ligase 2, required for the degradation of excess cytosolic histone proteins (Singh et al., 2012). Hel2 recognizes ribosomes stalled at the CGA rare codon cluster or at poly(A) stretches in the mRNA sequence and mediates K63-linked poly-ubiquitination of the small-subunit ribosomal protein uS10 at K6 and K8 (Matsuo et al., 2017; Ikeuchi et al., 2019). The resulting ubiquitinated ribosomes are split into subunits and then subjected to the downstream RQC pathway (Matsuo et al., 2017). ZNF598 in mammals also recognizes ribosomes stalled at poly(A) stretches. ZNF598 is located in the head region of the 40S subunit and ubiquitinates both uS10 (at K4 and K8) and eS10 (at K138 and K139) (Garzia et al., 2017; Sundaramoorthy et al., 2017; Juszkiewicz et al., 2018). Asc1 in yeast and its mammalian homolog RACK1 are also required for the early, stall-mediated RQC pathway (Kuroha et al., 2010; Letzring et al., 2013; Sitron et al., 2017). RACK1 is responsible for ZNF598-mediated ribosome ubiquitination in human cells (Sundaramoorthy et al., 2017; Juszkiewicz et al., 2018), and Asc1/RACK1 is located on the 40S head near the Hel2/ZNF598 target proteins uS10, eS10, and uS3. Notably, Asc1 is not essential for non-stop RQC (Matsuda et al., 2014; Ikeuchi and Inada, 2016), because Asc1 is vital for stalling on the poly(A) sequence but not for Dom34-mediated ribosome splitting at the 3′ end of the non-stop mRNA (Ikeuchi and Inada, 2016).

### A Disome as a Structural Unit for RQC

Recent studies strongly suggest that a di-ribosome (disome) is a structural unit for RQC (Ikeuchi et al., 2019; Juszkiewicz et al., 2018). Using a mammalian in vitro translation system, Hegde and co-workers demonstrated that ZNF598 preferentially ubiquitinates the disome (Juszkiewicz et al., 2018). The collided di-ribosome is thus a minimal unit for RQC by ZNF598. The collided di-ribosome structure reveals a broad 40S–40S interface where the ubiquitination target of ZNF598 is present (Juszkiewicz et al., 2018). It was proposed that using ribosome collisions, rather than the stall itself, as the signal makes it possible to tune the acceptable degree of deceleration to the initiation rate of that mRNA (Juszkiewicz et al., 2018).
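The proposal that collisions tie the tolerated slowdown to the initiation rate can be made concrete with a deliberately simple toy calculation. This is not a model from the cited work: it assumes that trailing ribosomes reach the stalled ribosome as a Poisson process whose rate equals the initiation rate, and that the stall lasts a fixed time. Under those assumptions, the probability that at least one trailing ribosome collides before the stall resolves is 1 − exp(−k·T). The function name and parameters below are hypothetical.

```python
import math


def collision_probability(initiation_rate, stall_duration):
    """Toy estimate of P(collision) during a stall.

    Assumes (for illustration only) Poisson arrival of trailing ribosomes at
    rate `initiation_rate` (per second) and a fixed `stall_duration` (seconds).
    """
    if initiation_rate < 0 or stall_duration < 0:
        raise ValueError("rate and duration must be non-negative")
    return 1.0 - math.exp(-initiation_rate * stall_duration)
```

In this toy picture, the same 10 s stall gives a collision probability of roughly 0.10 on an mRNA initiated at 0.01 per second but above 0.99 on one initiated at 0.5 per second, which mirrors the idea that a given slowdown is tolerated on poorly initiated mRNAs yet flagged on heavily translated ones.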

In yeast, a CGACCG repeat induces the RQC and NGD quality controls in vivo (Ikeuchi et al., 2019). In an in vitro translation system, the disomes formed on the CGACCG repeat are preferred over monosomes as targets for Hel2-mediated uS10 ubiquitination (Ikeuchi et al., 2019). The cryo-EM structure of the disome revealed that the leading ribosome is stalled in the classical POST-translocation state with an empty A-site and occupied P- and E-sites. The second ribosome is locked in an incomplete translocation step, in a hybrid state with A/P and P/E tRNAs. The interface between the leading and the colliding ribosomes is mainly formed by the small 40S subunits (Ikeuchi et al., 2019). Cryo-EM analysis of the Hel2-bound ribosome revealed that Hel2 preferentially binds to the rotated ribosome with hybrid tRNAs (Matsuo et al., 2017). Therefore, it is possible that Hel2 binds to the colliding ribosome because it is in the rotated form with hybrid tRNAs. Importantly, the 40S inter-ribosomal contact interface brings all proteins targeted by Hel2 during quality control into close proximity. Moreover, the Asc1 (RACK1 in humans) molecules of both ribosomes are in direct contact, forming one of the inter-ribosomal interaction sites in a disome. The disome may therefore represent the ideal substrate for Hel2, allowing it to specifically recognize a prolonged translation stall and to initiate RQC through its E3 ubiquitin ligase activity.

### The RQT Complex Has Potential to Dissociate the Subunit in RQC

Recent studies have identified the RQC trigger (RQT) complex, which is associated with translating ribosomes and is essential for dissociation of stalled ribosomes (Matsuo et al., 2017). The RQT complex is composed of the double RecA helicase domain-containing protein Slh1/Rqt2, the CUE domain-containing ubiquitin-binding protein Cue3/Rqt3 and the C2HC5-type zinc-finger (ZnF) protein Ykr023w/Rqt4 (Matsuo et al., 2017; Sitron et al., 2017). RQT function depends on the ATP hydrolysis motif of Slh1/Rqt2 and is moderately accelerated by the CUE domain of Cue3, indicating that RQT complex-mediated recognition of ubiquitin on the ribosome promotes ribosome splitting. It remains elusive how the RQT complex acts to induce splitting of the stalled ribosome and how the helicase activity of Slh1 works on the stalled complex. Additionally, the function of Rqt4 and the role of its unique ZnF domain are still undefined. In mammals, ASCC3 is similar to yeast Slh1/Rqt2 and is required for mammalian RQC induction (Matsuo et al., 2017). Human orthologs of Rqt3 and Rqt4 remain to be clarified. It has been reported that ASCC3 forms the ASC-1 complex (ASCC) with the CUE domain-containing protein ASCC2, the RNA ligase-like protein ASCC1 and the C2HC5-ZnF protein TRIP4/ASC-1 (Jung et al., 2002). Moreover, ASCC forms a complex with the demethylase ALKBH3 to repair DNA alkylation in nuclear foci, and ASCC2, which binds to K63-linked ubiquitin via its CUE domain, is critical for ASCC-ALKBH3 recruitment to the repair foci (Brickner et al., 2017). At the moment, ASCC2 and TRIP4 are the potential mammalian orthologs of Cue3/Rqt3 and Rqt4, respectively, and might act as the mammalian RQT complex.

### Quality Control of Nascent Proteins on the 60S Subunit

Stalling of ribosomes can generate faulty proteins with cytotoxic properties. To prevent accumulation of such toxic proteins, cells have evolved the RQC complex, which targets them for proteasomal degradation (Bengtson and Joazeiro, 2010; Brandman et al., 2012; Defenouillere et al., 2013; Verma et al., 2013). The RQC complex consists of Ltn1, Rqc1, Rqc2, and Cdc48 (Listerin, TCF25, NEMF and p97 in mammals, respectively). The RQC complex is recruited to the 60S subunit-peptidyl-tRNA complex after splitting of the stalled ribosome (Shao et al., 2013; Shao and Hegde, 2014). Rqc1 is involved in ubiquitination of the nascent polypeptide, yet its function is poorly understood (Brandman et al., 2012; Defenouillere et al., 2013). Rqc2 facilitates recruitment of Ltn1 to the 60S subunit (Lyumkis et al., 2014; Shao et al., 2015). Rqc2 also mediates elongation of stalled nascent polypeptides on the 60S subunit by the C-terminal addition of multiple alanyl and threonyl residues (CAT-tailing) in a template-free and 40S subunit-independent manner (Shen et al., 2015). A recent study has shown that CAT-tailing acts as a fail-safe mechanism for efficient ubiquitination by Ltn1: it pushes lysine residues sequestered in the ribosomal tunnel out to the cytosol, where they become accessible to Ltn1 (Kostova et al., 2017). Cdc48/p97, together with its co-factors Ufd1 and Npl4, acts in the downstream step to extract the nascent polypeptides for proteasomal degradation (Brandman et al., 2012; Defenouillere et al., 2013; Verma et al., 2013). An important question that remains to be solved is the mechanism for releasing the peptidyl-tRNA from the 60S subunit. Recent studies have identified Vms1 (ANKZF1 in mammals) as a peptidyl-tRNA hydrolase in RQC. Vms1/ANKZF1 has a eukaryotic release factor 1 (eRF1)-like domain with a conserved Gln residue that is thought to catalyze peptidyl-tRNA hydrolysis in a similar way to the Gln residue of the conserved GGQ motif in eRF1 (Verma et al., 2018; Zurita Rendon et al., 2018). However, the GGQ motif itself is not conserved: it deviates to GSQ in yeast Vms1 and even to TAQ in human ANKZF1. More recently, ANKZF1 has been reported to function as a tRNA endonuclease rather than as a peptidyl-tRNA hydrolase in vitro (Kuroha et al., 2018). Future structural and biochemical studies will uncover the detailed mechanism of peptidyl-tRNA hydrolysis and tRNA cleavage by Vms1/ANKZF1.

Failure of ubiquitination caused by loss of Ltn1 function leads to accumulation of CAT-tailed proteins (Shen et al., 2015). Recently, several groups have reported that CAT-tailed proteins have a strong propensity to aggregate and cause proteotoxic stress in yeast (Choe et al., 2016; Defenouillere et al., 2016; Yonashiro et al., 2016; Izawa et al., 2017). They sequester multiple essential chaperones and form SDS-resistant aggregates, thereby interfering with general protein quality control pathways. CAT-tailed proteins have also been reported to have an extremely toxic effect on mitochondria. CAT-tailed mitochondrial proteins, which are synthesized in the cytosol and then imported into the mitochondria, sequester multiple essential chaperones, proteases and components of the translation machinery in the mitochondrial matrix, resulting in defective assembly of respiratory chain complexes and cell death (Izawa et al., 2017). In this context, Vms1 also acts as a key player in protecting mitochondria from the toxic effect of CAT-tailed mitochondrial proteins. Vms1 antagonizes Rqc2, thereby preventing CAT-tailing and facilitating the release of stalled mitochondrial proteins to the downstream quality control network in the mitochondrial matrix (Izawa et al., 2017). Vms1 is particularly crucial for stalled mitochondrial proteins that cannot be ubiquitinated by Ltn1 because translation is coupled to protein import across the mitochondrial membranes. Since all the RQC components are conserved in eukaryotes including humans, clearance of stalled proteins may be of general significance for cellular homeostasis. Indeed, a hypomorphic listerin mutation was shown to cause neurodegeneration in mice (Chu et al., 2009). Further studies will clarify the roles of RQC in protein aggregation, mitochondrial dysfunction, and disease progression.

### NO-GO mRNA DECAY AND ROLES OF QUALITY CONTROL FACTORS

No-go mRNA decay (NGD) is a cytosolic quality control system for mRNA that is induced by ribosome stalling. The NGD system was first discovered in Saccharomyces cerevisiae (Doma and Parker, 2006). NGD is conserved in fruit flies and plants, yet it has not been characterized in mammals (Passos et al., 2009; Szadeczky-Kardoss et al., 2018). NGD is triggered by endonucleolytic cleavage of the mRNA in the vicinity of the stalled ribosome, and the resulting 5′- and 3′-fragments are then rapidly degraded by exoribonucleases (**Figure 2**).

### Roles of Ribosome Collision and Ribosome Ubiquitination in NGD

A recent study has proposed that ribosome collision is a critical trigger for mRNA cleavage in the initial step of NGD (Simms et al., 2017). In that study, accumulation of ribosomes on the mRNA was observed as multiple cleavage sites distributed 43–300 nt upstream of the stalling site with approximately 30 nt periodicity, likely reflecting an array of stacked ribosomes. Similar patterns consistent with ribosome collision were observed on the truncated, stop-codon-less reporter mRNA generated by self-cleavage of GFP-Rz (hammerhead ribozyme) and on endogenous truncated mRNAs in the dom34 deletion background (Tsuboi et al., 2012; Guydosh and Green, 2014; Ikeuchi and Inada, 2016). Moreover, ribosome collision seems to be required for mRNA cleavage, because cleavage efficiency decreased when initiation efficiency was reduced by a long 5′-UTR. Hel2 was also proposed as a factor involved in NGD, and Hel2-mediated ubiquitination of ribosomal protein uS3 is associated with ribosome collision. However, whether Hel2 and uS3 ubiquitination are required for NGD has not been established.
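The periodic cleavage pattern described above can be turned into a small worked example. The sketch below lists the upstream offsets at which cleavage would be expected if sites started exactly 43 nt upstream of the stall and recurred every 30 nt up to 300 nt. The exact spacing is an idealization of the "approximately 30 nt" periodicity reported, and the function names are hypothetical.

```python
def expected_cleavage_offsets(first=43, period=30, max_offset=300):
    """Idealized offsets (nt upstream of the stall site) of periodic cleavage sites.

    Defaults follow the 43-300 nt window and ~30 nt periodicity quoted in the
    text; treating the period as exactly 30 nt is a simplifying assumption.
    """
    if period <= 0:
        raise ValueError("period must be positive")
    return list(range(first, max_offset + 1, period))


def to_transcript_positions(stall_pos, offsets):
    """Map upstream offsets to 0-based transcript coordinates.

    Sites that would fall before the 5' end of the transcript are dropped.
    """
    return [stall_pos - off for off in offsets if stall_pos - off >= 0]
```

With the default values the window accommodates nine idealized sites, one per stacked ribosome footprint, which gives a feel for how many queued ribosomes such a pattern would imply.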

Although the mechanism of NGD has been studied using reporter genes, little is known about its endogenous targets. Ribosome footprinting is a powerful method for seeking cleavage sites in endogenous targets (Guydosh et al., 2017). A recent sequencing-based strategy termed 5′ hydroxyl RNA sequencing (5′OH-seq) is a powerful method to identify the 5′ ends of the cleaved intermediates (Peach et al., 2015; Ibrahim et al., 2018). Interestingly, NGD occurs under oxidative stress conditions (Simms et al., 2014), likely because oxidized bases in mRNAs interfere with base-pairing and cause aberrant translation elongation. The levels of K63-linked polyubiquitinated ribosomal proteins, translation factors, and various other proteins were significantly increased upon oxidative stress through down-regulation of the deubiquitinating enzyme Ubp2 (Silva et al., 2015). These results strongly suggest a crucial role for K63-linked polyubiquitination of ribosomal proteins in NGD upon oxidative stress. To address the precise functions of K63-linked polyubiquitination in NGD, the essential E3 ubiquitin ligase and its target sites should be identified.

### A Disome as a Unit for Hel2-Dependent RQC and NGD in Yeast

A recent study by the Beckmann and Inada labs demonstrated that mRNA cleavage induced by the CGA rare codon cluster depends on Hel2-mediated K63-linked polyubiquitination (Ikeuchi et al., 2019). Determination of the cleavage sites by primer extension revealed that endonucleolytic cleavage of an NGD reporter mRNA occurs at sites within a disome unit consisting of the stalled ribosome and the following colliding ribosome. This minimal ribosome collision unit is required to couple NGD and RQC via Hel2. The cleavages in a disome require ubiquitination of uS10 at lysine 6 (K6) or lysine 8 (K8) as well as the activity of the RQT component Slh1/Rqt2. Both Hel2-mediated ubiquitination of uS10 at K6 or K8 and Slh1/Rqt2 are essential for RQC, indicating that NGD and RQC are coupled via this ubiquitination event; this NGD is referred to as NGD<sup>RQC+</sup> (**Figure 3**). Determination of the cleavage sites in RQC-defective mutants, in which Hel2-dependent ubiquitination is defective or Slh1/Rqt2 is deleted, revealed that the NGD pathway can be dissected into two interdependent branches. In the alternative NGD pathway, endonucleolytic mRNA cleavages occur upstream of the stalled disome (referred to as NGD<sup>RQC−</sup>; **Figure 3**). These cleavages require K63-linked polyubiquitination of ribosomal protein eS7. This polyubiquitination proceeds by a two-step mechanism, in which the E3 ligase Not4 first monoubiquitinates eS7, which is followed by Hel2-mediated polyubiquitination. Finally, a dual role was proposed for Hel2, leading to two distinct NGD pathways that require specific ubiquitination events on the stalled disome (**Figure 3**).

FIGURE 3 | A unique structural interface to induce Hel2-driven quality control pathways. Model for quality control pathways induced by R(CGN)12-mediated translation arrest (Ikeuchi et al., 2019). Hel2-mediated ribosome ubiquitination is required both for canonical NGD (NGD<sup>RQC+</sup>) and for RQC coupled to the disome, whereas RQC-uncoupled NGD outside the disome (NGD<sup>RQC−</sup>) takes place in a manner dependent on Not4-mediated monoubiquitination. The arrowheads indicate the endonucleolytic cleavage sites in NGD. The red line indicates the rare codon cluster. Left: When the RQC pathway is intact, the leading ribosome that is stalled by the arrest sequence undergoes RQC. The uS10 ubiquitination and Slh1/Rqt2-dependent subunit dissociation induce the endonucleolytic cleavages in the disome. Right: In the absence of uS10 ubiquitination or Rqt2, RQC of the first ribosome, as well as NGD in the disome, is eliminated. RQC-uncoupled NGD<sup>RQC−</sup> takes place upstream of the disome. The figure concept has been reproduced from the original article (Ikeuchi et al., 2019).
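The two-branch model can be summarized as a small decision function. This is only a boolean reading of the model as described in this section, not an implementation from the cited study: the function name, argument names and return labels are hypothetical, and real phenotypes are quantitative rather than all-or-none. It also assumes that the upstream branch is scored only when the RQC-coupled branch is blocked, as in the mutants described above.

```python
def ngd_outcome(hel2=True, us10_k6_k8=True, slh1=True, not4=True):
    """Predict which NGD branch operates on a stalled disome (qualitative sketch).

    hel2        -- Hel2 E3 ligase is functional
    us10_k6_k8  -- uS10 can be ubiquitinated at K6 or K8
    slh1        -- RQT component Slh1/Rqt2 is active
    not4        -- Not4 can monoubiquitinate eS7 (first step of eS7 polyubiquitination)
    """
    # RQC-coupled branch: cleavage within the disome needs Hel2-mediated uS10
    # ubiquitination and Slh1/Rqt2 activity.
    if hel2 and us10_k6_k8 and slh1:
        return "NGD_RQC+"
    # RQC-uncoupled branch: cleavage upstream of the disome needs eS7
    # polyubiquitination, i.e. Not4 monoubiquitination followed by Hel2.
    if hel2 and not4:
        return "NGD_RQC-"
    # Without Hel2 (or without either ubiquitination route) no cleavage is predicted.
    return "none"
```

For instance, `ngd_outcome(slh1=False)` returns `"NGD_RQC-"`, matching the shift of cleavage sites upstream of the disome in the Slh1/Rqt2 deletion mutant, while `ngd_outcome(hel2=False)` returns `"none"`.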

It is still unknown whether ribosome collision itself is essential for ribosome splitting and endoribonucleolytic cleavage, which result in nascent peptide degradation and mRNA degradation, respectively. Hel2-mediated uS10 ubiquitination is required for RQC and NGD in a disome, and disomes are preferred over monosomes as targets for Hel2-mediated uS10 ubiquitination. This raises the question of whether the endoribonucleolytic cleavage is required for RQC, which is an essential question for understanding the mechanism of the coupled quality controls induced by ribosome stalling.

### PERSPECTIVES


Recent studies have clarified novel molecular mechanisms by which quality control systems recognize stalled ribosomes and eliminate aberrant products. However, many questions remain to be addressed, including the relationship between RQC and NGD and the roles of RQC factors in NGD. Future experiments and analyses will uncover the molecular mechanisms and biological functions of quality controls induced by ribosome stalling.

### AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

### FUNDING

TIn was supported by a Grant-in-Aid for Scientific Research (KAKENHI) from the Japan Society for the Promotion of Science (Grant No. 26116003), and by Research Grants in the Natural Sciences from the Takeda Foundation.

### REFERENCES

yeast Saccharomyces cerevisiae. PLoS One 7:e36295. doi: 10.1371/journal.pone.0036295


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Ikeuchi, Izawa and Inada. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
