# RNA REGULATION IN DEVELOPMENT AND DISEASE

EDITED BY: Maritza Jaramillo, Pascal Chartrand and Chiara Gamberi

PUBLISHED IN: Frontiers in Genetics

#### Frontiers eBook Copyright Statement

The copyright in the text of individual articles in this eBook is the property of their respective authors or their respective institutions or funders. The copyright in graphics and images within each article may be subject to copyright of other parties. In both cases this is subject to a license granted to Frontiers. The compilation of articles constituting this eBook is the property of Frontiers.

Each article within this eBook, and the eBook itself, are published under the most recent version of the Creative Commons CC-BY licence. The version current at the date of publication of this eBook is CC-BY 4.0. If the CC-BY licence is updated, the licence granted by Frontiers is automatically updated to the new version.

When exercising any right under the CC-BY licence, Frontiers must be attributed as the original publisher of the article or eBook, as applicable.

Authors have the responsibility of ensuring that any graphics or other materials which are the property of others may be included in the CC-BY licence, but this should be checked before relying on the CC-BY licence to reproduce those materials. Any copyright notices relating to those materials must be complied with.

Copyright and source acknowledgement notices may not be removed and must be displayed in any copy, derivative work or partial copy which includes the elements in question.

All copyright, and all rights therein, are protected by national and international copyright laws. The above represents a summary only. For further information please read Frontiers' Conditions for Website Use and Copyright Statement, and the applicable CC-BY licence.

ISSN 1664-8714 ISBN 978-2-88963-789-8 DOI 10.3389/978-2-88963-789-8


# RNA REGULATION IN DEVELOPMENT AND DISEASE

Topic Editors: Maritza Jaramillo, Centre de Recherche en Biotechnologie de la Santé Armand Frappier (INRS), Canada; Pascal Chartrand, Université de Montréal, Canada; Chiara Gamberi, Concordia University, Canada

Citation: Jaramillo, M., Chartrand, P., Gamberi, C., eds. (2020). RNA Regulation in Development and Disease. Lausanne: Frontiers Media SA. doi: 10.3389/978-2-88963-789-8

# Table of Contents


*Insights Into Non-coding RNAs as Novel Antimicrobial Drugs*

Gisela Parmeciano Di Noto, María Carolina Molina and Cecilia Quiroga

Drosophila *mRNA Localization During Later Development: Past, Present, and Future*

Sarah C. Hughes and Andrew J. Simmonds

*Differential Regulation of the Three Eukaryotic mRNA Translation Initiation Factor (eIF) 4Gs by the Proteasome*

Amandine Alard, Catherine Marboeuf, Bertrand Fabre, Christine Jean, Yvan Martineau, Frédéric Lopez, Patrice Vende, Didier Poncet, Robert J. Schneider, Corinne Bousquet and Stéphane Pyronnet

# Editorial: RNA Regulation in Development and Disease

#### Pascal Chartrand<sup>1</sup>, Maritza Jaramillo<sup>2</sup> and Chiara Gamberi<sup>3</sup>\*

<sup>1</sup> Department of Biochemistry and Molecular Medicine, Université de Montréal, Montréal, QC, Canada, <sup>2</sup> Institut National de la Recherche Scientifique (INRS) – Centre Armand-Frappier Santé Biotechnologie, Laval, QC, Canada, <sup>3</sup> Biology Department, Concordia University, Montreal, QC, Canada

Keywords: RNA, translational control, RNA-binding proteins, mRNA localization, development, disease

#### **Editorial on the Research Topic**

#### **RNA Regulation in Development and Disease**

A wide variety of post-transcriptional regulatory events in the life of an mRNA have emerged as major checkpoints during its temporal and spatial journey within the cell. The advent of deep sequencing technologies combined with various fractionation or enrichment protocols has produced a wealth of data regarding transcripts, their variants and their interactomes. Yet, these data must be integrated with mechanistic and biological frameworks in order to better understand complex and dynamic regulatory networks that tailor mRNA metabolism and shape the cell proteome in healthy and diseased states.

#### Edited by:

William Cho, Queen Elizabeth Hospital (QEH), Hong Kong

#### Reviewed by:

Peng Jin, Emory University, United States; Félix Recillas-Targa, National Autonomous University of Mexico, Mexico

\*Correspondence:

Chiara Gamberi chiara.gamberi@concordia.ca

#### Specialty section:

This article was submitted to RNA, a section of the journal Frontiers in Genetics

Received: 25 March 2020; Accepted: 07 April 2020; Published: 28 April 2020

#### Citation:

Chartrand P, Jaramillo M and Gamberi C (2020) Editorial: RNA Regulation in Development and Disease. Front. Genet. 11:430. doi: 10.3389/fgene.2020.00430

The articles in this Research Topic review our current knowledge of eukaryotic post-transcriptional gene regulation, from mRNA export out of the nucleus to its localization, translation, and eventual decay. Among several topics related to RNA regulation, this article collection puts a particularly strong emphasis on translational control (i.e., regulation of mRNA translation efficiency) and its impact on localized protein synthesis, downstream transcriptional programs, cellular metabolism, organismal development, and disease pathogenesis.

First, several articles review fundamental RNA-based mechanisms of post-transcriptional gene regulation. Starting in the nucleus, Palazzo and Lee describe the various cis-acting determinants regulating nuclear retention or export of both long non-coding and coding RNAs. Once in the cytoplasm, mRNAs can be sorted to specific subcellular domains, allowing localized translation of these transcripts. In their article, Neriec and Percipalle present the different mechanisms behind this process, focusing on CBF-A/hnRNP AB-mediated mRNA transport and localization. The role of the 3′UTR in modulating mRNA localization, as well as mRNA translation and fate, is reviewed by Mayya and Duchaine. Finally, Karamyshev and Karamysheva discuss various mechanisms involved in quality control of both mRNAs and proteins during translation to prevent production of abnormal proteins.

A second group of articles in this collection focuses more specifically on the roles of RNA-mediated control of cellular metabolism and organismal development. Necessary to produce biological building blocks, regulated translation is key for cell growth and is a downstream target of several signaling pathways that control cellular metabolism. One example is the mammalian or mechanistic target of rapamycin (mTOR) signaling pathway. A review by Cao discusses novel rhythmic functions of mTOR signaling in translational control in neurons, as they regulate their metabolism to suit circadian functions. Another example is the role of ribosome availability in regulating cellular metabolism and the cell cycle. While ribosomes have been considered for a long time as mere executants in the translation program, Calamita et al. discuss novel evidence of ribosome heterogeneity and its impact on differential mRNA translation and ribosomopathies, diseases in which these processes malfunction.

Translational control also plays important roles in developmental processes such as stem cell differentiation, which is the topic of a review by Tahmasebi et al., who describe several mechanisms that control mRNA translation to coordinate stem cell renewal and differentiation. Particularly important during early development, mRNA localization has been extensively researched in Drosophila. While the original studies were carried out in the oocyte and early embryo, most mRNA localization factors are evolutionarily conserved and are expressed in multiple tissues at late developmental stages and in the adult, suggesting that RNA localization may be necessary throughout the lifespan of many organisms to enable structural and functional cellular asymmetries. Hughes and Simmonds illustrate the diversity of mRNA localization patterns in Drosophila and its role in sorting proteins to various subcellular compartments, and reflect on the conservation of the underlying regulation. Finally, this section also includes two original research articles, one on the global transcriptome of adipogenic differentiation in cattle by Cai et al., and a second by Alard et al. on the regulation of the eIF4G translation initiation factors by the proteasome.

The third section of this collection includes several articles on the roles of RNA regulation and mis-regulation in diseases. There is growing appreciation that sets of mRNAs encoding functionally related proteins are coordinately regulated through Untranslated Sequence Elements for Regulation (USER) codes that are "read" by specific RNA-binding proteins. This post-transcriptional regulatory mechanism, referred to as the RNA regulon model, is reviewed by Culjkovic-Kraljacic and Borden, who discuss the concepts of one- and two-tier RNA regulons and explain how their mis-regulation is a feature of diseases such as cancer. Moreover, the authors highlight how the advent and integration of "OMICS" approaches (e.g., RIP-seq, CLIP-seq, and RIP-ChIP) have helped to uncover the RNA interactomes of RNA-binding proteins and the therapeutic potential of redirecting RNA regulons. Dysregulation of signaling cascades or mis-expression of translation initiation factors frequently occurs in cancers and impacts translation initiation (a key, highly regulated step) and cell growth. This topic is reviewed by Hernández et al., with a focus on the development of pharmacological inhibitors of translation initiation as a potential treatment for prostate cancer. Translational output can also be affected by mutations in the sequence of a transcript, and a review article by Robert and Pelletier discusses how single nucleotide polymorphisms (SNPs) in regulatory elements of an mRNA (5′UTR, 3′UTR, uORF, miRNA-binding site, etc.) can impact its translation.

RNA molecules are also at the forefront of human action against infectious diseases. Efficient innate immune responses to bacterial, protozoan, fungal, and viral pathogens are largely dependent on a delicate balance between transcriptional and post-transcriptional regulation of genes encoding pro- and anti-inflammatory mediators. Ostareck and Ostareck-Lederer describe in their review key RNA-binding proteins (i.e., HuR, TTP, hnRNP K, and TIA-1/TIAR) that coordinate macrophage inflammatory responses to the Gram-negative bacterial endotoxin lipopolysaccharide by controlling turnover and translation of immune-related transcripts. Non-coding RNAs from bacteria such as ribozymes, riboswitches, and CRISPR-Cas9 systems are being developed as potential antimicrobials to curb acquired multi-drug resistance in pathogens without harming beneficial microbiota, as reviewed by Di Noto et al. Finally, the current COVID-19 pandemic reminds us that RNA viruses, such as coronaviruses and flaviviruses, remain among the most formidable challenges faced by today's world health system. Flaviviruses (such as dengue, Zika, West Nile, and yellow fever viruses) and the fates of the flavivirus RNA genome are the topic of a review by Mazeaud et al.

The last two review articles focus on RNA regulation in the nervous system, where eIF4E-dependent translational control plays a major role in regulating the brain response to pain and the development of chronic pain diseases (Uttam et al.). Last but not least, RNA dysregulation is a key contributor in several neurodegenerative disorders, such as amyotrophic lateral sclerosis (ALS), frontotemporal degeneration (FTD), and microsatellite expansion disorders such as Fragile X syndrome. A review by Butti and Patten describes how major genes mutated in ALS, such as SOD1, TARDBP, FUS, and C9orf72, are all involved in various aspects of RNA metabolism.

As the field of RNA biology advances very rapidly and the integrative analysis of global-scale RNA-protein interactions continues to evolve, novel examples of sophisticated regulatory mechanisms of RNA metabolism will emerge and thereby improve fundamental knowledge of cellular and organismal physiology. We may thus begin to comprehend how widespread, yet selective, changes in transcriptional and translational programs underlie normal biological rhythms and adaptations to the changing environment. A better understanding of the role of dysregulated RNA metabolism in disease pathogenesis will be instrumental in designing targeted RNA-based therapeutics to combat morbidity and mortality related to pathological conditions that affect millions of people around the world.

### AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

### FUNDING

The authors acknowledge funding from the Canadian Institutes of Health Research, CIHR (MOP-130325), and a Fonds de recherche du Québec-Santé, FRQS, Research Chair to PC; CIHR (MOP-166017) and the Natural Sciences and Engineering Research Council of Canada, NSERC (RGPIN-2019-06671), to MJ; Mathematics of Information Technology and Complex Systems, Mitacs (IT10214), and a Concordia University CUPFA Professional Development Grant to CG.

**Conflict of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2020 Chartrand, Jaramillo and Gamberi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# mTOR Signaling, Translational Control, and the Circadian Clock

#### Ruifeng Cao<sup>1,2</sup>\*

<sup>1</sup> Department of Biomedical Sciences, University of Minnesota Medical School, Duluth, MN, United States, <sup>2</sup> Department of Neuroscience, University of Minnesota Medical School, Minneapolis, MN, United States

Almost all cellular processes are regulated by the approximately 24 h rhythms that are endogenously driven by the circadian clock. mRNA translation, as the most energy consuming step in gene expression, is temporally controlled by circadian rhythms. Recent research has uncovered key mechanisms of translational control that are orchestrated by circadian rhythmicity and in turn feed back to the clock machinery to maintain robustness and accuracy of circadian timekeeping. Here I review recent progress in our understanding of translation control mechanisms in the circadian clock, focusing on a role for the mammalian/mechanistic target of rapamycin (mTOR) signaling pathway in modulating entrainment, synchronization and autonomous oscillation of circadian clocks. I also discuss the relevance of circadian mTOR functions in disease.

#### Edited by:

Maritza Jaramillo, Institut National de la Recherche Scientifique (INRS), Canada

#### Reviewed by:

Yoshihiro Shimizu, RIKEN, Japan; Fabrizio Loreni, Università degli Studi di Roma "Tor Vergata", Italy

\*Correspondence:

Ruifeng Cao rcao@umn.edu

#### Specialty section:

This article was submitted to RNA, a section of the journal Frontiers in Genetics

Received: 25 June 2018; Accepted: 22 August 2018; Published: 10 September 2018

#### Citation:

Cao R (2018) mTOR Signaling, Translational Control, and the Circadian Clock. Front. Genet. 9:367. doi: 10.3389/fgene.2018.00367

Keywords: mRNA, translational control, circadian clock, mTOR, entrainment, synchronization, oscillation, SCN

## INTRODUCTION

As explained by the central dogma of molecular biology, genetic information flows from DNA to RNA to make a functional product, a protein. Protein synthesis accounts for ∼75% of a cell's total energy consumption and is highly regulated in cells (Lane and Martin, 2010). Translational control (regulation of protein synthesis) plays a significant role in the regulation of gene expression under physiological conditions. Deregulation of translational control is frequently involved in the pathophysiology of human diseases, including cancer, metabolic syndromes, and neurological disorders (Sonenberg and Hinnebusch, 2007; Hershey et al., 2012).

Circadian (∼24 h) rhythmicity is an autonomous biological property that controls a variety of biochemical, physiological, and behavioral processes in all living organisms (Hall and Rosbash, 1993; Young, 1998; Takahashi et al., 2008). The rhythmic processes are driven by autonomous oscillations of "clock genes" in cells. Whereas a significant role for protein synthesis in the circadian clock was found half a century ago (Feldman, 1967, 1968), novel mechanisms of mRNA translation control have been discovered in recent years. Some of these findings have been summarized in three reviews (Lim and Allada, 2013b; Green, 2018; Torres et al., 2018). Here I discuss the latest progress in our understanding of translational control mechanisms in the circadian clock, focusing on a critical role for the mammalian/mechanistic target of rapamycin (mTOR) signaling pathway.

### CIRCADIAN RHYTHMS AND CIRCADIAN CLOCKS

Circadian rhythms are endogenously driven by proteins called "circadian clocks" that oscillate in either their physical levels or functional states on a daily basis. This fundamental property enables organisms to temporally coordinate their physiology and behavior according to changes in daily light/darkness cycles, food availability, temperature, moisture, and air pressure in the environment (Rosbash, 2009). Thus, organisms can predict and prepare for upcoming environmental changes to meet their physiological needs (Reppert and Weaver, 2002). Rhythmic physiological and metabolic processes are normally coupled and synchronized to the environmental cycles so that optimal physiological and metabolic efficiencies can be attained at the right time of day.

The circadian system is hierarchically organized. In Drosophila, the central clock cells are the large and small lateral ventral neurons (l-LNvs and s-LNvs) of the optic lobe (Dubowy and Sehgal, 2017), which synthesize pigment dispersing factor (PDF) as a circadian neuromodulator among clock neurons. In mammals, the suprachiasmatic nucleus (SCN) of the anterior hypothalamus is the master pacemaker (Moore, 2013). The SCN receives photic input from the retina, generates robust circadian rhythms, and sends out neural and endocrine signals as rhythmic outputs to various brain regions as well as peripheral organs and systems. In the body, functions of the autonomic nervous system and the endocrine and immune systems are all regulated by the SCN (Mohawk et al., 2012; Gamble et al., 2014). Clock genes are expressed in almost all cells and tissues, and almost all cell types can sustain circadian oscillations, although with different robustness, accuracy, and period (Liu et al., 2007). Thus, rhythms in various organs and systems need to be orchestrated by the master pacemaker and synchronized to the environmental light/dark cycles (Aton and Herzog, 2005; Golombek and Rosenstein, 2010).

In cells, circadian oscillations are driven by autonomous genetic feedback loops. Work over the past three decades has identified evolutionarily conserved transcriptional/translational feedback loops (TTFLs) and about a dozen genes that account for cellular circadian oscillations (Hall and Rosbash, 1993; Young, 1998; Takahashi, 2017). In mammals, heterodimers of the transcription factors CLOCK and BMAL1 activate transcription of the Period (Per) and Cryptochrome (Cry) genes. PER and CRY proteins form multiprotein complexes. Once the complexes accumulate to certain levels in the cytosol, they translocate back to the cell nucleus, associate with CLOCK/BMAL1 heterodimers, and repress Per and Cry gene transcription (Takahashi et al., 2008).
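To make the logic of such a loop concrete, the following Python sketch integrates a deliberately simplified, single-variable delayed negative feedback equation, in which a repressor (standing in for the PER/CRY complex) inhibits its own production after a fixed delay. This is an illustration only, not a model from the studies cited above: the function names, the Hill-type repression term, and all parameter values are hypothetical and were chosen solely so that the toy system oscillates.

```python
def simulate_delayed_feedback(hours=480.0, dt=0.05, delay_h=8.0,
                              beta=0.4, gamma=0.2, k=1.0, hill=6):
    """Euler integration of dP/dt = beta / (1 + (P(t - delay)/k)**hill) - gamma * P.

    P is a generic repressor level (arbitrary units); every parameter
    value here is an illustrative assumption, not a measured quantity.
    """
    n_steps = int(round(hours / dt))
    lag = int(round(delay_h / dt))
    p = [0.1] * (lag + 1)               # constant history for t in [-delay, 0]
    for _ in range(n_steps):
        delayed = p[-lag - 1]           # repressor level delay_h hours earlier
        production = beta / (1.0 + (delayed / k) ** hill)
        p.append(p[-1] + dt * (production - gamma * p[-1]))
    return p[lag:]                      # trajectory from t = 0 onward


def count_upward_crossings(series, level):
    """Number of times the series rises through `level` (one per cycle)."""
    return sum(1 for a, b in zip(series, series[1:]) if a < level <= b)


trace = simulate_delayed_feedback()
late = trace[len(trace) // 2:]          # discard the initial transient
cycles = count_upward_crossings(late, sum(late) / len(late))
print("cycles in the second half of the run:", cycles)
```

In this sketch the delay stands in for the time needed for transcription, translation, complex formation, and nuclear re-entry. Shortening `delay_h` or lowering `hill` far enough lets the system settle to a steady state instead of cycling, which is one informal way to see why delayed, cooperative repression is used to explain self-sustained rhythms.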

Per gene expression functions as a "knob" of the clock and is tightly regulated by intracellular and extracellular signals via complex mechanisms. Firstly, rhythmic Per transcription is activated by CLOCK:BMAL1 complexes through E-box enhancers in the promoter region. Secondly, at the post-transcriptional level, Per mRNA processing is regulated by methylation (Fustin et al., 2013). Thirdly, as the degradation rate of PER proteins is also a key determinant of the length of a circadian cycle, PER cycling is controlled by sophisticated post-translational modifications such as phosphorylation (Kloss et al., 1998; Lowrey et al., 2000; Lee et al., 2001; Meng et al., 2008; Chiu et al., 2011) and ubiquitination (Busino et al., 2007; Siepka et al., 2007; Hirano et al., 2013; Yoo et al., 2013).

Recent work has started to uncover a key role for translational control in regulating clock gene expression. In Drosophila, the RNA-binding protein Ataxin-2 (Atx2) interacts with Twenty-four (Tyf) to activate Per mRNA translation in pacemaker neurons and sustain the robustness of circadian behavioral rhythms (Lim et al., 2011; Lim and Allada, 2013a; Zhang et al., 2013). A targeted RNAi screen revealed that knockdown of the atypical translation factor NAT1 lengthens the circadian period and reduces PER protein levels in PDF neurons (Bradley et al., 2012). In mice, we showed that MAPK-interacting protein kinases (MNKs), downstream targets of the mitogen-activated protein kinase (MAPK)/extracellular signal-regulated kinase (ERK) pathway, phosphorylate the cap-binding protein eIF4E in the SCN. The MAPK/MNK/eIF4E pathway is activated upon light exposure at night. Phosphorylation of eIF4E stimulates Per1 and Per2 mRNA translation and facilitates photic entrainment of the SCN circadian clock (Cao et al., 2015). Besides these mechanisms, another emerging and more complex translational control pathway is mTOR signaling.

### mTOR SIGNALING

mTOR is an evolutionarily conserved serine/threonine protein kinase, also known as FK506-binding protein 12-rapamycin-associated protein 1 (FRAP1). mTOR forms two multiprotein complexes in cells, mTOR complex 1 (mTORC1) and mTORC2. mTORC1 and mTORC2 share some protein components, including mTOR, mLST8 (mammalian lethal with sec13 protein 8, also known as GβL), and DEPTOR (the inhibitory DEP domain-containing mTOR-interacting protein). mTORC1 also includes Raptor (the regulatory-associated protein of the mammalian target of rapamycin) and PRAS40 (proline-rich Akt substrate of 40 kDa). Raptor interacts with TOS (target of rapamycin signaling) motifs and with mTOR in a rapamycin-sensitive manner and is essential for mTORC1 activity. mTORC2 additionally contains Rictor (the rapamycin-insensitive companion of mTOR), mSIN1 (mammalian stress-activated MAP kinase-interacting protein 1), and PROTOR 1 and 2 (proteins observed with Rictor 1 and 2). Rictor and mSIN1 are both critical for mTORC2 function (Lipton and Sahin, 2014; González and Hall, 2017; Saxton and Sabatini, 2017).

mTOR signaling refers to an intracellular signaling network centered on mTORC1 and mTORC2. It senses intracellular signals and also responds to extracellular stimuli. It can be activated by upstream signals including growth factors (e.g., insulin and insulin-like growth factor-1), energy status (e.g., oxygen and ATP levels), nutrients (e.g., leucine and arginine), and neurotransmitters (e.g., glutamate and neuropeptides). Growth factors and mitogens inhibit the tuberous sclerosis complex (TSC), a key negative regulator of mTORC1. TSC is a GTPase-activating protein for the small GTPase Rheb, which directly binds and activates mTORC1. Once activated, mTOR signaling controls fundamental biological processes including protein synthesis and turnover, lipid and glucose metabolism, autophagy, and cytoskeleton organization (González and Hall, 2017; Saxton and Sabatini, 2017). mTORC1 has the best-defined role in translational control. It exhibits protein kinase activity and regulates mRNA translation through its translation effectors, which include the eukaryotic initiation factor 4E-binding proteins (4E-BPs) and ribosomal protein S6 kinases (S6K1 and S6K2) (Hay and Sonenberg, 2004).

#### mTOR AND TRANSLATIONAL CONTROL

In general, translational control can be achieved via two mechanisms: (1) acting on mRNAs through sequence-specific RNA-binding proteins or small non-coding RNAs such as microRNAs; (2) acting on the translational apparatus, which includes translation factors, ribosomes, and tRNAs. The latter predominantly affects the step of translation initiation (Hershey et al., 2012).

All nuclear-transcribed mRNAs are capped at their 5′ ends with 7-methylguanosine. Eukaryotic translation initiation factor 4E (eIF4E) is a cap-binding protein that recognizes and binds the mRNA 5′ cap, m<sup>7</sup>GpppN (where N is any nucleotide) (Hinnebusch et al., 2016). eIF4G is a scaffolding protein that associates with eIF4E and eIF4A. eIF4A is an RNA helicase that resolves mRNA secondary structures. The eIF4F complex (comprising eIF4E, eIF4G, and eIF4A) interacts with eIF3 to recruit the small ribosomal subunit and initiate cap-dependent translation. The eIF4E-binding proteins (4E-BPs) control eIF4E binding to the cap structure. Binding of 4E-BPs to eIF4E represses cap-dependent translation initiation, and this repression is relieved by mTORC1-mediated phosphorylation of 4E-BPs (Gingras et al., 1999). Activated by various extracellular and intracellular cues, mTORC1 phosphorylates 4E-BPs, leading to their dissociation from eIF4E (Brunn et al., 1997; Gingras et al., 1999), which allows cap-dependent mRNA translation to initiate. Thus, mTORC1 regulates cap-dependent translation via 4E-BPs.
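The quantitative intuition behind this switch can be sketched with a simple 1:1 binding equilibrium. The Python snippet below assumes, as a simplification, that only unphosphorylated 4E-BP binds eIF4E and that phosphorylated 4E-BP does not bind at all; the function name, concentrations, and dissociation constant are hypothetical placeholders rather than measured values.

```python
import math


def free_eif4e(e_total, bp_total, phospho_fraction, kd):
    """Free eIF4E at equilibrium for 1:1 binding to unphosphorylated 4E-BP.

    e_total, bp_total and kd share the same (arbitrary) concentration unit.
    phospho_fraction is the fraction of 4E-BP assumed to be phosphorylated
    and therefore unable to bind eIF4E (a simplifying assumption).
    """
    if not 0.0 <= phospho_fraction <= 1.0:
        raise ValueError("phospho_fraction must lie in [0, 1]")
    bp_active = bp_total * (1.0 - phospho_fraction)   # binding-competent 4E-BP
    s = e_total + bp_active + kd
    # Physically meaningful root of C**2 - s*C + e_total*bp_active = 0,
    # where C is the concentration of the eIF4E:4E-BP complex.
    bound = (s - math.sqrt(s * s - 4.0 * e_total * bp_active)) / 2.0
    return e_total - bound


for phi in (0.0, 0.5, 1.0):
    print(f"phosphorylated fraction {phi:.1f}: "
          f"free eIF4E = {free_eif4e(1.0, 2.0, phi, kd=0.01):.3f}")
```

With tight binding and an excess of 4E-BP, free eIF4E stays low until a large share of 4E-BP is phosphorylated, which is one way to picture how mTORC1 activity could gate cap-dependent initiation. Real cells add further layers, such as multisite phosphorylation and competition with eIF4G, that this sketch leaves out.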

As another major branch of mTORC1 signaling, S6K1 is activated by phosphorylation on its hydrophobic motif site, Thr389. S6K1 in turn phosphorylates a number of proteins that control mRNA translation. It phosphorylates eukaryotic translation initiation factor 4B (eIF4B), a cofactor of eIF4A that increases its processivity, at S422 (Holz et al., 2005). S6K1 also phosphorylates PDCD4 (programmed cell death 4), an inhibitor of eIF4A, and promotes its degradation (Dorrello et al., 2006). In addition, S6K1 enhances the translation efficiency of spliced mRNAs via its interaction with SKAR (S6K1 Aly/REF-like target), a component of exon-junction complexes involved in mRNA splicing (Ma et al., 2008). Finally, S6K1 phosphorylates eukaryotic elongation factor-2 kinase (eEF2K), a negative regulator of eukaryotic elongation factor 2 (eEF2), at S366 and thereby inactivates it (Wang et al., 2001; Knebel et al., 2002), thus regulating translation elongation.

#### mTOR AND THE CIRCADIAN CLOCK

Intracellular signal transduction pathways control circadian timing and entrainment by regulating clock gene expression at different levels (Gillette and Mitchell, 2002; Evans, 2016). In general, a signaling pathway that is important for the circadian clock is itself regulated by circadian rhythmicity and therefore exhibits rhythmic activity under constant conditions. Moreover, the signaling pathway is often regulated by extracellular signals and couples these signals to circadian gene expression. Thus, the feedback loops within the clock are coupled to a feedforward loop involving the environmental cues, the signaling pathway, and the clock. The mTOR pathway is a typical example of a signaling pathway that couples environmental cues to clock cells and their network. In this section, I discuss the interactions between the mTOR pathway and the circadian clock.

### mTOR Regulation of the Circadian Clock

Work over the past decade has started to uncover a multifaceted role of mTOR in the circadian clock. Firstly, mTOR signaling is part of the photic entrainment pathway in the SCN; secondly, mTOR regulates autonomous clock properties in a variety of circadian oscillators; thirdly, mTOR regulates network properties of coupled circadian oscillators, such as the SCN neurons.

#### Regulation of Photic Entrainment of the SCN Circadian Clock by mTORC1

To adapt to the changing environment, the circadian clock is constantly adjusted by environmental signals. Light is the most important cue regulating the SCN clock. Photic input is received by the retina and relayed to the SCN via the retinohypothalamic tract (RHT). This pathway is distinct from the image-forming visual pathway in that the reception of light is mediated by intrinsically photosensitive retinal ganglion cells (ipRGCs), which express the photopigment melanopsin (Peirson and Foster, 2006; Panda, 2007). RHT terminals form synaptic connections with the ventral SCN neurons. The excitatory neurotransmitter glutamate and the neuropeptide pituitary adenylate cyclase-activating peptide (PACAP) are released at the RHT terminals (Hannibal, 2002) upon photic stimulation at night. They in turn bind to their receptors on the SCN neurons and evoke intracellular signaling events that regulate clock gene expression and trigger clock resetting (Golombek and Rosenstein, 2010). As mentioned above, rhythmic clock gene expression in cells is driven by transcriptional/translational feedback loops. A major negative feedback loop is composed of CLOCK/BMAL1-driven rhythmic Per and Cry expression. Per and Cry levels are high during the day and low at night. Light at night triggers transient upregulation of Per and Cry expression, which shifts the phase of cyclic gene expression and resets the SCN clock. Light in the early night delays the clock, whereas light in the late night advances it.

In searching for intracellular signaling pathways that mediate photic entrainment of the SCN clock, it was found that light at night activates S6K1 by inducing its phosphorylation at Thr389. In turn, activated S6K1 phosphorylates its downstream translation effectors, including the ribosomal protein S6 (S6), a component of the 40S ribosomal subunit. S6K1 activation and S6 phosphorylation often correlate with translation efficiency of a subset of mRNAs which have a 5′-terminal oligopyrimidine (TOP) tract (Meyuhas and Dreazen, 2009), although evidence exists that neither S6K1 nor S6 phosphorylation is required for the translational response of these mRNAs (Stolovich et al., 2002). In the SCN, protein products of TOP mRNAs such as eEF1A (eukaryotic elongation factor 1A) and Jun B are light-inducible in a rapamycin-sensitive manner (Cao et al., 2010). Light also increases phosphorylation of 4E-BP1 at Thr37/46 in the SCN (Cao et al., 2008). Phosphorylation of 4E-BP1 triggers its dissociation from eIF4E and activates cap-dependent translation. These activities are mTORC1-dependent, as rapamycin blocks light-induced S6K1 and 4E-BP1 phosphorylation. Light-induced mTORC1 activation appears to be important for photic entrainment of the SCN clock, as rapamycin modulates light-induced phase shifts of wheel-running and body temperature rhythms in mice (Cao et al., 2010). The effects of rapamycin on behavioral phase shifts are consistent with its inhibition of light-induced PER1 and PER2 proteins in the SCN. Together, these results demonstrate that mTORC1 signaling is an integral part of the photic entrainment pathway that regulates light-inducible mRNA translation in the SCN, although the precise translational control mechanisms via S6K1 remain to be delineated.

During photic entrainment, eIF4E is a pivotal point where the MAPK and mTORC1 pathways cross to control mRNA translation in the SCN. As aforementioned, the MAPK pathway activates MNKs, which in turn phosphorylate eIF4E in the photo-recipient SCN cells and facilitate light-induced Per1 and Per2 mRNA translation. mTORC1 phosphorylates and inhibits the eIF4E repressor protein 4E-BP1. 4E-BP1 represses the eIF4E-dependent translation of Vip. Thus, mTORC1 activation disinhibits Vip mRNA translation and increases the abundance of VIP (also see section "Regulation of Synchronization of SCN Neurons by mTOR"). The light-regulated mTORC1 and MAPK pathways are summarized in **Figure 1**.

#### Regulation of Synchronization of SCN Neurons by mTOR

As the master pacemaker in mammals, the SCN has unique anatomical and physiological features that enable it to generate accurate and robust rhythms. One such feature is the coupling mechanism among SCN neurons. SCN neurons are heterogeneous in their expression of neuropeptides, pacemaking ability, response to light, and periods of their firing rhythms (Welsh et al., 1995, 2010; Herzog et al., 1998; Shirakawa et al., 2000; Aton and Herzog, 2005). In general, the ventral SCN neurons express VIP (vasoactive intestinal peptide), and the dorsal SCN neurons express AVP (arginine vasopressin). Some cells in between express GRP (gastrin releasing peptide). The ventral SCN cells receive photic input from the RHT and are directly entrained by light. In turn, these neurons send output to the dorsal SCN neurons and reset their rhythms. To produce a coherent daily output, the SCN cells must entrain to each other. SCN intercellular coupling is essential for synchrony among cellular oscillators and robustness against genetic or environmental perturbations (Liu et al., 2007). Studies over the past decade have found that VIP signaling is particularly important for coupling SCN neurons.

Vasoactive intestinal peptide is a peptide of 28 amino acids and a ligand of G protein-coupled receptors (Gozes and Brenneman, 1993). Vip expression is enriched in the SCN cells and is also found in a subset of GABAergic neurons in the neocortex, olfactory bulb, some midbrain and brainstem regions as well as the gut and the pancreas (Liu et al., 2018). Through its receptor VPAC2R (encoded by the Vipr2 gene), VIP signaling is essential for synchrony between ventral and dorsal SCN cells. Loss of VIP or VPAC2R leads to unstable, low amplitude circadian cycling in individual SCN cells and weak rhythms or arrhythmicity in SCN slices and animals (Harmar et al., 2002; Colwell et al., 2003; Cutler et al., 2003; Aton et al., 2005; Maywood et al., 2006). The direct protein product of the Vip gene is prepro-VIP, a 170-amino acid peptide. How Vip mRNA translation is regulated was previously unknown.

FIGURE 1 | Schematic overview of light-regulated translational control pathways in the SCN. Light at night activates mTORC1, which in turn regulates its translational effectors ribosomal protein S6 kinases (S6Ks, including S6K1 and S6K2) and eukaryotic translation initiation factor 4E-binding proteins (4E-BPs). 4E-BP phosphorylation leads to its dissociation from eIF4E and activation of cap-dependent translational initiation. Vip (vasoactive intestinal peptide) mRNA translation is regulated by 4E-BPs. In another pathway, photic ERK/MAPK activation leads to phosphorylation of eIF4E via MNK kinases and promotes mRNA translation of Per1 and Per2 in the SCN. Thus, the MAPK and mTOR pathways converge on eIF4E to regulate mRNA translation in the SCN.

As aforementioned, 4E-BPs are translational repressors and their activities are inhibited after phosphorylation by mTORC1. In the post-mitotic adult brain, cell growth and division are limited, and phosphorylation of 4E-BPs is low in a variety of brain regions, presumably because of the relatively moderate demand for protein synthesis. However, it is found that 4E-BPs are highly phosphorylated in the SCN (Cao and Obrietan, 2010), indicating a unique role for 4E-BPs in the SCN circadian clock. Indeed, it is found that 4E-BP1 specifically inhibits mRNA translation of Vip. By phosphorylating 4E-BP1, mTORC1 promotes Vip mRNA translation and increases the abundance of VIP in the SCN (Cao et al., 2013). In 4E-BP1 null mice, levels of prepro-VIP (precursor protein of VIP) and VIP are increased in the SCN. Consequently, these animals re-entrain to a shifted light/dark cycle more quickly and show resistance to the rhythm-disruptive effects of constant light. At the tissue level, the 4E-BP1 null SCN slices exhibit a shorter period and higher amplitude of PER2::LUCIFERASE (PER2::LUC) rhythms, consistent with enhanced coupling among SCN cells (Cao et al., 2013).

Conversely, in Mtor heterozygotes, prepro-VIP and VIP levels are decreased, and the PER2::LUC rhythms in the SCN are damped with a lengthened period (Cao et al., 2013; Ramanathan et al., 2018). These mice show a longer period under constant conditions and are more susceptible to the effects of constant light (Cao et al., 2013). To test whether mTOR regulates SCN cell synchrony, the mTORC1 inhibitor PP242 was applied and PER2::LUC bioluminescence imaging was performed on SCN slices. SCN synchrony is indeed disrupted by PP242 (Liu et al., 2018), consistent with its rhythm-damping effects. As mTOR inhibition decreases Vip expression, the effects of PP242 may be ascribed to decreased VIP levels in the SCN. To test whether mTOR regulates SCN synchrony through VIP neurons, conditional mTOR knockout mice were created, in which the mTOR gene was specifically knocked out in VIP cells (Liu et al., 2018). Indeed, these mice exhibit significant circadian defects, including weakened circadian behavioral rhythmicity under constant light, disrupted circadian behavior under a skeleton photoperiod, and decreased synchrony among SCN cells. These phenotypes largely resemble those seen in the Vip or VPAC2 null mice (Harmar et al., 2002; Colwell et al., 2003; Cutler et al., 2003; Aton et al., 2005; Maywood et al., 2006) as well as in rats treated with VIP antagonists (Gozes et al., 1995), suggesting that mTOR regulates circadian synchrony via VIP signaling. However, additional mechanisms cannot be excluded. For example, as mTOR regulates the amplitude of cellular circadian oscillators (Ramanathan et al., 2018; see section "Regulation of Autonomous Properties of the Circadian Clocks by mTOR"), decreased intercellular coupling could be due to attenuated cellular oscillations. Further studies are needed to identify additional mechanisms whereby mTOR signaling controls SCN synchrony.
**Figure 2** recapitulates the current model to explain how intracellular mTOR signaling can regulate intercellular coupling in the SCN via translational control of Vip.
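Synchrony among cellular oscillators, as discussed above, can be summarized by a single number. One generic measure is the Kuramoto order parameter R, the length of the mean unit vector of the cells' phases (R near 1 for tightly synchronized cells, R near 0 for phases spread around the cycle). The sketch below is an illustration under stated assumptions: it is not necessarily the analysis used in the cited imaging studies, and the phase values are hypothetical.

```python
import cmath
import math


def synchrony_index(phases_rad):
    """Kuramoto order parameter R in [0, 1] for a list of phases in radians."""
    if not phases_rad:
        raise ValueError("need at least one phase")
    # Represent each cell's phase as a unit vector in the complex plane
    # and take the length of the average vector.
    mean_vector = sum(cmath.exp(1j * p) for p in phases_rad) / len(phases_rad)
    return abs(mean_vector)


# Hypothetical phases: a tightly coupled population versus a dispersed one.
coupled_cells = [0.0, 0.1, -0.1, 0.05]
dispersed_cells = [0.0, math.pi / 2, math.pi, 3 * math.pi / 2]

if __name__ == "__main__":
    print("coupled  R =", round(synchrony_index(coupled_cells), 3))
    print("dispersed R =", round(synchrony_index(dispersed_cells), 3))
```

In practice the phases would first have to be extracted from each cell's bioluminescence trace, a step that is omitted here.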

#### Regulation of Autonomous Properties of the Circadian Clocks by mTOR

mTOR is a major integrator of intracellular signals that senses energy status (e.g., oxygen and ATP levels), and nutrients (e.g., leucine and arginine levels) to regulate cell growth and metabolism. Presumably it can serve as a linker between cellular metabolic states and the circadian timing process. A genome-wide RNAi screen in human cells identified hundreds of genes that regulate cellular clock functions (Zhang et al., 2009). The insulin signaling pathway was identified as the most overrepresented pathway. Downregulation of its multiple components such as PI3K and mTOR alters circadian period. Recently, the effects of mTOR manipulation on autonomous circadian clock properties were studied in various cellular and tissue oscillators (Ramanathan et al., 2018).

mTOR regulates fundamental clock properties (e.g., period and amplitude) in a variety of clock models. mTOR inhibition increases period and reduces amplitude, whereas activation of mTOR shortens period and augments amplitude in fibroblasts, hepatocytes, and adipocytes (Ramanathan et al., 2018). These results are consistent with studies showing dose-dependent lengthening of circadian period and damping of amplitude in human U2OS cells in response to rapamycin and torin1 treatments (Feeney et al., 2016; Lipton et al., 2017). Constitutive activation of mTOR in Tsc2<sup>−/−</sup> fibroblasts alters the dynamics of clock gene oscillations and elevates levels of core clock proteins, including CRY1, BMAL1, and CLOCK (Lipton et al., 2017; Ramanathan et al., 2018). However, serum stimulation upregulates CRY1 in an mTOR-dependent but Bmal1- and Period-independent manner (Ramanathan et al., 2018). Moreover, mTOR also regulates properties of the ex vivo SCN and liver clocks in a similar way (Ramanathan et al., 2018). Heterozygous mTor knockout mice show a lengthened circadian period of locomotor activity rhythms under constant conditions. Consistently, the 4E-BP1 knockout mice, where mTOR activity is increased, show a shortened circadian period (Cao et al., 2013). However, TOR modulates circadian period in the opposite direction in Drosophila. Overexpressing S6K in the ventral lateral neurons, the central pacemaker cells, lengthens the circadian period (Zheng and Sehgal, 2010). Consistently, another study reports that knockout of Tor in Per-expressing cells decreases the circadian period of locomotor rhythms in flies (Kijak and Pyza, 2017). The reasons for this discrepancy between mice and flies are not clear; it may reflect different clock mechanisms in these species.
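Period and amplitude, the two clock properties discussed above, can be read off a rhythmic reporter trace in a simple way. The sketch below estimates period as the mean peak-to-peak interval and amplitude as half the signal range. It is a deliberately minimal illustration: peak picking is an assumed simplification rather than the method of the cited studies, and the synthetic traces (24 h versus 26 h, amplitude 1.0 versus 0.5) use hypothetical values chosen only to mimic a control and an mTOR-inhibited condition.

```python
import math


def estimate_period_and_amplitude(times_h, signal):
    """Return (period, amplitude) from a sampled rhythmic trace.

    Period is the mean interval between local maxima; amplitude is half the
    difference between the largest and smallest sample.
    """
    peaks = [
        times_h[i]
        for i in range(1, len(signal) - 1)
        if signal[i - 1] < signal[i] >= signal[i + 1]
    ]
    if len(peaks) < 2:
        raise ValueError("need at least two peaks to estimate a period")
    intervals = [later - earlier for earlier, later in zip(peaks, peaks[1:])]
    period = sum(intervals) / len(intervals)
    amplitude = (max(signal) - min(signal)) / 2.0
    return period, amplitude


def synthetic_trace(period_h, amplitude, hours=120, step_h=0.5):
    """Noise-free cosine trace sampled every step_h hours (illustrative only)."""
    times = [i * step_h for i in range(int(hours / step_h) + 1)]
    values = [amplitude * math.cos(2 * math.pi * t / period_h) for t in times]
    return times, values


if __name__ == "__main__":
    for label, period_h, amp in (("control", 24.0, 1.0), ("inhibited", 26.0, 0.5)):
        t, y = synthetic_trace(period_h, amp)
        print(label, estimate_period_and_amplitude(t, y))
```

Real bioluminescence data are noisy and damped, so detrending and curve fitting would normally replace the naive peak detection used here.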

The circadian functions of mTOR in disease models are more intriguing. One study suggests that circadian rhythms of mTOR activities in cancer cells should be considered in chemotherapy. In the study by Okazaki et al. (2014), circadian mTOR activities were found in mouse renal carcinoma. The rhythmic mTOR activities affect the efficacy of everolimus, a rapalog mTOR inhibitor that is clinically applied to treat cancers of the kidney, pancreas, breast, and brain. The drug is more effective in improving survival of tumor-bearing mice if applied at the time of day when mTOR activities are elevated.

#### TABLE 1 | Circadian mTOR signaling in various tissues.

Studies also suggest that aberrant mTOR activities underlie circadian dysfunction under pathological conditions. In one study, mTOR is found to mediate the effects of circadian disruption caused by hypoxia, which is seen in many disease conditions such as cancer. When hypoxic cells are permitted to acidify to recapitulate the tumor microenvironment, the circadian clock is impacted through the transcriptional activities of hypoxia-inducible factors (HIFs) at clock genes. Acidification of cells suppresses mTORC1 signaling and restoring mTORC1 signaling rescues clock oscillation (Walton et al., 2018). In another study, Lipton et al. (2017) investigated circadian rhythms in a mouse model of TSC. In the TSC mice mTOR activities are constitutively elevated. They find that Tsc-deficient mice demonstrate shorter wheel-running period and disrupted core body temperature rhythms in constant darkness. Mechanistically, translation of Bmal1 mRNA is increased and BMAL1 protein degradation is decreased, both of which lead to increased BMAL1 protein level and abnormal clock functions in the TSC tissues. Interestingly, reducing the dose of Bmal1 genetically rescues circadian behavioral phenotypes in the TSC mouse models. The results, together with the findings of mTOR regulation of physiological clock properties, support a significant role for mTOR in circadian timekeeping under normal conditions as well as in mediating circadian dysfunction under disease conditions.

#### Circadian Regulation of mTOR Activities and mRNA Translation

As is the case with many circadian clock-regulated signaling pathways, mTOR activities are regulated by the circadian clock and in turn the rhythmic mTOR activities reinforce the clock function. Indeed, one of the most prominent features of mTOR signaling is the temporal regulation of its activities by the circadian clock. Since mTOR was first studied in the SCN clock a decade ago, dozens of studies have identified circadian mTOR activities in different cells, tissues and organisms. First of all, mTORC1 activities exhibit robust circadian oscillations in the SCN under constant conditions, as indicated by rhythmic S6 and 4E-BP1 phosphorylation (Cao et al., 2011, 2013). In the mouse brain, mTORC1 activities also exhibit daily oscillations in the arcuate nucleus, hippocampus as well as the frontal cortex (Khapre et al., 2014; Saraf et al., 2014; Albert et al., 2015). These brain regions are important for circadian rhythms, feeding, learning, memory, and emotions. In Drosophila, TOR rhythms are also found in the brain and in particular the ventral lateral neurons (Zheng and Sehgal, 2010; Kijak and Pyza, 2017). In peripheral tissues, mTOR activities are rhythmic in the liver, cardiac and skeletal muscles, adipocytes, and retinal photoreceptors (Huang et al., 2013; Jouffe et al., 2013; Shavlakadze et al., 2013; Khapre et al., 2014; Drägert et al., 2015a,b; Lipton et al., 2015, 2017; Chang et al., 2016). Interestingly, mTOR also shows circadian rhythms in human osteosarcomas, mouse renal carcinomas as well as human breast cancer cells (Zhang et al., 2009, 2018; Okazaki et al., 2014). These circadian mTOR studies are summarized in **Table 1**. It remains elusive, however, what mechanisms drive rhythmic mTOR activities in different tissues and cells. In the SCN, cellular S6K1 activity levels correlate with cellular Per1 but not Per2 transcription, for reasons that remain unknown (Cao et al., 2011).

Several mechanistic studies have highlighted a role for mTOR signaling as an output pathway which links circadian rhythmicity to mRNA translation. Jouffe et al. (2013) investigated circadian coordination of mRNA translation in the mouse liver. They identified rhythmic activation of a number of translational control signaling pathways, including the mTORC1 pathway and the ERK/MAPK pathway. They found that the circadian clock influences the temporal translation of a subset of mRNAs that are mainly involved in ribosome biogenesis. The circadian clock also controls the transcription of ribosomal protein mRNAs and ribosomal RNAs. Together these data demonstrate that the circadian clock exerts its function by temporal translation of a subset of mRNAs that are involved in ribosome biogenesis. In another study by the same group, they found that Bmal1 deletion affects both transcriptional and post-transcriptional levels of rhythmic output. Translation efficiencies of genes with 5′-terminal oligopyrimidine tract (5′-TOP) sequences and genes involved in mitochondrial activity (many of which harbor a Translation Initiator of Short 5′-UTR motif) are differentially regulated during the diurnal cycle (Atger et al., 2015).

Lipton et al. (2015) made the surprising finding that the canonical clock protein BMAL1 also functions as a translation factor by associating with the translational machinery and promoting protein synthesis. Interestingly, the translational activity of BMAL1 is regulated by rhythmic phosphorylation at Ser42 by the mTORC1/S6K1 pathway. S6K1-mediated phosphorylation is critical for BMAL1 stimulation of protein synthesis. Thus, these results demonstrate that the mTORC1/S6K1 pathway links circadian timing to rhythmic translation via BMAL1. In this way, the transcriptional feedback loop in the circadian clock is coupled to a translational regulatory loop mediated by the mTOR pathway. As translational control is involved in a number of cellular processes, this mechanism is potentially important for understanding circadian regulation of many biological processes.

#### SUMMARY

mRNA translation is subject to complex regulatory mechanisms. Among these, temporal regulation of mRNA translation occurs on a daily basis in various tissues, coordinated by the circadian clock and its output signaling pathways such as mTOR signaling. In turn, rhythmic mTOR signaling and mRNA translation feed back to the clock machinery and regulate important clock functions, including its timing, its response to entrainment cues, and the network properties among circadian oscillators. Deregulation of translational control is linked to circadian clock dysfunction, as seen in the TSC and hypoxia mouse models. Knowledge of mTOR and translational control in the circadian clock is not only essential for understanding the basic clockwork mechanisms, but could also provide insights into mechanistic links between circadian dysfunctions and human diseases so that therapeutic strategies can be developed for these disorders.

#### AUTHOR CONTRIBUTIONS

The author confirms being the sole contributor of this work and approved it for publication.

#### REFERENCES

circadian clock. Mol. Cell. Neurosci. 38, 312–324. doi: 10.1016/j.mcn.2008.03.005

gene expression and increases blood pressure. Hypertension 66, 332–339. doi: 10.1161/HYPERTENSIONAHA.115.05398

pathway but requires neither S6K1 nor rpS6 phosphorylation. Mol. Cell. Biol. 22, 8101–8113. doi: 10.1128/MCB.22.23.8101-8113.2002


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Cao. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.


# Lost in Translation: Ribosome-Associated mRNA and Protein Quality Controls

#### Andrey L. Karamyshev<sup>1</sup> \* and Zemfira N. Karamysheva<sup>2</sup> \*

<sup>1</sup> Department of Cell Biology and Biochemistry, Texas Tech University Health Sciences Center, Lubbock, TX, United States, <sup>2</sup> Department of Biological Sciences, Texas Tech University, Lubbock, TX, United States

Aberrant, misfolded, and mislocalized proteins are often toxic to cells and result in many human diseases. All proteins and their mRNA templates are subject to quality control. There are several distinct mechanisms that control the quality of mRNAs and proteins during translation at the ribosome. The mRNA quality control systems nonsense-mediated decay, non-stop decay, and no-go decay detect premature stop codons, the absence of a natural stop codon, and stalled ribosomes in translation, respectively, and degrade the affected mRNAs. Defective truncated polypeptide nascent chains generated from faulty mRNAs are degraded by ribosome-associated protein quality control pathways. Regulation of aberrant protein production, a novel pathway, senses aberrant proteins by monitoring the status of nascent chain interactions during translation and triggers degradation of their mRNA. Here, we review current progress in understanding the molecular mechanisms of mRNA and protein quality control at the ribosome during translation.

#### Edited by:

Maritza Jaramillo, National Institute of Scientific Research (INRS), Canada

#### Reviewed by:

Tohru Yoshihisa, University of Hyogo, Japan Woan-Yuh Tarn, Academia Sinica, Taiwan

#### \*Correspondence:

Andrey L. Karamyshev andrey.karamyshev@ttuhsc.edu Zemfira N. Karamysheva zemfira.karamysheva@ttu.edu

#### Specialty section:

This article was submitted to RNA, a section of the journal Frontiers in Genetics

Received: 16 July 2018 Accepted: 11 September 2018 Published: 04 October 2018

#### Citation:

Karamyshev AL and Karamysheva ZN (2018) Lost in Translation: Ribosome-Associated mRNA and Protein Quality Controls. Front. Genet. 9:431. doi: 10.3389/fgene.2018.00431

Keywords: RNA quality control, protein quality control, post-transcriptional regulation of gene expression, RNA stability, RNA degradation, translation, protein targeting and folding, ribosome

### INTRODUCTION

Genetic information is transferred during transcription and translation into correctly folded active proteins that are localized at the proper places for their functioning. Despite the high fidelity of these mechanisms, defective proteins can be produced as a result of mutations, mistakes in transcription and translation, stress, or other reasons. Cellular quality control pathways evolved to prevent synthesis of aberrant proteins at the ribosome or to degrade them if they are already synthesized (**Figure 1**). Many quality control systems are engaged cotranslationally and conduct mRNA and protein surveillance at the ribosome. Protein synthesis and degradation of defective proteins are energetically expensive processes, and ribosome-associated quality control can prevent futile aberrant protein synthesis. Nonsense-mediated decay (NMD), no-go decay (NGD), and non-stop decay (NSD) recognize and eliminate mRNAs with premature termination codons (PTCs), truncated mRNAs and mRNAs stalled in translation, and mRNAs without natural stop codons, respectively (Welch and Jacobson, 1999; Doma and Parker, 2007; Shoemaker and Green, 2012; Popp and Maquat, 2013). Truncated polypeptides produced at stalled ribosomes are ubiquitinated and degraded by the proteasome (Dimitrova et al., 2009; Bengtson and Joazeiro, 2010; Brandman et al., 2012; Duttler et al., 2013; Brandman and Hegde, 2016). The regulation of aberrant protein production (RAPP) pathway senses aberrant proteins by scanning the status of nascent chain interactions during translation and triggers degradation of their mRNAs (Karamyshev et al., 2014; Pinarbasi et al., 2018). When defective mRNAs and proteins are missed by these quality control systems, the aberrant proteins are degraded by proteolytic machinery in the cytosol (Heck et al., 2010), or in the endoplasmic reticulum (ER) by the ER-associated degradation (ERAD) pathway (Brodsky and Wojcikiewicz, 2009). If aberrant proteins escape quality control, they may misfold, form insoluble aggregates or amyloids, and result in many human diseases (Stefani and Dobson, 2003; Gregersen et al., 2006; Zimmermann et al., 2006; Hebert and Molinari, 2007; Jarjanazi et al., 2008; Hipp et al., 2014).

The interactions of a polypeptide nascent chain during translation have a crucial role in protein biogenesis and quality control (Gandin and Topisirovic, 2014). These interactions determine the future localization of the proteins, their folding and modifications (Pechmann et al., 2013). Disruptions of these processes may serve as signals for quality control machinery and for detection of abnormal mRNAs/proteins. In this review, we analyze nascent chain interactions occurring at the ribosome and the events taking place during ribosome-associated mRNA and protein quality controls.

### NASCENT CHAIN INTERACTIONS DURING TRANSLATION ARE IMPORTANT FOR PROTEIN TARGETING AND FOLDING

Protein targeting, transport, and folding occur cotranslationally or posttranslationally (Park and Rapoport, 2012; Ellgaard et al., 2016). In this review, we focus only on co-translational protein interactions. During the first steps of translation, polypeptides exposed from the ribosomal exit tunnel start their first interactions with different factors required for folding, modification, targeting, and transport. Loss of these interactions leads to improper folding and protein degradation, protein aggregation and the formation of amyloids, or mRNA elimination (**Figure 1**). All living cells have different compartments and proteins should be precisely delivered to the proper locations in the cells. While cytosolic proteins remain in the cytosol after completing their synthesis, other proteins are transported to different cellular organelles or outside of the cell.

Despite major differences between prokaryotic and eukaryotic cells, protein targeting and transport are regulated by similar mechanisms. Proteins possess specific localization signals that are recognized by specialized proteins (Emanuelsson and von Heijne, 2001). These interactions are essential for protein targeting. The best studied localization signals so far are signal sequences (von Heijne, 1985, 1990; Nilsson et al., 2015). Secretory proteins are synthesized as precursors containing an N-terminal extension called a signal sequence or signal peptide. Signal sequences are responsible for directing secretory proteins to the Sec61 translocon in the ER membrane (in eukaryotes) or to the SecYEG complex in the bacterial plasma membrane (in prokaryotes) for translocation through the membranes (Alder and Johnson, 2004; Wild et al., 2004; Egea et al., 2005; Rapoport, 2007; Dudek et al., 2015; Voorhees and Hegde, 2016). Different signal sequences do not have sequence homology, but possess similar structural features (von Heijne, 1985, 1990).

In bacteria, sorting events are determined by a balance of interactions of a newly synthesized nascent chain with the Ffh/4.5S RNA complex, the SecA protein, the chaperone trigger factor, and other proteins (Karamyshev and Johnson, 2005; Eisner et al., 2006). Overproduction of secretory proteins leads to an imbalance of targeting and folding and to accumulation of their precursors in insoluble form in the bacterial cytoplasm (Nesmeyanova et al., 1991, 1997).

In eukaryotic cells, the interactions of nascent chains are more complex and reflect a more complicated process of cotranslational folding and targeting to multiple organelles. Signal sequences are recognized by the signal recognition particle (SRP) (Walter et al., 1981; Krieg et al., 1986; Kurzchalia et al., 1986). These interactions serve as the basis for cotranslational targeting of secretory proteins to the translocon. In the case of membrane proteins, their first transmembrane spans are also recognized by SRP for targeting.

There are other localization signals that direct proteins to mitochondria, the nucleus, and peroxisomes (Emanuelsson and von Heijne, 2001). These signals are important for proper recognition by specialized targeting factors. Some of these signals are localized at the N-terminus and thus are probably recognized cotranslationally, and some are at the C-terminus of the protein, suggesting posttranslational targeting. Examples of N-terminal signals include specialized mitochondrial presequences that are enriched in positively charged residues and have the ability to form amphiphilic α-helices (von Heijne et al., 1989; Emanuelsson and von Heijne, 2001), peroxisomal targeting signals type 2 (PTS2) for some peroxisomal proteins (Williams, 2014), and others. Tail-anchored (TA) proteins as well as PTS1 peroxisomal proteins contain C-terminal signals and most likely are targeted by posttranslational mechanisms (Stefanovic and Hegde, 2007; Williams, 2014).

There are many specialized proteins that interact with nascent chains during their synthesis. They include targeting factors, chaperones assisting protein folding, and modification factors. These proteins are grouped under the general name ribosome-associated protein biogenesis factors (RPBs) (Raue et al., 2007). Nascent chains of cytosolic and secretory proteins interact with different RPB partners to achieve proper folding and correct targeting (**Figure 2**). RPBs act during translation when a short nascent chain emerges from the ribosomal polypeptide tunnel. In yeast, RPBs consist of the targeting factor SRP, the nascent polypeptide-associated complex (NAC), the chaperones Ssb1 and Ssb2 (Hsp70 homologs), the ribosome-associated complex (RAC), N-terminal acetyltransferase (NatA), and the Map1 and Map2 proteins (Raue et al., 2007).

Yeast RAC consists of two proteins, zuotin (or Zuo1, DnaJ homolog, Hsp40 family) and Ssz1p (DnaK homolog, Hsp70 family) (Gautschi et al., 2001; Zhang et al., 2017). Mammalian RAC includes the chaperones MPP11 and HSP70L1 (Otto et al., 2005). RAC binds ribosomes near the polypeptide tunnel exit (Peisker et al., 2008). NAC consists of two subunits, α and β, both of which are localized in close proximity to a nascent chain, as demonstrated by crosslinking (Wiedmann et al., 1994; Wang et al., 1995). It binds short nascent chains when they are just exposed from the polypeptide tunnel. In the case of secretory proteins, NAC binds the nascent chain only when the signal sequence is not completely exposed from the ribosome (**Figure 2B**). Binding of NAC is important for SRP specificity and translocation fidelity (Wiedmann et al., 1994). Under normal conditions, NAC binds ribosomes to promote protein folding; under stress, however, it moves to protein aggregates and functions as a protein chaperone (Kirstein-Miles et al., 2013). The chaperone Ssb binds a wide variety of substrates: cytosolic, ER, nuclear, and mitochondrial nascent polypeptides (Doring et al., 2017). Its binding accelerates translation. Ssb (Ssb1 and Ssb2), RAC, and NAC have a dual function in folding of new proteins and regulation of ribosome production (Koplin et al., 2010).

It was also found that there are two major chaperone groups or networks with discrete functions in the cell: one is for de novo folding (named CLIPS, for chaperones linked to protein synthesis) and the other (HSPs, heat shock proteins) is for refolding proteins to rescue them under stress (Albanese et al., 2006). Thus, translation-associated chaperones are organized in the CLIPS network (Albanese et al., 2010; Pechmann et al., 2013). While secretory/membrane proteins need SRP during the first step of protein synthesis, cytosolic proteins require the ribosome-bound chaperones Ssb (HSP70 family) in yeast (Willmund et al., 2013), HSP70L1 and MPP11 in mammals (Otto et al., 2005), the chaperonin TRiC (McCallum et al., 2000; Etchells et al., 2005), and other factors (Hartl et al., 2011; Pechmann et al., 2013). Another chaperonin, prefoldin, also binds nascent chains and is involved in folding of actin and tubulin (Hartl and Hayer-Hartl, 2002). It is not completely understood how the specificity of chaperones/chaperonins for nascent chains is controlled. In addition, a large group of proteins involved in quality control and ubiquitination of aberrant nascent chains is also found bound to translating ribosomes (Comyn et al., 2014).

Thus, the ribosome itself serves not only as the protein synthesis machinery but also plays a key role in coordinating protein targeting, folding, and quality control. Studying the normal interactions of nascent polypeptides during translation, and how they change when the mRNA and protein quality control machineries are engaged, is important for understanding the molecular foundation of protein biogenesis and homeostasis, as well as the molecular basis of human diseases associated with dysregulation of these processes.

#### RIBOSOME-ASSOCIATED mRNA QUALITY CONTROL PATHWAYS

mRNA turnover is one of the major mechanisms for controlling gene expression and maintaining a high level of fidelity for cell function and viability. Cells use multiple mRNA degradation pathways to eliminate non-functional transcripts. mRNA decay is a highly orchestrated process controlled by a distinct set of genes. mRNA surveillance starts in the nucleus. Defective mRNAs can be detected and targeted for degradation at different stages of their production and maturation, including transcription, capping, splicing, and polyadenylation. The exosome is the major machinery that degrades faulty mRNAs in the nucleus. mRNAs that pass quality control in the nucleus are then exported to the cytoplasm as messenger ribonucleoproteins (mRNPs), where they can be engaged in translation. In the cytoplasm, mRNAs are subjected to additional, cotranslational surveillance. Several major mRNA degradation pathways (NMD, NGD, and NSD) operate to identify faulty mRNAs and protect the cell from translation of aberrant mRNAs and potentially toxic proteins (**Figure 3**).

## NONSENSE-MEDIATED DECAY

Nonsense-mediated decay is an mRNA surveillance pathway that recognizes mRNAs with PTCs and targets them for rapid degradation to reduce translation of truncated proteins with dominant-negative or deleterious gain-of-function activities (Welch and Jacobson, 1999; Popp and Maquat, 2013) (**Figure 3A**). This pathway exists in all eukaryotes examined so far (Culbertson, 1999). NMD has not been found in bacteria, where the presence of PTCs in genes leads to termination or reinitiation of translation (Karamyshev et al., 2004).

The exon-exon junction complex (EJC) is a complex of proteins assembled on the pre-mRNA during splicing (Gehring et al., 2009). After mRNA export, the EJC is removed from the mRNA during the pioneer round of translation and replaced with proteins that promote translation. However, if a premature termination codon is present on the mRNA ≥ 50–55 nucleotides upstream of the EJC, NMD is activated, most likely because the terminating ribosome (at the PTC) is not able to remove the EJC and proceed with normal translation (Popp and Maquat, 2014).
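
The positional rule of the EJC model can be illustrated with a short Python sketch. This is a toy illustration rather than an NMD predictor: the function name `ptc_triggers_nmd`, the coordinate convention (nucleotide positions along the mature mRNA), and the default threshold of 50 nt (the lower bound of the 50–55 nt range quoted above) are assumptions made for the example.

```python
from typing import Iterable


def ptc_triggers_nmd(ptc_pos: int,
                     ejc_positions: Iterable[int],
                     min_distance: int = 50) -> bool:
    """Return True if at least one EJC lies >= min_distance nt downstream of the PTC.

    Positions are nucleotide coordinates on the mature mRNA (hypothetical values).
    """
    return any(ejc - ptc_pos >= min_distance for ejc in ejc_positions)


# Hypothetical transcript carrying EJCs at nucleotides 300 and 900.
print(ptc_triggers_nmd(200, [300, 900]))  # True: an EJC sits 100 nt downstream
print(ptc_triggers_nmd(880, [300, 900]))  # False: the only downstream EJC is 20 nt away
print(ptc_triggers_nmd(950, [300, 900]))  # False: no EJC remains downstream
```

The last call corresponds to a stop codon located downstream of every EJC, in which case the rule does not predict NMD. The alternative models discussed below are not captured by this sketch.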

Several proteins are conserved in NMD across species and constitute the core of this pathway: the up-frameshift proteins UPF1, UPF2, and UPF3. UPF1 is the master regulator of NMD. The ATPase activity of UPF1 is required for disassembly of mRNPs during NMD (Franks et al., 2010). In mammals, two variants of UPF3 exist: UPF3a and UPF3X (UPF3b) (Serin et al., 2001). In multicellular organisms, additional proteins called suppressors with morphological effects on genitalia (SMG1, SMG5–SMG9) contribute to the regulation of NMD (Yamashita et al., 2001, 2009). NMD takes place in three stages: detection of NMD substrates, tagging, and finally degradation of the PTC-containing transcript. NMD activation begins with detection of the PTC during the pioneer round of translation. After the detection stage, the PTC is tagged by formation of the SURF complex at the terminating ribosome. The SURF complex includes the serine/threonine kinase SMG1, UPF1, and the eukaryotic release factors eRF1-eRF3 (Kashima et al., 2006; Hwang et al., 2010). UPF1-SMG1 then binds to the EJC via interaction with UPF2, which is bound to the EJC through interaction with UPF3 or UPF3X. SMG1 phosphorylates UPF1. Hyperphosphorylated UPF1 induces translational repression and recruits the SMG6 protein (Isken et al., 2008). SMG6 performs endonucleolytic cleavage of the mRNA. This cleavage occurs between the PTC and EJC sites of the defective mRNA during the last stage of NMD (Huntzinger et al., 2008; Eberle et al., 2009). Activated UPF1 then recruits SMG5-SMG7 or SMG5-PNRC2 (Kervestin and Jacobson, 2012). These proteins further recruit the decapping and/or deadenylation machinery to facilitate exonucleolytic degradation of the unprotected 5′ and 3′ mRNA fragments that result from endonucleolytic cleavage of the PTC-containing mRNA (Lejeune et al., 2003; Loh et al., 2013). The 5′-to-3′ exonuclease XRN1 is responsible for degradation of the 3′-cleavage product (Lejeune et al., 2003; Unterholzner and Izaurralde, 2004; Eberle et al., 2009). The 5′-cleavage product is most likely degraded by the exosome (Schmid and Jensen, 2008). NMD proteins can be co-purified with components of the mRNA degradation machinery (DCP2, XRN1, XRN2/RAT1, and several exosome subunits) (Lejeune et al., 2003; Muhlemann and Lykke-Andersen, 2010). Decapping and deadenylation enzymes may contribute to faster degradation of mRNA fragments in mammalian cells (Lejeune et al., 2003). However, more research is needed to understand the role of decapping and deadenylation in NMD.

While the mechanism explained above (the exon junction complex model) is appealing, it cannot explain all the details of NMD, and alternative models, including the Upf1 3′-UTR sensing/potentiation model and the faux 3′-UTR model, have been proposed (reviewed in He and Jacobson, 2015). Although these models recognize the importance of the 3′-UTR, they propose different roles for it in NMD target recognition (Amrani et al., 2004; Hogg and Goff, 2010). According to the sensing/potentiation model, Upf1 senses the 3′-UTR and potentiates mRNA decay (Hogg and Goff, 2010). According to the faux 3′-UTR model, efficient termination is inhibited when the distance between the PTC and the poly(A) tail is large (Amrani et al., 2004).

The major function of UPF1, the master regulator of NMD, is to limit translation from aberrant mRNAs. Thus, NMD is a translation-dependent process, and the truncated protein derived from the pioneer round of translation could be toxic and contribute to human pathology. Therefore, degradation of the PTC-containing mRNA should be coupled to degradation of the truncated polypeptide. While limited information is available in this regard, some studies in yeast suggest that Upf1 could have E3 ubiquitin ligase properties that promote degradation of the truncated polypeptide by the proteasome (Takahashi et al., 2008; Kuroha et al., 2009). However, the fate of truncated proteins produced during NMD in mammalian cells remains an open question.

## NO-GO DECAY

No-go decay degrades mRNAs stalled in translation elongation complexes (**Figure 3B**). Translational arrest can be caused by specific features of nascent peptides, by strong secondary structures in the mRNA that physically block the translation machinery, or by a rare codon repeat that leaves the A site unoccupied for a long time (Kuroha et al., 2010; Tsuboi et al., 2012). Insertion of a stable stem-loop RNA structure into PGK1 mRNA led to translational arrest and endonucleolytic cleavage of the stalled mRNA, with subsequent rapid mRNA degradation. While the NGD pathway was initially discovered in yeast, it was also identified in fruit flies and mammals (Doma and Parker, 2006; Passos et al., 2009; Pisareva et al., 2011).

The proteins Pelota (in mammals; Dom34 in yeast) and HBS1 are involved in regulation of the NGD pathway (Doma and Parker, 2006; Pisareva et al., 2011) and are structurally related to the termination factors eRF1 and eRF3, respectively (Atkinson et al., 2008). They also mimic the complex of an elongation factor and tRNA, suggesting that they bind the A site of the ribosome (van den Elzen et al., 2010). Indeed, Dom34 and Hbs1 interact directly with the A site of the ribosome, but instead of termination they promote dissociation of the aberrant translation elongation complex and ribosome recycling (Shoemaker et al., 2010; Becker et al., 2011). Dom34/Hbs1 can also stimulate the endonucleolytic cleavage event in an NGD substrate and promote subsequent mRNA degradation; however, these factors are not essential, since cleavage of NGD mRNA can take place even in their absence (Passos et al., 2009; Tsuboi et al., 2012). The data suggest that endonucleolytic cleavage occurs upstream of the ribosome stalling site (Tsuboi et al., 2012). It was shown recently that NGD is triggered by ribosome collisions, resulting in multiple endonucleolytic cleavages (Simms et al., 2017). The efficiency of NGD depends on ribosome density on the substrate mRNA, suggesting that ribosome collision transmits a signal to activate the endonuclease. As in the NMD pathway, the generated fragments are rapidly degraded by the exosome and XRN1 during NGD. It remains unknown which endonuclease is responsible for the cleavage of NGD substrates.

## NON-STOP DECAY

Non-stop decay degrades mRNAs that lack stop codons (**Figure 3C**). NSD was first discovered in yeast (van Hoof et al., 2002) and mammals (Frischmeyer et al., 2002). Non-stop mRNAs can arise by different mechanisms. These aberrant mRNAs may be produced by erroneous polyadenylation within the ORF, resulting in non-stop mRNAs with poly(A), or by endonucleolytic cleavage within the ORF, generating non-stop mRNAs lacking poly(A) (Ozsolak et al., 2010; Graille and Seraphin, 2012). Translation of poly(A) leads to formation of a poly-lysine chain at the C-terminus of the synthesized polypeptide. This positively charged amino acid chain causes stalling of the polypeptide in the ribosome tunnel, most likely due to interaction with negatively charged ribosomal RNA (Dimitrova et al., 2009). In the case of truncated non-stop mRNAs lacking poly(A), ribosomes stall at the very 3′-end of the mRNA. In both cases, translational stalling triggers rapid degradation of non-stop mRNAs by the translation-dependent NSD pathway. Translational repression is a prerequisite for mRNA degradation during NSD (Inada and Aiba, 2005; Akimitsu et al., 2007), as in NGD and NMD. It was shown that Ski7, a protein structurally related to Hbs1 and eRF3, is able to bind the stalled ribosome and recruit the exosome to the transcript during NSD in yeast (van Hoof et al., 2002). However, Ski7 is not present in higher eukaryotes. Organisms lacking Ski7 rely on the Hbs1 and Dom34 proteins, which function in both NSD and NGD (Tsuboi et al., 2012; Saito et al., 2013), suggesting a substantial overlap in the function of these pathways. A study from Inada's group has shown that the Dom34:Hbs1 complex has a crucial role in dissociating ribosomes and stimulating mRNA degradation in both the NSD and NGD pathways (Tsuboi et al., 2012). Endonucleolytic cleavage is the first step of mRNA degradation in NSD. It has been found that stalled ribosomes can induce multiple endonucleolytic cleavage events on non-stop mRNA covered by the individual ribosomes (Tsuboi et al., 2012). However, as with NGD, the identity of the endonuclease implicated in NSD is not yet known.
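
To make the poly-lysine point concrete, the following minimal Python sketch translates a short ORF with and without a stop codon ahead of a poly(A) tail. The sequences and the helper `translate` are hypothetical and chosen only for illustration; the codon table is deliberately restricted to the five codons used here, with assignments taken from the standard genetic code (AAA encodes lysine, K).

```python
# Minimal codon table: only the codons used in this example (standard genetic code).
CODON_TABLE = {"AUG": "M", "GCU": "A", "UUC": "F", "AAA": "K", "UAA": "*"}


def translate(mrna: str) -> str:
    """Translate from the first nucleotide until a stop codon or the 3' end."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon reached: normal termination
            break
        peptide.append(amino_acid)
    return "".join(peptide)


poly_a = "A" * 12
normal_mrna = "AUGGCUUUC" + "UAA" + poly_a   # ORF, stop codon, poly(A) tail
nonstop_mrna = "AUGGCUUUC" + poly_a          # same ORF polyadenylated without a stop codon

print(translate(normal_mrna))    # MAF
print(translate(nonstop_mrna))   # MAFKKKK
```

In the non-stop case the ribosome reads into the tail and appends one lysine per AAA codon, which is the positively charged C-terminal extension described above. The sketch does not model the stalling itself.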

Thus, all of these cotranslational quality control systems share several common features: the aberrant mRNA must be eliminated, the truncated protein products should be degraded, and the stalled ribosomes should recover from stalling and return to translation. NMD was originally discovered as a surveillance pathway whose major function is to reduce errors in gene expression by eliminating PTC-containing mRNAs; however, new roles of the NMD pathway have recently emerged. It has been found that the NMD pathway can also target normal, physiologically functional mRNAs in order to drive a rapid change in gene expression (He and Jacobson, 2015). Ribosome profiling revealed that the NMD pathway regulates expression levels of at least 10% of human transcripts (Celik et al., 2017). NMD contributes to regulation of germ granules and spermatogenesis, and NMD components were found in the chromatoid body (Meikar et al., 2014; Bao et al., 2016; MacDonald and Grozdanov, 2017). It is conceivable that the NSD and NGD pathways, like NMD, are also involved in regulation of gene expression in addition to mRNA quality control. Recent data suggest that the NGD pathway can be used to degrade not only faulty mRNAs but also normal histone mRNAs from stalled degradation complexes as part of cell cycle regulation (Slevin et al., 2014). Chemically damaged mRNAs (oxidized, depurinated, or alkylated) can cause translational stalls and become NGD substrates, which reduces the burden of toxic protein products for the cell (Shan et al., 2007; Wurtmann and Wolin, 2009).

Deficiencies in NMD components such as UPF3B and SMG9 lead to intellectual disability and multiple congenital anomaly syndrome, respectively, due to global transcriptional deregulation (Rebbapragada and Lykke-Andersen, 2009; Shaheen et al., 2016). The NMD pathway has also been found to regulate immune responses. The NMD component UPF1 is involved in antiviral responses and restricts Semliki Forest virus (SFV) and Sindbis virus infections (Balistreri et al., 2014). Somatic mutations in the UPF1 gene are connected to pancreatic adenosquamous carcinoma (Liu et al., 2014). Deregulation of the NMD pathway is associated with several types of cancer and was reviewed in detail in a recent publication (Popp and Maquat, 2018).

### REGULATION OF ABERRANT PROTEIN PRODUCTION

A novel type of ribosome-associated protein quality control, RAPP, was recently discovered (Karamyshev et al., 2014) (**Figures 1**, **4**). The first natural RAPP substrate, granulin with disease-causing signal sequence mutations, was also recently identified, demonstrating that RAPP activation serves as a molecular mechanism for some types of frontotemporal lobar degeneration (Pinarbasi et al., 2018). The RAPP pathway detects aberrant proteins during translation and degrades their mRNA templates to prevent synthesis of potentially hazardous products (preventive quality control). It involves recognition of nascent chains that have lost their normal interactions with targeting factors and directs the mRNA of the aberrant protein for degradation. The original research was conducted using the secretory protein preprolactin with deletions in the signal sequence as a model (Karamyshev et al., 2014). The central event of the RAPP pathway is a balance of interactions at the ribosome during translation. Normally, during translation, secretory proteins are recognized by SRP and targeted to the ER membrane for translocation through a translocon into the ER lumen (**Figure 4A**). When an aberrant signal sequence is not recognized by SRP due to a mutation, or when SRP is absent or defective, the AGO2 protein binds the ribosome-nascent chain complex and triggers specific mRNA degradation (**Figures 4B,C**). Thus, in addition to its role in protein targeting, SRP has a novel function in protecting the mRNAs of secretory proteins from degradation.

Although there are no distinct sequence requirements to trigger mRNA degradation, a mutation should take place in the vicinity of the region responsible for a necessary protein interaction and lead to impairment of this interaction. The role of AGO2 in this process is not yet known. We hypothesize that the positioning of AGO2 close to a mutated nascent chain regulates its ability to direct mRNA for degradation. AGO2 is a protein that is involved in the miRNA and siRNA response and in translational silencing, and it is a major component of the RNA-induced silencing complex (RISC) (Hammond et al., 2001; Martinez et al., 2002). However, our experiments demonstrated that the RAPP process does not involve miRNAs or the Drosha and Dicer proteins, suggesting a novel AGO2 function in the absence of RISC formation (Karamyshev et al., 2014). AGO2 possesses slicer or ribonuclease H activity (Liu et al., 2004; Song et al., 2004; Rivas et al., 2005). However, experiments involving enzymatically inactive AGO2 indicate that AGO2 slicer activity is not required for mRNA degradation during RAPP (Karamyshev et al., 2014). We have found that mRNA degradation of the model RAPP substrates was suppressed by AGO2 depletion and accelerated by AGO2 overexpression. However, granulins with disease-causing mutations were not affected (Pinarbasi et al., 2018). These observations suggest that AGO2 functions as a sensor for some substrates during the RAPP response, and an unidentified protein may serve that function for other substrates. Another explanation is that the major sensor of the pathway has not yet been determined and AGO2 performs a helper or enhancer function for some substrates. Our data suggest that the mRNA cleavage is conducted by an endonuclease other than AGO2. However, the nature of the endonuclease remains to be determined. Thus, the mechanism of the RAPP pathway is still far from understood.

It was found earlier that stress conditions that lead to accumulation of unfolded proteins in the ER trigger a process known as regulated Ire1-dependent decay (RIDD) (Hollien and Weissman, 2006; Hollien et al., 2009). It reduces the quantity of secretory protein mRNAs to decrease accumulation of secretory proteins in the ER during the unfolded protein response (UPR). RIDD is an important general stress response mechanism that senses unfolded secretory proteins that have been successfully transported into the ER and prevents their further synthesis, and therefore their transport into the ER and accumulation. By contrast, RAPP senses mutated nascent polypeptide chains that are unable to interact with SRP and are therefore neither targeted to nor translocated into the ER, thereby reducing accumulation of these potentially hazardous proteins in the cytosol.

The current RAPP model is based on selection of mRNA for degradation by a loss of cotranslational interaction between the nascent chain and a targeting factor at the ribosome. If interaction with SRP is reduced due to a mutation in the signal sequence, AGO2 interacts with the nascent chain and directs its mRNA for degradation. If the SRP interaction is intact, AGO2 cannot interact with the nascent chain. It is possible that this mechanism is general and involved in quality control of other types of proteins that have lost their normal interactions. These could be aberrant cytosolic proteins that have lost natural interactions with chaperones (for instance, ribosome-associated chaperones and components of RAC, MPP11, and HSP70L1), or peroxisomal and mitochondrial proteins that have lost interactions with their targeting factors. However, understanding these processes requires future studies.

### RIBOSOME-ASSOCIATED QUALITY CONTROL AT A NASCENT CHAIN LEVEL

What happens to partially synthesized nascent chains after degradation of the faulty mRNAs is activated? Recent studies on cotranslational quality control systems induced by translational stalls have revealed that not only faulty mRNAs but also truncated proteins are rapidly degraded. In yeast, Ltn1, a ribosome-associated E3 ubiquitin ligase (Bengtson and Joazeiro, 2010), and a component of the Ccr4-Not complex, Not4p (which may act as an E3 ubiquitin-protein ligase) (Panasenko et al., 2006; Dimitrova et al., 2009; Halter et al., 2014), play an important role in targeting truncated protein products for degradation by the proteasome. It was demonstrated that Ltn1, Tae2 (also known as Rqc2), Rqc1, and the AAA-ATPase Cdc48 (also known as VCP, valosin-containing protein, and p97) are involved in removal of aberrant translational products in yeast and form a complex on the 60S ribosomal subunit (Brandman et al., 2012; Defenouillere et al., 2013). This complex was named the ribosome quality control complex (RQC) (Brandman et al., 2012). Listerin, the functional mammalian homolog of Ltn1, is involved in ubiquitination of aberrant nascent chains produced by stalled ribosomes (Shao et al., 2013) (**Figure 5**). Notably, the ubiquitinated nascent chains were found still attached to tRNAs; however, the process required dissociation of the ribosomal subunits. Pelota, HBS1, and ABCE1 are involved in ribosomal subunit dissociation in mammals, while Dom34, Hbs1, and Rli1 perform this role in yeast (Shoemaker et al., 2010; Pisareva et al., 2011; Shoemaker and Green, 2011) (**Figure 5**). Ribosomal subunit dissociation leads to assembly of the RQC on the 60S subunit (Shao et al., 2015). Binding of nuclear export mediator factor (NEMF) in mammals (Rqc2 or Tae2 in yeast) prevents subunit reassociation and leads to recruitment of Listerin and its positioning near the polypeptide exit site on the 60S subunit (Lyumkis et al., 2014; Shao et al., 2015). The results of several studies suggested that Cdc48, Npl4, Ufd1, and Rqc1 are involved in extraction of the ubiquitinated nascent chains from the 60S subunit (Brandman et al., 2012; Defenouillere et al., 2013; Verma et al., 2013). The mammalian orthologs are VCP (p97), UFD1, NPLOC4, and TCF25 (Verma et al., 2018). However, the detailed roles of the distinct components are not well understood. It was discovered recently that yeast Vms1 (ANKZF1 in mammals) releases ubiquitinated nascent chains from stalled ribosomes by peptidyl-tRNA hydrolysis for further degradation of the polypeptides by the proteasome (Verma et al., 2018). Very little is known about truncated nascent chain degradation during the NMD pathway. It was shown that UPF1 promotes degradation of truncated peptides generated in the NMD pathway and can potentially serve as an E3 ubiquitin ligase (Takahashi et al., 2008; Kuroha et al., 2009). However, more research is needed to identify all key players in the regulation of cotranslational protein degradation and the details of the mechanism during NMD.

Several stress-induced cotranslational protein quality control mechanisms were recently discovered in mammals. One of them, pre-emptive quality control (pQC), cotranslationally reroutes membrane and secretory proteins to the cytoplasm for degradation under acute ER stress (Kang et al., 2006; Kadowaki et al., 2015). Derlins (degradation in ER proteins) redirect them from the translocon to the proteasome with the involvement of the chaperone Bag6 (BCL2-associated athanogene 6) and p97 (alias Cdc48 or VCP) (Kadowaki et al., 2015). pQC reduces the burden of misfolded proteins in the ER during stress. The Bag6 complex is also involved in the mislocalized protein degradation pathway (Hessa et al., 2011). This pathway senses the presence of unprocessed or non-inserted hydrophobic domains released into the cytosol and directs these proteins for degradation. Another stress-induced quality control mechanism involves recruitment of c-Jun N-terminal kinase (JNK) to ribosomes by the receptor for activated protein C kinase 1 (RACK1), phosphorylation of elongation factor eEF1A2, and promotion of degradation by the proteasome (Gandin et al., 2013). This implicates the JNK/RACK1/eEF1A2 complex in protein quality control at the ribosome in response to stress.

### CONCLUSION

Thus, nascent chains interact with a number of different factors at the ribosome during translation. These interactions are required for normal folding, transport, and formation of active proteins. Alterations of these important interactions because of mutations or defective factors trigger protective mechanisms that prevent accumulation of potentially toxic products in the cell. In addition, different aberrations in mRNAs may lead to translational stalling, which prevents new rounds of translation and may potentially be fatal. Cells have developed protective mechanisms to recycle stalled ribosomes and remove aberrant proteins and mRNAs. Therefore, a network of ribosome-associated proteins, endo- and exonucleases, chaperones, ubiquitin ligases, the proteasome, and other proteins, working in concert, maintains protein homeostasis in the cell. Multiple mechanisms are engaged at different stages of protein biogenesis to eliminate aberrant mRNA templates, mutated or incomplete nascent chains, and misfolded or mislocalized proteins. However, many details of these mechanisms are still not completely understood, and additional studies are needed to fill these gaps.

### AUTHOR CONTRIBUTIONS

AK and ZK wrote, discussed, and edited the manuscript.

### FUNDING

This work was supported by start-up funds from Texas Tech University Health Sciences Center and by the National Institute of Neurological Disorders and Stroke of the National Institutes of Health under award number R03NS102645 to AK. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

### ACKNOWLEDGMENTS

The authors thank Elena B. Tikhonova, Clinton C. MacDonald, and Petar N. Grozdanov for critical reading of the manuscript.

#### REFERENCES

the clearance of aberrant translation products. Proc. Natl. Acad. Sci. U.S.A. 110, 5046–5051. doi: 10.1073/pnas.1221724110



large ribosomal subunit-associated protein quality control complex. Proc. Natl. Acad. Sci. U.S.A. 111, 15981–15986. doi: 10.1073/pnas.1413882111


a stalled ribosome at the 3′ end of aberrant mRNA. Mol. Cell. 46, 518–529. doi: 10.1016/j.molcel.2012.03.013


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Karamyshev and Karamysheva. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Sequence Determinants for Nuclear Retention and Cytoplasmic Export of mRNAs and lncRNAs

Alexander F. Palazzo\* and Eliza S. Lee

Department of Biochemistry, University of Toronto, Toronto, ON, Canada

Eukaryotes are divided into two major compartments: the nucleus where RNA is synthesized and processed, and the cytoplasm, where mRNA is translated into proteins. Although many different RNAs are made, only a subset is allowed access to the cytoplasm, primarily RNAs involved in protein synthesis (mRNA, tRNA, and rRNA). In contrast, nuclear retained transcripts are mostly long non-coding RNAs (lncRNAs) whose role in cell physiology has been a source of much investigation in the past few years. In addition, it is likely that many non-functional RNAs, which arise by spurious transcription and misprocessing of functional RNAs, are also retained in the nucleus and degraded. In this review, the main sequence features that dictate whether any particular mRNA or lncRNA is a substrate for retention in the nucleus, or export to the cytoplasm, are discussed. Although nuclear export is promoted by RNA-splicing due to the fact that the spliceosome can help recruit export factors to the mature RNA, nuclear export does not require splicing. Indeed, most stable unspliced transcripts are well exported and associate with these same export factors in a splicing-independent manner. In contrast, nuclear retention is promoted by specialized cis-elements found in certain RNAs. This new understanding of the determinants of nuclear retention and cytoplasmic export provides a deeper understanding of how information flow is regulated in eukaryotic cells. Ultimately these processes promote the evolution of complexity in eukaryotes by shaping the genomic content through constructive neutral evolution.

Keywords: TREX, lncRNAs, transposable elements, RNA modification, splicing, polyadenylation, constructive neutral evolution

Edited by: Chiara Gamberi, Concordia University, Canada

#### Reviewed by:

David Michael Shechner, University of Washington, United States; Claudia Ghigna, Istituto di Genetica Molecolare (IGM), Italy

\*Correspondence: Alexander F. Palazzo, alex.palazzo@utoronto.ca

Specialty section: This article was submitted to RNA, a section of the journal Frontiers in Genetics

Received: 03 August 2018; Accepted: 14 September 2018; Published: 17 October 2018

#### Citation:

Palazzo AF and Lee ES (2018) Sequence Determinants for Nuclear Retention and Cytoplasmic Export of mRNAs and lncRNAs. Front. Genet. 9:440. doi: 10.3389/fgene.2018.00440

## INTRODUCTION

The distinguishing feature of eukaryotic cells is that they are divided into two compartments: the nucleus where pre-messenger RNAs (mRNAs) are made and processed, and the cytoplasm where mature mRNAs are translated into proteins (Martin and Koonin, 2006; Palazzo and Gregory, 2014). This is in contrast to prokaryotes, where mRNAs are made and translated at the same time in the same compartment. In eukaryotes, the temporal and spatial separation of mRNA synthesis from translation allows each newly made RNA to be subjected to extensive quality control before it ever encounters a ribosome (Palazzo and Akef, 2012). This quality control involves the nuclear retention and/or degradation of spurious transcripts, which are synthesized from intergenic DNA regions, and misprocessed RNAs, which result from errors in splicing or 3′ cleavage. In the absence of this quality control, spurious transcripts and misprocessed mRNAs would be exported to the cytoplasm and then translated into toxic proteins. Thus, the separation of RNA synthesis in the nucleus and translation in the cytoplasm, and the associated quality control mechanisms that go along with this separation, reduces some of the harmful side-effects of non-functional RNAs that are transcribed from non-functional DNA. This is why both junk DNA and a low level of spurious transcription are tolerated in most eukaryotes (Palazzo and Gregory, 2014; Palazzo and Lee, 2015).

Importantly, non-functional RNAs, whose harmful effects are reduced by eukaryotic quality control systems, are not effectively eliminated by natural selection, and some of these can eventually evolve into functional long non-coding RNAs (lncRNAs). These add to the repertoire of bio-active polymers that organisms can use to regulate growth, homeostasis and development. Although some lncRNAs function in the cytoplasm, most operate in the nucleus (Djebali et al., 2012; Derrien et al., 2012; Tilgner et al., 2012; Palazzo and Gregory, 2014; Kaewsapsak et al., 2017). As a result, lncRNAs must be appropriately sorted to allow for their proper retention in the nucleus or export to the cytoplasm. This sorting is critical, as alterations in mRNA nuclear retention and cytoplasmic export have been associated with various diseases (Borden and Culkovic-Kraljacic, 2018; Bovaird et al., 2018). Furthermore, many neuropathological states are associated with the formation of RNA-protein liquid–liquid phase separated structures that can disrupt proper nuclear-cytoplasmic trafficking by soaking up nuclear transport factors and components of the nuclear pore complex (Mahboubi et al., 2013; Zhang et al., 2018).

So how does this all work? First, long and short RNAs are generally treated very differently. In mammals, RNA length appears to be evaluated by hnRNP C (McCloskey et al., 2012), with transcripts that are shorter than 200 nucleotides (e.g., snRNAs, tRNAs, and miRNAs) being directed toward specialized export pathways (Masuyama et al., 2004; Fuke and Ohno, 2008; McCloskey et al., 2012), while longer RNAs (mRNAs and lncRNAs) are shunted to a more generalized pathway that requires the major export complex, TREX, and its heterodimeric nuclear transport receptor composed of Nxf1/TAP and Nxt1/p15 (**Figure 1**) (Katahira et al., 1999; Strässer et al., 2002). In addition to these major export factors, other export-promoting complexes exist. SR proteins, which promote splicing, also help to recruit Nxf1/TAP to these RNAs (Huang et al., 2003; Müller-McNicoll et al., 2016). TREX2, which is thought to localize to the nucleoplasmic side of the nuclear pore, also plays a major role in promoting export (Wickramasinghe et al., 2010, 2014; Umlauf et al., 2013; Zhang et al., 2014b). Dbp5, Rae1/Gle1, and Gle2, which associate with the cytoplasmic face of the nuclear pore, may be involved in recycling nuclear export factors back into the nucleoplasm (Blevins et al., 2003; Lund and Guthrie, 2005; Alcázar-Román et al., 2006; Weirich et al., 2006), although this is not well understood. For reviews on these factors and complexes, see (Katahira, 2012; Palazzo and Akef, 2012; Heath et al., 2016; Borden and Culkovic-Kraljacic, 2018). The mechanism that dictates nuclear retention is less well understood. Some of the factors involved are described in later sections of this review.

So how are exported RNAs, which mostly code for protein, differentiated from nuclear retained RNAs, which are typically non-coding? Ultimately these two types of RNAs must differ in one or more ways. This can include cis-elements (i.e., particular RNA motifs) or general features such as splicing, polyadenylation, and RNA modifications. These differences will dictate what proteins are loaded onto the RNA, resulting in the formation of a ribonucleoprotein (RNP) complex that determines the ultimate fate of the transcript.

In this review we shall cover what is known about the RNA features that impact the nuclear retention and cytoplasmic export of mRNAs and lncRNAs. However, before we start, there are a few points to keep in mind. First, although we speak of a given RNA species as being retained in the nucleus or exported to the cytoplasm, few RNAs are completely nuclear or cytoplasmic at steady state. Instead, each RNA species exists at some point along a spectrum between these two extremes. Second, the ultimate steady state distribution of an RNA is dictated not only by the rate of RNA export, but also by the rates of RNA synthesis and of RNA decay in both the nucleus and the cytoplasm. Few studies have taken all these various factors into account with some exceptions. Having said that, it is clear that nuclear retention and cytoplasmic export play critical roles in dictating the ultimate distribution of any RNA species. Third, although it is generally true that most mRNAs are well exported, many are not (Djebali et al., 2012; Bahar Halpern et al., 2015; Bouvrette et al., 2018). Likewise, although there is a general consensus that many lncRNAs are nuclear, it is also clear that several are cytoplasmic, with some studies suggesting that the number of cytoplasmic lncRNAs may be higher than previously thought (Wilk et al., 2016; Bouvrette et al., 2018).
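The second point above can be made concrete with a toy kinetic model. The sketch below is not taken from any of the cited studies; it assumes a simple two-compartment scheme with first-order rates, and the function and parameter names (`k_syn`, `k_exp`, `k_deg_nuc`, `k_deg_cyt`) are hypothetical labels chosen for illustration.

```python
def steady_state(k_syn, k_exp, k_deg_nuc, k_deg_cyt):
    """Steady-state RNA levels in an assumed two-compartment, first-order model.

    dN/dt = k_syn - (k_exp + k_deg_nuc) * N     (nuclear pool)
    dC/dt = k_exp * N - k_deg_cyt * C           (cytoplasmic pool)
    """
    nuclear = k_syn / (k_exp + k_deg_nuc)
    cytoplasmic = k_exp * nuclear / k_deg_cyt
    return nuclear, cytoplasmic


def nuclear_fraction(k_syn, k_exp, k_deg_nuc, k_deg_cyt):
    """Fraction of the total steady-state RNA that resides in the nucleus."""
    nuclear, cytoplasmic = steady_state(k_syn, k_exp, k_deg_nuc, k_deg_cyt)
    return nuclear / (nuclear + cytoplasmic)


# Two hypothetical RNAs with the SAME export rate but different cytoplasmic stability.
stable = nuclear_fraction(k_syn=1.0, k_exp=0.1, k_deg_nuc=0.0, k_deg_cyt=0.1)
unstable = nuclear_fraction(k_syn=1.0, k_exp=0.1, k_deg_nuc=0.0, k_deg_cyt=0.9)
print(f"stable in cytoplasm:   {stable:.2f} nuclear")
print(f"unstable in cytoplasm: {unstable:.2f} nuclear")
```

In this simplified model an RNA can appear "nuclear retained" purely because it is short-lived in the cytoplasm, even though its export rate is unchanged. The model also shows that synthesis and nuclear decay set the absolute amount of RNA in each compartment; real transcripts need not follow first-order kinetics, so this should be read as an illustration of the argument rather than a quantitative description.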

### WHAT IS THE DEFAULT PATHWAY?

Before addressing the question of what sequence determinants impact nuclear export, it becomes necessary to determine whether an RNA which lacks any distinguishing feature is a substrate for nuclear export. In other words, what is the default pathway – nuclear retention or cytoplasmic export? Three pieces of evidence point to the fact that long RNAs do not require any specialized cis-element for them to be exported from the nuclei of mammalian tissue culture cells.

#### Reporter mRNAs

Whether the default pathway for any given long RNA was nuclear retention or cytoplasmic export was up for debate for a number of years, due largely to differences between the nuclear/cytoplasmic distribution of mRNAs derived from different reporter genes (Luo and Reed, 1999; Lu and Cullen, 2003; Masuyama et al., 2004; Nott et al., 2004; Palazzo et al., 2007; Valencia et al., 2008; Lei et al., 2011, 2013; Takemura et al., 2011; McCloskey et al., 2012). For example, it had been observed that certain reporter mRNAs transcribed from cDNAs were not exported, suggesting that in the absence of splicing, mRNAs are nuclear retained (Valencia et al., 2008). This confusion was largely due to the fact that it was unclear whether any particular reporter is truly devoid of cis-elements or other distinguishing features that may promote or inhibit mRNA nuclear export. More recently, we have demonstrated that two widely used reporters, a mini gene derived from the Drosophila fushi tarazu gene (ftz), and the β-globin mRNA, each have nuclear retention elements (Akef et al., 2015; Lee et al., 2015). Importantly, when the newly identified nuclear retention elements were removed, RNAs generated from these reporters were well exported despite the fact that they are not spliced.

**Figure 1** (caption, continued): … repeats of Nups to ferry its cargo across the nuclear pore. GANP, which forms part of the TREX2 complex, is also required for export, although its exact role is not understood. After passing through the nuclear pore complex, the mRNP is further remodeled by cytoplasmic pore-associated proteins such as Gle1, Dbp5 and Rae1/Gle2. It is thought that these remodeling events remove certain nuclear-associated export factors, which are then recycled back into the nuclear pore. In some cases these mRNP remodeling events render the mRNP more 'translationally' competent (Palazzo and Truong, 2016). Factors that are essential for mRNA export are depicted in red, other export factors are depicted in yellow.

#### mRNAs With Random Sequences

In other experiments it was found that RNAs generated from artificial genes, purported to have "random" sequences, were not exported but were instead rapidly degraded (Dias et al., 2010). One potential problem with completely random sequences is that they contain elevated numbers of CG dinucleotides, which are depleted in vertebrate genomes (Karlin and Mrázek, 1997). In DNA, CG dinucleotides are often methylated, and when these 5-methylcytosines undergo spontaneous deamination they are converted to thymine, causing CG dinucleotides to be mutated away in vertebrates (Lindahl, 1993). In contrast, unmethylated cytosines deaminate to uracils, which are efficiently removed by uracil-DNA glycosylase and restored to cytosines. Recently it was found that RNAs with significant numbers of CG dinucleotides are substrates for decay, which would effectively prevent their accumulation in the cytoplasm (Takata et al., 2017). This process likely evolved to protect cells against viral infection. Interestingly, the proteins involved in this decay, ZAP/ZC3HAV1 and TRIM25, are primarily cytoplasmic and are known to be involved in viral RNA degradation.
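Whether a given reporter carries an unusually high CG dinucleotide load can be checked with the commonly used observed/expected CpG ratio. The sketch below is an illustration only and is not the analysis used in the cited studies; the function name is hypothetical, and the ratio is computed as (number of CG) × (length) / (number of C × number of G).

```python
def cpg_observed_expected(seq):
    """Observed/expected CG dinucleotide ratio for a DNA or RNA sequence.

    Expected CG count assumes C and G occur independently at their
    observed frequencies: expected = (count_C * count_G) / length.
    Returns 0.0 when the ratio is undefined (no C or no G).
    """
    seq = seq.upper().replace("U", "T")
    length = len(seq)
    count_c = seq.count("C")
    count_g = seq.count("G")
    if length < 2 or count_c == 0 or count_g == 0:
        return 0.0
    # "CG" cannot overlap with itself, so str.count finds every occurrence.
    count_cg = seq.count("CG")
    return count_cg * length / (count_c * count_g)


# Toy sequences with identical base composition but different CG content.
print(cpg_observed_expected("CCGG"))    # one CG, as expected by chance
print(cpg_observed_expected("CAGCAG"))  # no CG at all
```

A ratio near 1 means CG dinucleotides occur about as often as base composition alone predicts, which is what a truly random sequence would tend to show; vertebrate-derived sequences typically score well below 1.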

In the study by Dias et al. (2010), the RNA reporters with "random" sequence were in fact generated from the reverse complement sequences of intronless genes from the human genome (IFNA1, IFNB1 and HSPB3). As expected, the three constructs have relatively low CG-content (as is true for almost all human-derived DNA); however, all three are predicted to have either 5′ splice site motifs or 3′ splice site motifs [scoring ≥ 0.97 according to NNSPLICE 0.9 (Reese et al., 1997)]. These motifs are known to inhibit nuclear export if they are not used for splicing (see The 5′ Splice Site Motif – Other Intron-Associated Motifs). It is also possible that these transcripts were spliced and that the researchers were detecting the distribution of lariat introns in their experiments. Additionally, it is conceivable that these RNAs may have other nuclear retention elements. Again, interpreting experiments with "random" RNAs is difficult, as unidentified cis-elements may drastically alter the behaviors of these transcripts.
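The point that a reverse-complemented gene is not motif-free can be illustrated with a few lines of code. This is a deliberately crude sketch: it matches only the strict 5′ splice site consensus MAG|GTRAGT (M = A or C, R = A or G) with a regular expression, whereas the analysis described above used the NNSPLICE neural-network predictor. The function names and the toy sequence are hypothetical.

```python
import re

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

# Strict 5' splice site consensus, MAG|GTRAGT, written as a lookahead so
# that overlapping matches are also reported.
_FIVE_SS = re.compile(r"(?=([AC]AGGT[AG]AGT))")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence."""
    return dna.translate(_COMPLEMENT)[::-1]


def find_5ss_consensus(dna):
    """Return 0-based start positions of strict 5' splice site consensus matches."""
    return [m.start() for m in _FIVE_SS.finditer(dna.upper())]


# A toy sense-strand sequence with no consensus match...
sense = "ACTTACCTG"
print(find_5ss_consensus(sense))
# ...whose reverse complement happens to be a perfect 5' splice site motif.
antisense = reverse_complement(sense)
print(antisense, find_5ss_consensus(antisense))
```

A strict consensus match will miss the many functional splice sites that deviate from the consensus, so a negative result from this scan should not be taken as evidence that a sequence lacks 5′ splice site motifs.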

#### lncRNAs

The last piece of evidence which suggests that nuclear export is the default pathway is that when nuclear localized lncRNAs were analyzed, it was observed that they contained nuclear retention elements (Miyagawa et al., 2012; Zhang et al., 2014a; Lubelsky and Ulitsky, 2018; Shukla et al., 2018). When these nuclear retention elements were removed or mutated, the altered lncRNAs were exported. In one extreme case the intronless MALAT1 lncRNA was expressed as a series of small fragments (each ∼1 kb), with the majority of the resulting RNAs being efficiently exported (Miyagawa et al., 2012). This allowed researchers to identify two regions that retain this lncRNA in the nucleus by targeting it to nuclear speckles. Moreover, fusion of either of these two nuclear retention fragments to reporters promotes their nuclear retention (Lubelsky and Ulitsky, 2018; Shukla et al., 2018). Thus, it is likely that lncRNAs like MALAT1 must be actively retained in the nucleus, and in the absence of these factors, the resulting RNAs are automatically exported to the cytoplasm.

Taking in all of these lines of evidence, it is likely that in the absence of any active cis-element, a stable RNA that is capped and polyadenylated is a substrate for nuclear export.

### THE ROLE OF RNA PROCESSING IN NUCLEAR RETENTION AND mRNA EXPORT

In eukaryotes, most functional RNAs are extensively processed. Although very strong processing signals are found in regions of the genome that are used to produce functional RNA transcripts (be they mRNAs or lncRNAs), weaker processing signals are found throughout the genome. Even comparing mRNAs and lncRNAs, the former are typically more efficiently spliced than the latter (Tilgner et al., 2012; Melé et al., 2017; Mukherjee et al., 2017; Deveson et al., 2018). Thus, robust processing is typically a good indication that the RNA transcript in question is functional and likely encoding a protein (Palazzo and Akef, 2012; Palazzo and Lee, 2015). Moreover, many RNA processing machineries directly interact with, and promote the recruitment of, RNA nuclear export factors. This "coupling" between RNA processing and RNA nuclear export has been extensively documented in other reviews (Maniatis and Reed, 2002; Moore and Proudfoot, 2009; Palazzo and Akef, 2012).

#### Splicing

Splicing involves the removal of introns by the spliceosome, which in turn can deposit factors onto the newly spliced RNA. By comparing the localization of these spliced RNAs to transcripts synthesized from cDNAs (which lack introns), it has been observed that splicing in some scenarios enhances the extent and the rate of nuclear export (Luo and Reed, 1999; Palazzo et al., 2007; Valencia et al., 2008). The spliceosome directly interacts with many key mRNA nuclear export factors, such as the TREX component UAP56 (Fleckner et al., 1997; Strässer and Hurt, 2001). Indeed, splicing is known to help recruit TREX components to RNAs (Masuda et al., 2005; Dufu et al., 2010; Chi et al., 2013). This phenomenon is, however, not universal. Most cDNA-derived RNAs (which lack introns) are well exported (Palazzo and Akef, 2012) and can recruit TREX and Nxf1/TAP (Taniguchi and Ohno, 2008; Hautbergue et al., 2009; Akef et al., 2013, 2015; Lee et al., 2015). Likely, where splicing matters most is in transcripts that happen to have nuclear retention elements. In some cases, splicing can override their activity (Akef et al., 2015), while in other cases it cannot (Lee et al., 2015). The second scenario is probably true for lncRNAs that are efficiently spliced and yet still retained in the nucleus (Hacisuleyman et al., 2014).

#### 5′ Capping

The 5′ RNA cap is an N<sup>7</sup>-methylguanine connected via a 5′ to 5′ triphosphate linkage to the beginning of RNAs which are generated by RNA Polymerase II. This structure recruits the nuclear cap binding complex (CBC), which consists of CBP20 and CBP80. It has been reported that CBP80 can recruit nuclear export factors, such as the TREX component Aly, to the 5′ end of spliced (Cheng et al., 2006) and intronless (Nojima et al., 2007) transcripts. More recently it was shown that a functional paralog of CBP80, NCBP3, also interacts with components of the TREX and exon junction complexes (Gebhardt et al., 2015). Importantly, the co-depletion of CBP80 and NCBP3 inhibits mRNA nuclear export (Gebhardt et al., 2015). As such it is clear that the 5′ cap is a major contributor to the proper export of mRNAs. Whether it is absolutely required remains unclear. The incorporation of non-canonical caps (trimethyl-guanosine [3mGpppG], adenosine [ApppG]) does not block the export of certain microinjected intronless RNAs, but does block the export of intron-containing mRNAs (Palazzo et al., 2007). Since the 5′ cap is also required for splicing (Izaurralde et al., 1994), it is possible that RNAs with cap analogs are inefficiently spliced and are thus actively retained in the nucleus. As detailed below, RNA motifs that are associated with introns are potent nuclear retention signals. Lastly, it has been reported that the export of circular RNAs requires UAP56 (Huang et al., 2018), a core factor of the TREX complex that is required for the export of most mRNAs (Luo et al., 2001; Strässer and Hurt, 2001; Kapadia et al., 2006). This would suggest that TREX-mediated export does not strictly require a 5′ cap to function.

#### 3′ Cleavage and Polyadenylation

The 3′ end of an RNA Polymerase II-generated transcript is recognized and processed by the cleavage and polyadenylation complex (Chan et al., 2011). Members of this complex interact with Aly (Johnson et al., 2009; Shi et al., 2017), the TREX component THOC5 (Katahira et al., 2013; Tran et al., 2014) and Nxf1/TAP (Ruepp et al., 2009). In line with these studies, the RNAs produced from reporter genes with defective 3′ cleavage signals are restricted to the nucleus (Dias et al., 2010). It is, however, likely that these RNAs are never released from RNA polymerase due to the lack of cleavage, complicating the interpretation of this observation. In another set of experiments, it was found that microinjected RNAs that lack a poly(A)-tail, but are modified at their 3′ end to protect the RNAs from degradation, are retained in the nucleus (Akef et al., 2013). Again, it is possible that the modification itself, a dialdehyde formed by the oxidation of the free 3′ end ribose by periodate, may trigger nuclear retention. On the flip side, a GFP reporter that lacks a tail and contains a 3′ terminal triple helix structure derived from the MALAT1 lncRNA, which stabilizes unpolyadenylated transcripts, is efficiently exported (Wilusz et al., 2012). This observation suggests that either the poly(A)-tail is not strictly required for mRNA export or that this triple helix motif promotes nuclear export, although this element is derived from MALAT1, a nuclear lncRNA. Another observation that suggests that the poly(A)-tail is not absolutely required for export is that circular RNAs, which lack a tail, are efficiently exported in a UAP56-dependent manner (Huang et al., 2018). Finally, histone mRNAs, which do not have a poly(A)-tail, are exported by Nxf1 and do not appear to have any export-promoting cis-elements (Erkmann et al., 2005).

In summary, it is likely that RNA processing helps to promote export; however, results from a variety of case studies (cDNA derived reporters, GFP mRNA with a 3′ terminal triple helix, and circular RNAs) suggest that these processes are not absolutely required. Again, as most RNAs exist on a spectrum between being fully nuclear and being fully cytoplasmic, RNA processing events may help to move the RNA closer to the cytoplasmic end of this continuum.

### THE ROLE OF RNA NUCLEOTIDE MODIFICATIONS IN NUCLEAR RETENTION AND mRNA EXPORT

It has been known for quite some time that RNA is extensively modified; however, until recently the majority of these studies focused on these modifications within tRNA and rRNA. More recently it has been observed that mRNA and lncRNAs are also modified. Furthermore, some of these modifications appear to impact nuclear export.

#### Adenosine to Inosine Editing

Adenosine to inosine editing was the first RNA modification known to affect nuclear export. Specifically, it was observed that double stranded RNA (dsRNA) was a substrate for the RNA specific adenosine deaminase (ADAR), which converts adenosine to inosine (Polson et al., 1991). This reaction occurs specifically in the nucleus and promotes the nuclear retention of these RNAs (Zhang and Carmichael, 2001). Thus RNAs that are prone to forming long dsRNA, including mRNAs with inverted Alu repeats and viruses (Kumar and Carmichael, 1997; Athanasiadis et al., 2004; Blow et al., 2004; Kim et al., 2004; Levanon et al., 2004), are modified and retained in paraspeckles (Chen et al., 2008). In certain cases nuclear retention of inosine-containing mRNAs can also be used to regulate gene expression (Prasanth et al., 2005). Interestingly, this nuclear retention pathway appears to be less active in human embryonic stem cells due to the fact that they do not express the lncRNA NEAT1, which is required for paraspeckle formation (Chen and Carmichael, 2009).

#### Other RNA Modifications

In the last 6 years, it has become clear that other modifications, which were known to occur in tRNA and rRNA, play significant roles in mRNA biology. This includes N<sup>6</sup>-methyladenosine (Dominissini et al., 2012; Meyer et al., 2012), which accumulates near the stop codon, 5-methylcytosine (Squires et al., 2012) and N<sup>1</sup>-methyladenosine (Dominissini et al., 2016; Li et al., 2016a), which both accumulate near the start codon, and pseudouridine (Carlile et al., 2014; Schwartz et al., 2014; Li et al., 2015), which accumulates in the ORF and 3′ UTR.

Recently, it has been reported that N<sup>6</sup>-methyladenosine promotes the nuclear export of mRNAs (Roundtree et al., 2017) through the action of the YTHDC1 protein, which directly binds to the modified base and helps to recruit nuclear export factors to the mRNA. This makes sense as depletion of the N<sup>6</sup>-methyladenosine demethylase, ALKBH5, enhances overall mRNA export (Zheng et al., 2013). Similarly, 5-methylcytosine has also been reported to promote mRNA nuclear export by recruiting Aly to the transcript (Yang et al., 2017). Although N<sup>1</sup>-methyladenosine has not been directly linked to export, this modification is enriched in the 5′ terminal exon of a particular class of mRNAs (Cenik et al., 2017). These mRNAs have interesting 5′ terminal exons. Not only are they modified, but they also tend to contain the start codon (in most human genes the start codon is found in internal exons), and are enriched in certain GC-rich motifs that are associated with exon junction complexes (Singh et al., 2012). Typically, exon junction complexes are deposited upstream of all newly formed exon-exon splice sites; however, in a subset of genes the exon junction complex also associates with these GC-rich motifs. Importantly, this complex has also been found to bind to nuclear export factors (Le Hir et al., 2001; Singh et al., 2012), although it is not strictly required for export (Palazzo et al., 2007).

In conclusion, RNA modifications that have been reported to promote export may enhance this process, especially if an RNA has nuclear retention elements; however, it is likely that RNA modifications are not absolutely required to promote export.

### THE ROLE OF cis-ELEMENTS IN NUCLEAR RETENTION AND mRNA EXPORT

#### The 5′ Splice Site Motif

Some of the most studied RNA motifs that affect the distribution and stability of mature mRNA are the 5′ and 3′ splice site motifs, which specify the boundaries of introns. These are typically removed by the act of splicing. Importantly, these motifs are found in fully processed exported RNAs of many viruses, such as HIV. In its normal life cycle, HIV produces both spliced and unspliced RNAs from the same primary transcript, the latter being used to make late-stage proteins and to generate the RNA-based genome that will be incorporated into new viruses that are assembled in the cytoplasm of the host cell. Importantly, these unspliced RNAs are retained in the nucleus in early stages by the presence of intronic sequences (Chang and Sharp, 1989; Lu et al., 1990; Borg et al., 1997; Séguin et al., 1998). These retention signals can be overcome in late stages by the virally encoded Rev protein, which recognizes the Rev response element, an RNA structure that is present in the late stage RNAs and the viral RNA genome (Chang and Sharp, 1989; Emerman et al., 1989; Tan et al., 1996). In the absence of Rev, the nuclear retention of these RNAs was mediated in part by U1 snRNP, the component of the spliceosome that recognizes the 5′ splice site motif (Lu et al., 1990) (**Figure 2**). It should be noted that the Rev response element itself also contributes to the nuclear retention of the late-stage viral mRNAs and of the HIV genomic RNA when Rev protein is not present (Brighty and Rosenberg, 1994; Nasioulas et al., 1994).

In other work, it was also demonstrated that when the 5′ splice site motif was present in the terminal exon of an mRNA, it inhibited expression of the encoded protein. This was due in part to the fact that this element suppresses 3′ polyadenylation, which in turn targets the mRNA for degradation (Gunderson et al., 1998) (**Figure 2**). This configuration is not only seen in certain viral mRNAs, but also in human mRNAs. For example, a mutation in the LAMTOR2 gene, which is associated with congenital neutropenia, creates a novel 5′ splice site in the 3′ UTR that results in the inhibition of gene expression (Langemeier et al., 2012). Importantly, this inhibition is likely due to the recruitment of U1 snRNP to the mature mRNA (Langemeier et al., 2012), through the direct hybridization of the U1 snRNA with the 5′ splice site. Indeed, when the sequence of the U1 snRNA is altered so that it now base pairs to some other mRNA, these newly targeted transcripts become silenced (Fortes et al., 2003; Abad et al., 2008; Goraczniak et al., 2009; Blázquez and Fortes, 2013). A protein component of the U1 snRNP, U1-70K, is required for this inhibition by directly interacting with and inhibiting poly(A)-polymerase (Gunderson et al., 1998).

As we stated in the introduction, disentangling the effects of mRNA stability and nuclear retention on the final nuclear/cytoplasmic distribution of an mRNA can be challenging. This is certainly the case with the 5′ splice site motif, which appears to promote both RNA degradation and RNA nuclear retention. To tease these two forces apart, we monitored the level and distribution of newly synthesized reporter mRNAs that contained or lacked a 5′ splice site motif in their 3′ UTR. This was accomplished by microinjecting DNAs that were transcribed into each mRNA species, then allowing transcription to proceed for a short amount of time (15–20 min) before halting transcription with α-amanitin, and then monitoring the newly transcribed RNA by fluorescence in situ hybridization at various timepoints after injection. Using this approach we found that about half of newly synthesized reporter mRNAs that contained a 5′ splice site motif are rapidly degraded, with the remaining fraction being retained in the nucleus as polyadenylated RNAs (Lee et al., 2015) (**Figure 2**). Interestingly, the nuclear retained RNAs accumulate in nuclear speckles, subnuclear regions where post-transcriptional splicing is thought to occur (Dias et al., 2010; Vargas et al., 2011). Indeed, unspliced mRNAs, which are generated by the inhibition of the U2 or U4 snRNPs, also accumulate in nuclear speckles (Kaida et al., 2007; Hett and West, 2014). These unspliced RNAs and 5′ splice site bearing RNAs are likely targeted to nuclear speckles by U1. Then, the subsequent failure to complete the splicing reaction prevents these RNAs from exiting the nuclear speckles. In agreement with these results, the artificial tethering of U1-70K to a reporter RNA prevents its nuclear export, although the authors did not test for nuclear speckle targeting (Takemura et al., 2011). Surprisingly, 5′ splice site motif-containing mRNAs are still able to recruit UAP56 and Nxf1/TAP (Lee et al., 2015), suggesting that if they could reach the pore, these mRNAs could cross it; however, access to the pore may be prevented by their sequestration into speckles (**Figure 2**). This may explain why many poorly exported mRNAs are also localized to nuclear speckles (Bahar Halpern et al., 2015).

The presence of 5′ splice site motifs may also be critical for the nuclear retention of many lncRNAs and may help to distinguish them from mRNAs. According to annotated databases of human genes, fully mature lncRNAs, unlike mature mRNAs, are not depleted of 5′ splice site motifs in their terminal exons (Lee et al., 2015). Even when comparing intronless RNAs, lncRNAs have higher levels of 5′ splice site motifs than mRNAs (Lee et al., 2015). These numbers may be an underestimate as lncRNAs are not as efficiently spliced as mRNAs (Tilgner et al., 2012; Melé et al., 2017; Mukherjee et al., 2017; Deveson et al., 2018), with many isoforms containing retained introns due to the inefficient recruitment of spliceosomal factors to 3′ splice sites (Melé et al., 2017). Interestingly, the corresponding 5′ splice sites of these inefficiently spliced introns still recruit U1 (Melé et al., 2017), and thus likely promote nuclear retention. Indeed, lncRNA splicing appears to be sloppier than mRNA splicing, with each lncRNA gene producing a multitude of different isoforms with altered splice junctions (Deveson et al., 2018), and this may also cause the inclusion of 5′ splice site motifs into the mature RNA.

The 5′ splice site motif also inhibits 3′ cleavage. When cells were depleted of U1 snRNPs, prematurely truncated mRNAs started to appear (Kaida et al., 2010; Berg et al., 2012; Almada et al., 2013). This was due to a decrease of splicing which led to the appearance of retained introns, which in turn contained cryptic 3′ cleavage/polyadenylation sites that were inappropriately used by the 3′ cleavage machinery. Importantly, these truncated RNAs contain an intact 5′ splice site upstream of the new 3′ end. It was inferred that under normal circumstances the binding of U1 inhibits 3′ cleavage from any sites in the downstream intron (**Figure 3**). This finding is in agreement with studies of Bovine Papilloma Virus and HIV RNAs where the recruitment of U1 to a 5′ splice site inhibited proximal 3′ cleavage events (Furth et al., 1994; Ashe et al., 1995, 1997; Vagner et al., 2000). Similar results were seen with the mutant form of the LAMTOR2 mRNA (Langemeier et al., 2012). It should be noted that in normal situations, suppression of 3′ cleavage by U1 snRNP helps to perform two tasks: first, it represses the misprocessing of mRNAs by preventing the activity of cryptic 3′ cleavage/polyadenylation sites that are found in introns; second, it enforces promoter directionality. In particular, it was found that in bidirectional promoters which generate an unstable short cryptic transcript in one direction and a stable protein-coding mRNA in the other direction, 3′ cleavage/polyadenylation consensus sites were enriched in the former, and 5′ splice site motifs were enriched in the latter (Almada et al., 2013) (**Figure 3**). Under normal conditions the transcriptional elongation of these cryptic transcripts is curtailed by the presence of these 3′ cleavage/polyadenylation sites. Early 3′ cleavage promotes RNA degradation, although the exact mechanism in vertebrates remains unclear (Proudfoot, 2016). In contrast, the 5′ splice site motif present on the opposite transcriptional unit prevents the utilization of any downstream cryptic 3′ cleavage/polyadenylation site and thus promotes the transcriptional extension and stability of functional RNAs. This arrangement of 5′ splice site motifs and 3′ cleavage/polyadenylation sites is sometimes referred to as the U1-PAS axis (PAS stands for polyadenylation sites).

One important complex which may promote nuclear retention and degradation of RNAs that contain 5′ splice site motifs is the PAXT complex, which consists of the RNA helicase, Mtr4, the zinc finger-containing protein, ZFC3H1, and the nuclear poly(A) binding protein, PABPN1 (Meola et al., 2016) (**Figure 2**). Depletion of Mtr4 or ZFC3H1 resulted in the cytoplasmic accumulation of truncated mRNAs that utilized cryptic 3′ cleavage sites from intronic regions (Ogami et al., 2017). Mtr4 may promote nuclear retention of these transcripts by competing with the RNA export adaptor Aly for binding of the 5′ cap (Fan et al., 2017). Furthermore, Mtr4 is also a coactivator of the nuclear exosome (Schilders et al., 2007), the major RNA degradation machinery in the nucleus, suggesting that PAXT may also target these RNAs for degradation. It is currently unclear how the PAXT complex would recognize its substrates, although one possibility is that it interacts with U1 that is bound to misprocessed mRNAs.

**Figure 3** (caption, continued): … stable transcript, 5′SS motifs are enriched in the sense direction (stable RNA) while 3′ cleavage sites are enriched in the anti-sense direction (unstable RNA). Under normal circumstances these cryptic unstable RNAs are cleaved and degraded. In the sense direction, the 5′ splice site motif suppresses the use of downstream cryptic 3′ cleavage sites, allowing RNA PolII to synthesize the RNA transcript without the recruitment of the 3′ end processing machinery. These cryptic 3′ cleavage sites are typically present in introns and are removed during splicing.

In budding yeast, mRNAs with unspliced introns are also nuclear retained and degraded, and this likely requires an intact 5′ splice site (Legrain and Rosbash, 1989). This retention activity requires the Mlp1/2 proteins (Galy et al., 2004; Vinciguerra et al., 2005), which form the nuclear basket, a structure that sits on the nucleoplasmic face of the nuclear pore complex. In vertebrates, the nuclear basket protein TPR, which shares some homology with Mlp1/2, is also required for the nuclear retention of intron-bearing mRNAs (Coyle et al., 2011; Rajanala and Nandicoori, 2012). Interestingly, TPR is also required for mRNA export (Shibata et al., 2002; Umlauf et al., 2013; Wickramasinghe et al., 2014), likely by associating with the TREX2 factor GANP (**Figure 1**), which is essential for the nuclear export of most mRNAs (Wickramasinghe et al., 2010; Zhang et al., 2014b).

Finally, it should be pointed out that the nuclear retention of mRNAs harboring retained introns may also be used to regulate gene expression. It has been found that certain regulated mRNAs contain "detained" introns that are poorly spliced, leading to the retention of the transcripts in the nucleus (Boutz et al., 2015; Mauger et al., 2016; Naro et al., 2017). These are typically the last introns in the pre-mRNA, and it is likely that the primary signal for nuclear retention is the presence of a 5′ splice site motif in these terminal exons. These retained mRNAs are stable and not subject to degradation. However, in response to some signal, these introns are post-transcriptionally spliced, releasing the mRNAs from the nucleus and triggering protein production.

#### Other Intron-Associated Motifs

Besides the 5′ splice site motif, it has been reported that other sequences that are normally associated with introns also potentiate nuclear retention. Typically, the 3′ end of an intron is defined by a polypyrimidine tract which can be recognized by the polypyrimidine tract binding protein (PTB). Association of PTB with mature RNAs is known to inhibit splicing and nuclear export (Yap et al., 2012; Roy et al., 2013). In addition, the 3′ end of the intron also recruits the splicing factor U2AF65, whose association with a mature RNA also promotes nuclear retention (Takemura et al., 2011). Finally, it appears that the presence of an intact branch-point sequence in the mature mRNA also promotes nuclear retention in budding yeast (Legrain and Rosbash, 1989; Rain and Legrain, 1997). Thus, it is likely that several different intron-associated elements may help to promote the nuclear retention and decay of RNA.

#### Transposable Element Associated Motifs

The majority of the human genome is composed of dead transposable elements, constituting half to two-thirds of all DNA (Gregory, 2005; de Koning et al., 2011). Although they are numerous, they are rarely found in mature mRNAs and found at moderate levels in lncRNAs (Kelley and Rinn, 2012). When they are present, they usually inhibit nuclear export and promote RNA decay. As described above, if a pair of transposable elements are found in the sense and anti-sense orientation in a single transcript, they can hybridize to form double stranded RNAs. These regions either become substrates for the ADAR enzyme and thus acquire inosine modifications (Chen et al., 2008; Chen and Carmichael, 2008), or are recognized by the RNA binding protein Staufen, which targets these RNAs for decay (Gong and Maquat, 2011; Elbarbary et al., 2013; Park and Maquat, 2013; Lucas et al., 2018). In addition, double stranded RNAs activate the kinase PKR, which then phosphorylates the translation initiation factor eIF2α and thus shuts down global translation (Clemens and Elia, 1997). Typically, PKR is activated by double stranded viruses; however, it is also known to regulate the processing of certain host mRNAs (Ilan et al., 2017). It remains unclear if PKR activity impacts nuclear export.

It is likely that other features associated with transposable elements are recognized by the nuclear retention machinery. It was recently found that the reverse complement of the Alu SINE, a primate-specific transposable element, contains a 42 nucleotide long element, named SIRLOIN, that mediates nuclear retention by recruiting the RNA binding protein hnRNP K (Lubelsky and Ulitsky, 2018). A similar C-rich motif that contributed to nuclear retention was found in a large analysis of human lncRNAs (Shukla et al., 2018). Since Alu elements are not found outside of primates, lncRNAs must use other elements, especially in non-primates. In addition, it appears that many transposable elements are recognized by particular C2H2 zinc finger proteins (Emerson and Thomas, 2009; Rowe and Trono, 2011; Schmitges et al., 2016), many of which can bind not only DNA but also RNA (Burdach et al., 2012). It has been speculated that when a new transposable element invades a genome, it catalyzes the evolution of novel zinc finger proteins that protect the host. These zinc finger proteins likely repress transposable element activity primarily through transcriptional silencing, although it is also possible that they help target RNAs for decay or nuclear retention.

### Other cis-Elements That Promote Nuclear Retention

A few other cis-elements that promote nuclear retention have been characterized in the literature. As mentioned above, the Rev responsive element promotes nuclear retention (Brighty and Rosenberg, 1994; Nasioulas et al., 1994). Another example is the AGCCC motif, which promotes the nuclear retention of the BORG lncRNA (Zhang et al., 2014a). The authors of this study show that the presence of this motif correlated with the nuclear/cytoplasmic distribution of a few lncRNAs and mRNAs. Such a sequence would be predicted to be depleted from mRNAs in general; however, in a large genome-wide analysis, we have failed to detect such a depletion (A. F. Palazzo, unpublished observations). This is unlike the 5′ splice site motif, which is depleted from intronless mRNAs and the 3′ terminal exons of human mRNAs (Lee et al., 2015).

In many cases, recruitment of certain proteins to the RNA has been linked to nuclear retention, however, it remains unclear whether the simple presence of their RNA-binding motifs promotes retention more broadly throughout the transcriptome. This is true of the Firre lncRNA, whose nuclear retention requires the recruitment of hnRNP U protein (Hacisuleyman et al., 2014). Similarly, it has been reported that the recruitment of hnRNP A2 inhibits nuclear export (Lévesque et al., 2006). Again, a more global analysis of how these factors affect the nuclear/cytoplasmic distribution of all RNAs would be useful in determining whether other nuclear retention elements exist.

Other global analyses have tried to identify nuclear retention/export motifs by sequencing RNA derived from nuclear and cytoplasmic compartments (Bahar Halpern et al., 2015; Bouvrette et al., 2018). Interestingly, both studies found a reasonable number of mRNAs that were poorly exported. Although the distribution of mRNAs with either the nuclear or cytosolic compartment correlated with the association of certain RNA binding proteins, no obvious patterns were discovered. This is in contrast to lncRNAs where the presence of motifs that are either associated with transposable elements (Lubelsky and Ulitsky, 2018; Shukla et al., 2018) or unused splicing signals (Lee et al., 2015; Melé et al., 2017) likely promote widespread nuclear retention. Why would lncRNAs and mRNAs have different mechanisms for their nuclear distribution? One difference may be that nuclear lncRNAs are actively retained while nuclear mRNAs are simply exported to the cytoplasm at a very low rate. This would allow these particular mRNAs to accumulate in the nucleus at high levels. It has been hypothesized that since these large pools of nuclear mRNAs would slowly exit the nucleus, they would supply the cytoplasm with a steady level of mRNA over long periods of time and this could help to buffer the protein translation machinery in the cytoplasm from any wide fluctuations in mRNA production in the nucleus (Bahar Halpern et al., 2015). This may be especially important for genes that experience transcriptional bursts, the sporadic production of many mRNAs in a short interval, followed by periods of inactivity (Larson, 2011). Without this buffering, mRNA levels in the cytoplasm would stochastically increase and decrease over short intervals of time, especially if the mRNA has a short half-life.
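The buffering argument can be illustrated with a toy calculation. The sketch below is not the model used by Bahar Halpern et al. (2015); it is a minimal deterministic two-pool model (nuclear and cytoplasmic mRNA) written for illustration only, and the function name `cytoplasmic_cv` and every rate constant are hypothetical. Transcription occurs in periodic bursts, nuclear mRNA is exported at a first-order rate, and cytoplasmic mRNA decays at a first-order rate. The coefficient of variation (CV) of the cytoplasmic pool is then compared for fast and slow export.

```python
def cytoplasmic_cv(export_rate, decay_rate=0.05, burst_rate=10.0,
                   burst_length=10.0, period=200.0,
                   total_time=20000.0, dt=0.1):
    """Coefficient of variation of cytoplasmic mRNA in a toy two-pool model.

    All rates are hypothetical and given per minute; integration uses
    a simple forward Euler scheme.
    """
    nuclear = 0.0
    cytoplasmic = 0.0
    samples = []
    steps = round(total_time / dt)
    for step in range(steps):
        t = step * dt
        # Transcription is on only during the first part of each period.
        production = burst_rate if (t % period) < burst_length else 0.0
        exported = export_rate * nuclear
        nuclear += (production - exported) * dt
        cytoplasmic += (exported - decay_rate * cytoplasmic) * dt
        # Discard the first half of the run as a transient.
        if t >= total_time / 2:
            samples.append(cytoplasmic)
    mean = sum(samples) / len(samples)
    variance = sum((x - mean) ** 2 for x in samples) / len(samples)
    return variance ** 0.5 / mean


fast = cytoplasmic_cv(export_rate=1.0)     # export much faster than decay
slow = cytoplasmic_cv(export_rate=0.005)   # slow export, large nuclear pool
print(f"CV with fast export: {fast:.2f}; with slow export: {slow:.2f}")
```

In this linear model the mean cytoplasmic level is the same in both cases, because it depends only on the average production rate and the decay rate; slow export only reduces the fluctuations around that mean, which is the essence of the proposed buffering.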

A few studies have uncovered large RNA elements that have nuclear retention activity but remain ill-defined. Two of the best examples are the intronless β-globin mRNA and the MALAT1 lncRNA. In the case of β-globin, the nuclear retention activity maps to the last 210 nucleotides of the open reading frame (Akef et al., 2015). This nuclear retention activity can be overcome by extending the length of the transcript (Akef et al., 2015), by including an intron to promote splicing (Valencia et al., 2008; Akef et al., 2015), or by inserting certain export-promoting viral RNA elements (Guang and Mertz, 2005; Chi et al., 2014). Deleting the first or the second half of this 210 nucleotide region does not disrupt nuclear retention, suggesting that there may be multiple sequences that account for this activity. Despite this, the two halves do not share any obvious motif or structure. In the case of MALAT1, its two nuclear speckle targeting regions (termed regions "E" and "M") are also ill-defined (Miyagawa et al., 2012). Region E is about 1 kb in length, and elimination of its first or last half disrupts its activity. The activity of region M maps to 600 nucleotides, but it is disrupted if the region is truncated any further. It is likely that the XIST, NEAT1 and TUG1 lncRNAs also have large nuclear retention elements (Lubelsky and Ulitsky, 2018; Shukla et al., 2018). Ultimately, it remains possible that these pieces of RNA contain one or more discrete motifs or structures that have weak nuclear retention activity (Shukla et al., 2018), and further in-depth studies will be needed to better define these elements.

## Cis-Elements That Promote Nuclear Export

Many viral elements are known to promote nuclear export; however, a number of these act to overcome nuclear retention elements such as the presence of unspliced introns. Besides the Rev responsive element (described above), the best studied is the constitutive transport element (CTE) of type D retroviruses (Bray et al., 1994). This large structured RNA directly recruits Nxf1/TAP to the transcript (Grüter et al., 1998). Interestingly, the Nxf1 mRNA contains a CTE-like element that can also recruit Nxf1/TAP (Li et al., 2006; Wang et al., 2015). These elements appear to modulate the export of Nxf1 mRNA isoforms that contain a retained intron (and hence a 5′ splice site). The mRNA is then translated into a short isoform of Nxf1 that may play a role in mRNA trafficking (Li et al., 2016b).

Some mRNAs have been described to have cis-elements that promote nuclear export. mRNAs that encode proteins required for the cell cycle contain an export-promoting element in their 3′UTR, which consists of a stem loop structure that recruits the eIF4E protein (Culjkovic et al., 2005, 2006). Intriguingly, the export of these transcripts requires UAP56, but not Nxf1/TAP (Topisirovic et al., 2009). Instead they use the CRM1 nuclear transport receptor, which promotes the export of proteins. The recruitment of HuR to mRNAs and lncRNAs has also been reported to promote their nuclear export (Fan and Steitz, 1998; Noh et al., 2016). Finally, it has been reported that naturally intronless transcripts contain specialized cytoplasmic accumulation region elements (CAR-E), which recruit specific complexes to the RNA (Lei et al., 2011, 2013). Some of the interpretations of these experiments are complicated by the fact that CAR-Es were fused to reporters harboring nuclear retention elements whose activity can be overcome by simply extending the length of the transcript (see Discussion in Akef et al., 2015). Notably, the export of these mRNAs requires the TREX component UAP56, which appears to be recruited to reporter RNAs that do not contain any known nuclear export elements (Taniguchi and Ohno, 2008; Akef et al., 2015; Lee et al., 2015). Thus, the functional relevance of these purported export-promoting elements seems unclear at this time. It is likely that bona fide export-promoting elements, such as the CTE, function by overcoming the activity of nuclear retention elements, such as the ones present in mRNAs with retained introns.

#### NUCLEAR RETENTION AND EXPORT OF RNAS, A FORCE FOR CONSTRUCTIVE NEUTRAL EVOLUTION?

#### The Conversion of Junk RNA to lncRNA

The nuclear/cytoplasmic distribution of RNAs plays an important role in shaping the genomic content of eukaryotes by evolution. In particular, the nuclear retention and degradation of spurious transcripts eliminates much of the harm caused by junk RNA and hence reduces the deleteriousness of cryptic TSSs and intergenic DNA regions that harbor such sites (Palazzo and Akef, 2012; Palazzo and Gregory, 2014; Palazzo and Lee, 2015). As a result, junk DNA and its associated junk RNA are not effectively eliminated by natural selection. It is likely that these non-functional transcripts act as the raw substrates for natural selection and some are converted into novel functional lncRNAs. Thus, in a sense, junk RNA and functional lncRNAs come as a package. The idea that neutral mutations (i.e., intergenic insertions, and the serendipitous creation of cryptic TSSs) create novel entities (i.e., junk RNA) that are subsequently shaped by natural selection to create novel genes (i.e., lncRNAs) is an example of a general process called constructive neutral evolution (Stoltzfus, 1999, 2012; Gray et al., 2010; Lukeš et al., 2011). A key component in this process is the role of the nuclear/cytoplasmic division (and its associated quality control mechanisms) in reducing the deleteriousness of spurious transcription.

So how exactly would junk RNA be converted to lncRNA? Likely, this is a step by step process where new entities are created by non-adaptive processes and then acquire functions which can be selected for by natural selection. One example is presented in **Figure 4**. First, random mutations create and destroy cryptic TSSs. These sites are engaged by RNA polymerase II which not only generates unstable ncRNAs, but also recruits histone modification enzymes that alter chromatin packaging downstream from the TSS (van Werven et al., 2012; Castelnuovo et al., 2013; Kim et al., 2016; Woo et al., 2017). If the resulting altered histone modifications impart some benefit by regulating nearby genes in a way that improves the fitness of the organism, then the transcriptional event and its cryptic TSS will be selectively retained. Eventually the ncRNA generated from these loci, which is initially a by-product, may act as a platform to help assemble chromatin remodeling complexes in the vicinity of their target genes. In this way, the ncRNA acquires a novel function over time and is thus converted into a lncRNA (**Figure 4**).

This conversion process may frequently occur in tissues that have a high amount of spurious transcription, such as in developing spermatids (Kaessmann, 2010; Jandura and Krause, 2017). During sperm development, DNA is unpackaged from histones and then repackaged into protamines. This transiently exposed DNA can act as a non-specific substrate for RNA polymerases, causing high levels of spurious transcription. Once an ncRNA acquires some associated function in the testes, it can subsequently be expressed in other tissues. This is known as the "out of the testes" hypothesis (Kaessmann, 2010; Jandura and Krause, 2017).

#### The Conversion of Misprocessing to Alternative Processing

The nuclear/cytoplasmic distribution and degradation of RNA also facilitates the evolution of alternative splicing. In particular, because misprocessed mRNAs are retained and degraded, they are not efficiently translated into proteins and do not cause much harm to the organism. This reduces the deleteriousness of splicing and polyadenylation errors and prevents their elimination by natural selection. This may explain why splicing appears to be inherently sloppy in mammalian cells. In support of this idea, it has been widely noted that although most genes are alternatively spliced, they typically give rise to only one polypeptide (Tress et al., 2017), suggesting that many spliced isoforms are not translated due to their degradation and/or nuclear retention. As such, nuclear/cytoplasmic distribution and degradation of RNA prevents the elimination of cryptic splice site motifs and any other splicing-regulating elements that may appear by random mutation in the genome. These elements then act as the raw substrates necessary for the evolution of functional alternative splicing events. This is another example of constructive neutral evolution in action. In this case the newly created entities are splice sites and/or elements that regulate splicing, which are rendered effectively neutral by the RNA nuclear retention and degradation machinery, and these provide the raw substrates for the evolution of alternatively spliced isoforms. A similar process can be invoked for the evolution of 3′ cleavage/polyadenylation sites.

#### CONCLUSION

Results from ENCODE point to a wide diversity of nuclear/cytoplasmic distribution for many different types of RNA molecules (Djebali et al., 2012; Palazzo and Gregory, 2014). Over the past few years, we have gained a fuller picture of the rules that dictate RNA distribution to these two compartments. We have established that any stable long RNA is a substrate for nuclear export unless it contains a nuclear retention element. Undoubtedly, splicing and other RNA processing events further enhance nuclear export. In addition, RNA modifications also play an important role in this process. Although the major components that drive export are well known, we still must identify nuclear retention complexes and determine their mode of action to obtain a full picture of how the nuclear and cytoplasmic transcriptomes are achieved.

#### AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

#### FUNDING

This work was supported by a grant from the Canadian Institutes of Health Research to AP (FRN 102725).

#### ACKNOWLEDGMENTS

We would like to thank J. Wan and C. Smibert for comments on the manuscript.






**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Palazzo and Lee. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Global Transcriptome Analysis During Adipogenic Differentiation and Involvement of Transthyretin Gene in Adipogenesis in Cattle

Hanfang Cai<sup>1†</sup>, Mingxun Li<sup>1,2†</sup>, Xiaomei Sun<sup>1,2</sup>, Martin Plath<sup>1</sup>, Congjun Li<sup>3</sup>, Xianyong Lan<sup>1</sup>, Chuzhao Lei<sup>1</sup>, Yongzhen Huang<sup>1</sup>, Yueyu Bai<sup>4</sup>, Xinglei Qi<sup>5</sup>, Fengpeng Lin<sup>5</sup> and Hong Chen<sup>1\*</sup>

<sup>1</sup> College of Animal Science and Technology, Northwest A&F University, Yangling, China, <sup>2</sup> College of Animal Science and Technology, Yangzhou University, Yangzhou, China, <sup>3</sup> Animal Genomics and Improvement Laboratory, United States Department of Agriculture-Agricultural Research Service, Beltsville, MD, United States, <sup>4</sup> Animal Health Supervision in Henan Province, Zhengzhou, China, <sup>5</sup> Biyang Bureau of Animal Husbandry of Biyang County, Biyang, China

#### Edited by:

Pascal Chartrand, Université de Montréal, Canada

#### Reviewed by:

Sergio Verjovski-Almeida, Universidade de São Paulo, Brazil Xiaofei Cong, Eastern Virginia Medical School, United States Yongsheng Yu, Jilin Academy of Agricultural Sciences (CAAS), China

#### \*Correspondence:

Hong Chen chenhong1212@263.net; chfsci2018@gmail.com

†These authors have contributed equally to this work

#### Specialty section:

This article was submitted to RNA, a section of the journal Frontiers in Genetics

Received: 15 July 2018 Accepted: 21 September 2018 Published: 18 October 2018

#### Citation:

Cai H, Li M, Sun X, Plath M, Li C, Lan X, Lei C, Huang Y, Bai Y, Qi X, Lin F and Chen H (2018) Global Transcriptome Analysis During Adipogenic Differentiation and Involvement of Transthyretin Gene in Adipogenesis in Cattle. Front. Genet. 9:463. doi: 10.3389/fgene.2018.00463

Adipose tissue plays a central role in determining the gustatory quality of beef, but traditional Chinese beef cattle have low levels of fat content. We applied RNA-seq to study the molecular mechanisms underlying adipocyte differentiation in Qinchuan cattle. A total of 18,283 genes were found to be expressed in preadipocytes and mature adipocytes, 470 of which were significantly differentially expressed genes (DEGs) [false discovery rate (FDR) values < 0.05 and fold change ≥ 2]. In addition, 4534 and 5153 alternative splicing (AS) events were detected in preadipocytes and adipocytes, respectively. We constructed a protein interaction network, which suggested that collagen plays an important role during bovine adipogenic differentiation. We characterized the function of the most down-regulated DEG (P < 0.001) among the genes we detected by qPCR, namely, the transthyretin (TTR) gene. Overexpression of TTR appears to promote the expression of the peroxisome proliferator activated receptor γ (PPARγ) (P < 0.05) and fatty acid binding protein 4 (FABP4) (P < 0.05). Hence, TTR appears to be involved in the regulation of bovine adipogenic differentiation. Our study represents a comprehensive approach to explore bovine adipocyte differentiation using transcriptomic data and reports an involvement of TTR during bovine adipogenic differentiation. Our results provide novel insights into the molecular mechanisms underlying bovine adipogenic differentiation.

Keywords: adipocyte differentiation, alternative splicing, cattle, differentially expressed genes, transthyretin, transcriptome analysis

### INTRODUCTION

In livestock (cattle, sheep, pigs, and others), there are four major adipose depots – visceral, subcutaneous, intermuscular, and intramuscular fat tissues – which develop by a process called adipogenesis (Fernyhough et al., 2005; Hausman et al., 2009). Their occurrence during ontogeny follows the sequence of visceral tissue first, followed by subcutaneous, intermuscular, and eventually intramuscular fat tissues (Hausman et al., 2009). In cattle, adipocytes are formed in visceral and subcutaneous tissues at the start of the second trimester during gestation (Fève, 2005; Gnanalingham et al., 2005; Muhlhausler et al., 2007). After 180 days of gestation, adipocytes are barely detected in the intermuscular fat (Du and Dodson, 2011). From an agricultural perspective, the amount and distribution of fat in beef cattle and other farm animals determines overall carcass and meat quality (Powell and Huffman, 1973; Wheeler et al., 1994; Lozeman et al., 2001). Research on bovine fat tissue formation, therefore, not only provides general insights into the regulatory processes underlying mammalian adipogenesis, but also provides invaluable information for breeding programs aimed at improving beef quality.

Until now, studies using preadipocyte cell lines obtained from humans (Green and Kehinde, 1975) and mice (Green and Kehinde, 1976; Negrel et al., 1978) have identified several factors that play a role during adipogenesis, such as PPARγ, the CCAAT/enhancer-binding protein (CEBP) family, growth factors, and other cell factors (McLaughlin et al., 2007). In contrast to the wealth of knowledge obtained from studies on murine and human cell lines, the regulatory mechanisms of bovine adipogenesis have received comparatively little attention. Our study on bovine adipogenesis was motivated by the observation that Chinese beef cattle have a low intramuscular fat content. Insights into the molecular mechanisms involved in the regulation of adipogenesis may help assisted breeding programs that characterize potential breeding stock in order to improve overall meat quality.

In recent years, high-throughput sequencing of coding and non-coding RNAs (RNA-seq) has been increasingly applied to unravel the complex molecular mechanisms underlying various biological processes (Mortazavi et al., 2008). RNA-seq allows changes in gene expression (i.e., mRNA abundance) to be linked to the physiological state of the tissue under examination, provides comprehensive data from which global gene networks can be constructed, and identifies novel transcriptional units altered during developmental processes or diseases (Grant et al., 2011). Several studies have reported on transcriptional characteristics related to adipose tissue development using oligo (dT) primers to sequence mRNA. As a consequence, transcripts without a polyA-tail and partially degraded mRNAs are underrepresented in previous studies on mammalian adipogenesis (Zhou et al., 2014; Huang et al., 2017; Zhang Y.Y. et al., 2017). The Ribo-Zero RNA-seq method provides an alternative that can detect mRNAs with and without a polyA-tail from intact or fragmented RNA samples, which overcomes the shortcomings of traditional RNA-seq.

In this study, we selected the TTR gene, which is significantly differentially expressed between preadipocytes and adipocytes, as a candidate for a preliminary exploration of its role in bovine adipogenic differentiation. TTR is a member of the prealbumins (Ingbar, 1958) and is highly conserved among vertebrate species (Schreiber and Richardson, 1997; Power et al., 2000). It is a well-known carrier protein that helps to transport thyroid hormones in plasma and cerebrospinal fluid. It also acts as a binding partner for retinol-binding protein 4 (Raz and Goodman, 1969; Vieira and Saraiva, 2014). In addition, TTR is involved in some intracellular processes, such as proteolysis (Costa et al., 2008), nerve regeneration (Fleming et al., 2007, 2009), and autophagy (Vieira and Saraiva, 2013). Most research on TTR has focused on the association between its mutations and diseases, such as amyloidosis (Jacobson et al., 1997; Coelho et al., 2013), hereditary (Coelho et al., 2013), and hyperthyroxinemia (Saraiva, 2001). Several reports show that serum TTR is associated with body mass and obesity (Cano et al., 2004; Klöting et al., 2007; Zemany et al., 2014). However, little is known in molecular terms about the role of TTR in adipose development.

The aim of our present study was to compare the transcriptome profiles of preadipocytes and adipocytes using Ribo-Zero RNA-seq to gain insight into the potential molecular mechanisms underlying preadipocyte differentiation in beef cattle. For our study we used Qinchuan cattle, which are well-known beef cattle native to China. The results of our study serve as a basis for further studies on bovine adipogenesis in China.

### MATERIALS AND METHODS

All experiments were approved by the Review Committee for the Use of Animal Subjects of Northwest A&F University. All experiments were performed in accordance with relevant guidelines and regulations.

### Bovine Preadipocyte Isolation, Adipogenic Differentiation, and Treatment

In brief, a tissue separation method was used to isolate the preadipocytes from fat tissue. The inguinal subcutaneous fat was separated from two 1-year-old male Qinchuan cattle immediately after they were slaughtered. These cattle were raised and slaughtered at Qinbao Animal Husbandry Co., Ltd, which is a cattle breeding and slaughtering corporation in Xi'an, Shaanxi province. The adipose tissue was transported to the laboratory in phosphate-buffered saline (PBS) with 300 IU/mL penicillin and 300 µg/mL streptomycin at room temperature. The tissue was successively washed with 70% alcohol for 1–2 min and three times with PBS. The outer layer was removed and the remainder was washed twice with PBS containing 100 IU/mL penicillin and 100 µg/mL streptomycin, and finely chopped into 1-mm<sup>3</sup> pieces. The tissue pieces were evenly placed onto the bottom surface of a culture bottle containing growth medium (GM), which consists of high glucose DMEM with 20% fetal bovine serum (FBS), 100 IU/mL penicillin and 100 µg/mL streptomycin. The samples were then incubated at 37°C in a humidified atmosphere containing 5% CO<sub>2</sub>. The GM was replaced every other day.

After the cells reached 100% confluence, preadipocyte differentiation was induced with adipogenic agents consisting of 10 µg/mL insulin, 0.5 mM 3-isobutyl-1-methylxanthine (IBMX) and 1 µM dexamethasone for 2 days. The cells were then incubated with 10 µg/mL insulin, with the medium changed every second day.

For Oil Red O staining, cells were washed with PBS and then fixed with 4% paraformaldehyde for 1 h at 4°C. After washing twice with PBS, the cells were stained with Oil Red O solution (0.3% Oil Red O, 60% isopropanol, and 40% PBS) for 20 min. In order to evaluate the amount of lipid droplets, isopropyl alcohol was used to elute the lipid droplets, and then the OD values were measured by UV spectrophotometer at 490 nm.

In TNFα treatments, mature adipocytes were treated with different concentrations of TNFα (2.5, 5, 10, and 20 ng/mL) for 12 h after being cultured with serum-free medium. Then cells were collected for RNA extraction and cDNA preparation.

#### RNA Extraction and cDNA Libraries Construction

Samples from two groups of cells (two wells of undifferentiated adipocytes and two wells of adipocytes cultured with adipogenic agents for 13 days) were collected for sequencing. Consequently, four cDNA libraries were constructed (preadipocyte-1, preadipocyte-2, adipocyte-1, and adipocyte-2). Total RNA was extracted from cells using TRIzol reagent (Life Technologies, United States) according to the manufacturer's instructions. Quality was monitored by NanoDrop ND-1000 and Agilent Bioanalyzer 2100 (Agilent Technology, United States). The RNA was purified with the RNeasy Micro kit (QIAGEN, Germany) and RNase-Free DNase Set (QIAGEN, Germany). rRNA was removed using Ribo-Zero rRNA Removal Kits (Epicentre, United States), and the rRNA-depleted RNA was then fragmented as a template for first- and second-strand cDNA synthesis. These short fragments were purified with the Qubit dsDNA HS Assay Kit (Invitrogen, United States) and ligated to different adapters for sequencing.

### High-Throughput Sequencing and Data Analysis

Each of the four libraries was sequenced by Shanghai Biotechnology Corporation (Shanghai, China) using the Illumina HiSeq™ 2500. The sequencing quality was checked using FastQC (Andrews, 2015). Pre-processing and assembly of raw sequence data, including removal of the adapter sequences, low-quality sequences, sequences shorter than 20 nucleotides, and other nucleotides, were performed using the FASTX-Toolkit (version 0.0.13). Clean reads were then mapped to the Bos taurus genome<sup>1</sup> using TopHat, allowing three base mismatches. The Cufflinks program (version 2.1.1) was used to calculate the expression of transcripts. Data were normalized by calculating the FPKM for each gene. The mapping results were compared to the known genes recorded in the database using Cufflinks compare (Henze et al., 2008). Transcripts that did not overlap known genes were regarded as potential novel genes.
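As a reminder of what the FPKM normalization expresses, the following minimal sketch computes the basic definition (fragments per kilobase of transcript per million mapped fragments). It is an illustration only: Cufflinks estimates FPKM with a statistical model over isoforms rather than with this one-line formula, and the function name `fpkm` and the example counts are hypothetical.

```python
def fpkm(fragment_count, gene_length_bp, total_mapped_fragments):
    """Basic FPKM definition: fragments per kilobase per million mapped fragments."""
    if gene_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("gene length and library size must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (fragments per million)
    return fragment_count * 1e9 / (gene_length_bp * total_mapped_fragments)


# Hypothetical gene: 500 fragments on a 2 kb transcript in a library
# of 20 million mapped fragments.
example_fpkm = fpkm(500, 2000, 20_000_000)
print(example_fpkm)
```

Dividing by both transcript length and library size is what makes values comparable between genes of different lengths and between libraries of different sequencing depths.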

### Identification of AS

ASTALAVISTA (version 3.2), a server that extracts and displays AS events from a given genomic annotation of exon-intron gene coordinates, was applied to detect the AS sites and models in the final transcriptome assembly, which was obtained using Cufflinks (Foissac and Sammeth, 2007; Sammeth et al., 2008). Mixture-of-isoforms (MISO) analysis was then carried out to quantify the expression level of alternatively spliced isoforms.

### GO Annotation and Pathway Enrichment Analysis of DEGs and Construction of Protein Interaction Network

The fold change and P-value (Fisher's test, corrected by FDR) of each gene between the two groups were calculated. A fold change ≥ 2 and FDR < 0.05 were taken as the thresholds for significant differences in gene expression. GO<sup>2</sup> and KEGG<sup>3</sup> analyses were performed with the DAVID software (Huang et al., 2009a,b); these are major bioinformatics resources that unify the representation of gene and gene product attributes across all species (Ashburner et al., 2000). A corrected P-value ≤ 0.05 was taken as the significance threshold. The DEGs that were enriched in the top three pathways were clustered in STRING 10.0 (Szklarczyk et al., 2015) and used to construct a protein interaction network with Cytoscape 3.4 (Shannon et al., 2003).
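The DEG thresholds can be sketched as a simple filter. The text does not state which FDR procedure was used, so the Benjamini-Hochberg adjustment below is an assumption, as are the small pseudocount added to FPKM values, the treatment of the fold-change cut-off as symmetric (up- or down-regulated by at least two-fold), and all gene names and numbers, which are hypothetical.

```python
import math


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def call_degs(genes, min_fold=2.0, max_fdr=0.05, pseudocount=0.01):
    """genes: list of (name, fpkm_preadipocyte, fpkm_adipocyte, p_value)."""
    fdr = benjamini_hochberg([g[3] for g in genes])
    degs = []
    for (name, before, after, _), q in zip(genes, fdr):
        # The pseudocount (an assumption) avoids division by zero.
        log2_fc = math.log2((after + pseudocount) / (before + pseudocount))
        if abs(log2_fc) >= math.log2(min_fold) and q < max_fdr:
            degs.append((name, log2_fc, q))
    return degs


# Hypothetical input values, for illustration only.
genes = [
    ("geneA", 10.0, 40.0, 0.001),
    ("geneB", 10.0, 12.0, 0.0005),
    ("geneC", 50.0, 10.0, 0.01),
    ("geneD", 5.0, 5.5, 0.8),
]
deg_names = [name for name, _, _ in call_degs(genes)]
print(deg_names)
```

Note that a gene with a very small P-value but a modest expression change (geneB in this hypothetical input) is excluded, because both criteria must be met.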

#### qPCR

To validate the high-throughput sequencing data, two more libraries were constructed in each group in addition to the cDNA libraries used in the RNA-seq. qPCR was performed to confirm the transcriptional levels of DEGs that had been identified as being significantly differentially expressed between undifferentiated and differentiated adipocytes. Total RNA was extracted from undifferentiated and differentiated adipocytes using the Trizol kit (Takara, Japan). cDNA was synthesized as template for qPCR using the PrimeScript RT Reagent Kit (Perfect Real Time) (Takara, Japan). The glyceraldehyde-3-phosphate dehydrogenase (GAPDH) gene was chosen as internal control. The primers used are shown in **Supplementary Table S5**. PCR was carried out in a CFX96™ Real Time detection system with SYBR premix ExTaq II (TaKaRa, Japan) following the manufacturer's instructions. All samples were measured in triplicate and a negative control with water as template was included. The relative expression ratios were calculated with the formula 2<sup>−ΔΔCt</sup> as described by Schmittgen and Livak (Schmittgen and Livak, 2008).
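The 2<sup>−ΔΔCt</sup> calculation can be written out as a short sketch. The function name `relative_expression` and the Ct values are hypothetical; technical triplicates are averaged before the calculation, which is one common convention and an assumption here.

```python
from statistics import mean


def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Relative expression by the 2^(-ddCt) method.

    Each argument is a list of replicate Ct values; the reference gene
    would be GAPDH in the setting described above.
    """
    delta_ct_sample = mean(ct_target_sample) - mean(ct_ref_sample)
    delta_ct_control = mean(ct_target_control) - mean(ct_ref_control)
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2 ** (-delta_delta_ct)


# Hypothetical triplicate Ct values: the target gene needs two more
# cycles (relative to the reference) in the sample than in the control.
ratio = relative_expression(
    ct_target_sample=[24.0, 24.0, 24.0],
    ct_ref_sample=[18.0, 18.0, 18.0],
    ct_target_control=[22.0, 22.0, 22.0],
    ct_ref_control=[18.0, 18.0, 18.0],
)
print(ratio)
```

A ΔΔCt of 2 corresponds to a four-fold lower expression in the sample, assuming a perfect doubling of product per cycle.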

#### Construction and Cell Transfection

The CDS region of the bovine TTR gene (GenBank Accession Number: NM\_173967.3) was cloned from adipocytes using the forward primer 5′-CGGGGTACCATGGCTTCCTTCCGTCTGTTCC-3′ and the reverse primer 5′-GCTCTAGATCACGCCTTGGGACTGCTGA-3′, then recombined into the pcDNA3.1 (+) plasmid vector between the KpnI and XbaI (TaKaRa, Dalian, China) restriction sites to construct the overexpression vector of the bovine TTR gene (OV-TTR). The empty pcDNA3.1 (+) plasmid without any insertion fragment was used as negative control (OV-NC). The plasmids were transfected into cells using Lipofectamine 2000 (Invitrogen, United States) according to the manual. The cells were seeded into 12-well plates in triplicate and transfected with OV-TTR or OV-NC on day 7 after adipogenic induction. On days 9 and 11 post-adipogenic induction, the cells were collected for RNA and protein extraction using RNAplus (Takara, Japan) and radio immunoprecipitation assay (RIPA) buffer with 1 mM phenylmethanesulfonyl fluoride (PMSF) (Solarbio, China), respectively.

<sup>1</sup>http://ftp.ncbi.nlm.nih.gov/genomes/Bos\_taurus/Assembled\_chromosomes/seq

<sup>2</sup>http://www.geneontology.org/

<sup>3</sup>http://www.kegg.jp

#### Western Blot


Proteins were separated on a 12% SDS-PAGE gel. The primary antibody mouse monoclonal anti-PPARγ2 was purchased from Santa Cruz Biotechnology (Santa Cruz, CA, United States); mouse monoclonal anti-FABP4 and mouse monoclonal anti-β-actin were obtained from Sangon (Shanghai, China). The secondary antibody was horseradish peroxidase-conjugated ECL goat anti-mouse IgG. After treatment with ECL Plus (Solarbio, China), the protein bands were visualized with a ChemiDoc XRS+ system (Bio-Rad Laboratories, United States).

#### Statistical Analysis

The significance of differences in expression level was assessed by Student's t-test in SPSS software (Version 20). Results are presented as mean ± SE (standard error), and a P-value < 0.05 was considered statistically significant.
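For readers who wish to reproduce the summary statistics outside SPSS, the following minimal sketch computes the mean ± SE and the two-sample Student's t statistic. It assumes an unpaired test with pooled (equal) variances, which the text does not specify; the function names and data are hypothetical. Converting t to a P-value requires the t distribution and is left to a statistical package.

```python
from math import sqrt
from statistics import mean, stdev


def mean_se(values):
    """Return (mean, standard error of the mean) for a list of measurements."""
    return mean(values), stdev(values) / sqrt(len(values))


def student_t(group_a, group_b):
    """Return (t statistic, degrees of freedom) for an unpaired Student's t-test.

    Assumes equal variances in both groups (pooled variance estimate).
    """
    n_a, n_b = len(group_a), len(group_b)
    var_a, var_b = stdev(group_a) ** 2, stdev(group_b) ** 2
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    t = (mean(group_a) - mean(group_b)) / sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t, df


# Hypothetical relative expression values (not data from this study)
control = [1.0, 2.0, 3.0]
treated = [4.0, 5.0, 6.0]

m, se = mean_se(control)
t_stat, dof = student_t(control, treated)
print(f"control: {m:.2f} ± {se:.2f}; t = {t_stat:.3f}, df = {dof}")
```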

#### RESULTS

#### Bovine Adipocytes Culture

Two days after the tissue pieces were seeded, bovine preadipocytes gradually spread out around them and showed spindle or polygonal morphologies. Cells proliferated rapidly at days 4 and 5. On the 8th day, confluence reached 100% (**Supplementary Figure S1A**). The primary adipocytes were then collected for subculture.

After the preadipocytes had proliferated to 100% confluence, the cells were induced with adipogenic agents. On the 3rd day of differentiation, lipid droplets began to form and gradually increased in number. From around the 6th day, the lipid droplets grew progressively larger. Oil Red O staining confirmed the formation of lipid droplets on the 13th day after induction of differentiation (**Supplementary Figures S1B,C**).

To confirm that the cultured cells were adipose cells, the expression of the preadipocyte marker gene preadipocyte factor 1 (Pref-1) (Wang et al., 2006) was verified by reverse transcription PCR (RT-PCR) (primers in **Supplementary Table S5** and **Supplementary Figure S1D**). In addition, the mRNA expression profiles of four marker genes of mature adipocytes, PPARγ, CEBPα, FABP4, and lipoprotein lipase (LPL) (Ntambi and Young-Cheul, 2000), were evaluated by quantitative real-time PCR (qPCR) (**Supplementary Figure S1F**). As predicted, Pref-1 was mainly expressed in preadipocytes, while the other four genes were more highly expressed after differentiation.

### Deep Sequencing of RNAs

Before Ribo-Zero RNA sequencing, the RNA extracted from two preadipocyte samples and two adipocyte samples was first examined. The OD260/OD280 ratios of all four samples were >1.8, and the RNA amounts were sufficient for sequencing. Accordingly, four cDNA libraries were successfully constructed, and more than 74 M clean reads were obtained from each library. The proportion of bases with a quality score ≥20 among all sequenced bases was >96% in every library, suggesting that our sequencing results were reliable and suitable for in-depth statistical analysis.

On average, clean read ratios of up to 95.75% (preadipocytes) and 97.6% (mature adipocytes) were obtained. The corresponding average mapping ratios were 90 and 89.4%, respectively. Among them, 83.1 and 82.7% of reads, respectively, were uniquely mapped to the reference genome (CNCI Bos\_taurus\_4.6.1). More than 89% of the clean reads were mapped to genic regions of the genome (**Supplementary Figure S2**).

### Gene Expression Patterns

We determined global gene expression profiles in preadipocytes and adipocytes and used FPKM values (fragments per kilobase of exon model per million mapped reads) to compare expression between different genes and between the two cell types. Altogether, we found 18,283 genes to be expressed in preadipocytes and adipocytes. We also detected genes that were expressed at only one developmental stage, 779 of which were unique to preadipocytes and 1,082 to adipocytes (**Figure 1A**). After mapping, 1,331 genes could not be assigned to known genes and were assembled as potential novel genes (**Figure 1C** and **Supplementary Table S1**). Of these, 71.5% had only one exon, and 68.1% were shorter than 2 kb (**Supplementary Figure S3**).
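The FPKM normalization mentioned above can be written out explicitly. The sketch below follows the standard definition (fragments scaled by exon-model length in kilobases and by library size in millions); it is not the authors' pipeline, and the function name and input numbers are hypothetical.

```python
def fpkm(fragments, exon_length_bp, total_mapped_fragments):
    """Fragments per kilobase of exon model per million mapped fragments."""
    if exon_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    # 1e9 = 1e3 (bp -> kb) * 1e6 (fragments -> millions of fragments)
    return fragments * 1e9 / (exon_length_bp * total_mapped_fragments)


# Hypothetical gene: 500 fragments, 2 kb exon model, 25 M mapped fragments
value = fpkm(fragments=500, exon_length_bp=2000, total_mapped_fragments=25_000_000)
print(f"FPKM = {value:.2f}")
```

Because both gene length and sequencing depth are divided out, FPKM values can be compared between genes within a library and, with the usual caveats, between libraries.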

Alternative pre-mRNA processing can produce multiple transcripts with distinct or similar functions from a single genomic locus. These gene isoforms have important regulatory functions in the development of diverse cell types (Black, 2003; Kim et al., 2004). We evaluated the occurrence and abundance of transcript isoforms and detected a total of 19,108 transcript isoforms (18,362 in preadipocytes and 18,350 in adipocytes). Of these, 758 were present only in preadipocytes and 746 only in adipocytes (**Figure 1B**).

### AS Detection

Among all the genes detected, up to 6,411 (approximately 32%) were found to be alternatively spliced. Single-sample analysis revealed that 938 genes were alternatively spliced only in adipocytes, while AS events of 881 genes occurred only in preadipocytes. Notably, chromosome 3 carried the largest number of genes undergoing AS events (1,105).

A previous study revealed that skipped exon (SE), alternative 3′ splice site selection (A3SS), alternative 5′ splice site selection (A5SS), retained intron (RI), and mutually exclusive exons (MEX) are the major AS event types (Li et al., 2013). The corresponding counts were 1,607, 1,309, 987, 584, and 47 in preadipocytes and 1,728, 1,447, 1,240, 687, and 51 in adipocytes, respectively (**Figure 2A**). SE was the most frequent type in both preadipocytes and adipocytes, and AS events occurred more frequently in adipocytes than in preadipocytes. Interestingly, Venn diagrams illustrate a series of genes shared among the five types of AS in both preadipocytes and adipocytes, and their distribution differs between the two cell types. Furthermore, 11 genes exhibited all five types of AS in preadipocytes, whereas no gene did so in adipocytes (**Figure 2B**).
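The event counts reported above can be tallied directly, which makes the two comparisons in the text (SE as the most frequent type, and more events in adipocytes) easy to check. The counts are those given in the paragraph; the variable and function names are illustrative only.

```python
# AS event counts per type, as reported in the text (Figure 2A)
as_events = {
    "preadipocytes": {"SE": 1607, "A3SS": 1309, "A5SS": 987, "RI": 584, "MEX": 47},
    "adipocytes": {"SE": 1728, "A3SS": 1447, "A5SS": 1240, "RI": 687, "MEX": 51},
}


def summarize(counts):
    """Return (total events, most frequent type, share of each type)."""
    total = sum(counts.values())
    most_frequent = max(counts, key=counts.get)
    shares = {event: n / total for event, n in counts.items()}
    return total, most_frequent, shares


for cell_type, counts in as_events.items():
    total, most_frequent, shares = summarize(counts)
    print(f"{cell_type}: {total} events, most frequent = {most_frequent} "
          f"({shares[most_frequent]:.1%})")
```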

### Function Annotation of DEGs and Construction of Protein Interaction Networks

Based on FDR values < 0.05 and fold change ≥ 2, 470 DEGs were identified when comparing the expression profiles of adipocytes with those of preadipocytes, of which 244 were down-regulated and 226 were up-regulated (**Figure 3A** and **Supplementary Table S2**). Hundreds of DEGs showed fold changes ≥ 10, alluding to their potential involvement in bovine preadipocyte differentiation (**Figure 3B**). In addition, the expression levels of 817 AS isoforms differed between adipocytes and preadipocytes.
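The selection thresholds stated above (FDR < 0.05 and at least a twofold change in either direction) can be expressed as a simple filter. This sketch is not the authors' pipeline: it assumes that fold change is available as a log2 ratio of adipocytes to preadipocytes, and the gene labels and values are hypothetical.

```python
from math import log2


def classify_gene(log2_fold_change, fdr, fdr_cutoff=0.05, min_fold_change=2.0):
    """Label a gene as 'up', 'down', or 'not significant'.

    log2_fold_change is log2(adipocyte / preadipocyte expression).
    """
    if fdr >= fdr_cutoff:
        return "not significant"
    threshold = log2(min_fold_change)  # a twofold change equals 1 on the log2 scale
    if log2_fold_change >= threshold:
        return "up"
    if log2_fold_change <= -threshold:
        return "down"
    return "not significant"


# Hypothetical genes: (label, log2 fold change, FDR)
genes = [
    ("GENE_A", 3.2, 0.001),    # strong, significant increase
    ("GENE_B", -4.3, 0.0004),  # strong, significant decrease
    ("GENE_C", 0.6, 0.01),     # significant but below twofold
    ("GENE_D", 2.5, 0.20),     # large change but FDR too high
]

calls = {name: classify_gene(lfc, fdr) for name, lfc, fdr in genes}
print(calls)
```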

To gain insight into the biological processes related to bovine adipogenesis in which the DEGs are involved, we performed Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses. In total, 305 GO terms were assigned to the 470 DEGs, and 219 terms were significantly enriched (P < 0.05). Among these 219 terms, 150 corresponded to biological process, 24 to cellular component, and 45 to molecular function (**Supplementary Table S3**). Among the biological processes involving DEGs, we found sterol metabolic processes, vasculature development, and cholesterol metabolic processes, and DEGs were mainly enriched in extracellular components (**Figure 4A**). Interestingly, the top 10 biological process categories in which DEGs were enriched were all related to adipose tissue development, obesity, and energy metabolism (**Supplementary Table S3**). Among these, the category lipid biosynthetic processes was remarkable, as it contained the greatest number of DEGs (n = 26), including LPL, fatty acid synthase (FASN), 24-dehydrocholesterol reductase (DHCR24), and fatty acid desaturase 2 (FADS2). For carbohydrate binding, which is associated with fat metabolism, we detected an enrichment of C-type lectin domain family 3, member B (CLEC3B), fibronectin 1 (FN1), thrombospondin 1 (THBS1), pleiotrophin (PTN), and others. Most of these DEGs were upregulated during the differentiation of preadipocytes into mature adipocytes.

For KEGG pathways, a total of 196 pathways were assigned to the DEGs, of which 17 were significantly enriched (P < 0.05, **Figure 4B** and **Supplementary Table S4**). The top three pathways in which the DEGs were primarily involved were steroid biosynthesis, extracellular matrix (ECM)–receptor interaction, and the PPAR signaling pathway. The most significantly and uniquely enriched pathway was lipid acid metabolism for up-regulated DEGs and steroid biosynthesis for down-regulated DEGs. Seven pathways related to fatty acid metabolism were also significantly enriched (**Supplementary Table S4**).

We constructed a protein interaction network (**Figure 5**), in which integrin subunit beta 3 (ITGB3), collagen type I alpha 1 chain (COL1A1), collagen type XI alpha 1 chain (COL11A1), collagen type III alpha 1 chain (COL3A1), and collagen type I alpha 2 chain (COL1A2) were the most important interaction partners. Those proteins directly or indirectly interacted with several other DEGs and may be involved in regulatory cascades related to adipogenesis.

### Identification of TTR as Candidate Gene

We screened the RNA-seq data for genes previously reported to be involved in preadipocyte differentiation, including LPL, FABP4, and CEBPα. As predicted, our RNA-seq results indicated that the expression levels of those genes increased during bovine preadipocyte differentiation. To validate the RNA-seq results, we selected 16 DEGs and/or genes previously reported to be associated with adipogenesis for qPCR, namely fibronectin 1 (FN1) (Duarte et al., 2014), secreted protein, acidic and rich in cysteine (SPARC) (Rodríguez-Alvarez et al., 2010), collagen type III alpha 1 chain (COL3A1) (Duarte et al., 2014), angiopoietin like 2 (ANGPTL2), thrombospondin 1 (THBS1) (Blanco et al., 2012), TTR, legumain (LGMN), glutathione peroxidase 3 (GPX3), platelet derived growth factor receptor beta (PDGFRB), gap junction protein, alpha 1 (GJA1), nephroblastoma overexpressed (NOV), FABP3 (Boutinaud et al., 2013), adiponectin C1Q and collagen domain containing (ADIPOQ) (Zhang et al., 2014), secreted frizzled-related protein 4 (SFRP4) (Jeong et al., 2013), FABP4, and CEBPα. All 16 genes showed significant expression changes between preadipocytes and adipocytes (P < 0.05), and 13 of them showed the same expression trend as inferred from the RNA-seq analysis, while the expression trends of NOV, THBS1, and FABP3 in qPCR did not agree with those in the RNA-seq data (**Figure 6**). In addition, we calculated the Pearson correlation between the RNA-seq and qPCR data for the 13 genes whose RNA-seq and qPCR measurements agreed. The two data sets were significantly positively correlated (r = 0.933, P < 0.01).
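The Pearson correlation used for this comparison can be computed as in the minimal sketch below. The text does not state which quantities were correlated; the example assumes paired log2 fold changes from the two platforms, and the numbers are hypothetical rather than the measurements behind the reported r = 0.933.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares of the centered values
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)


# Hypothetical paired log2 fold changes (RNA-seq vs. qPCR), one pair per gene
rnaseq_log2fc = [4.1, 3.0, -4.3, 2.2, -1.5, 1.8]
qpcr_log2fc = [3.6, 2.4, -4.0, 2.9, -1.1, 1.2]

r = pearson_r(rnaseq_log2fc, qpcr_log2fc)
print(f"Pearson r = {r:.3f}")
```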

Among the DEGs validated by qPCR, the ADIPOQ and TTR genes showed the greatest changes in expression level (P < 0.001 for each), with almost 20-fold changes (**Figure 6B**). Since the effects of TTR during bovine preadipocyte differentiation have not yet been characterized, it was chosen as the candidate gene for subsequent experiments.

#### Involvement of TTR in Bovine Adipogenic Differentiation

As soon as preadipocytes were induced to differentiate, the expression level of TTR decreased significantly (**Figure 7A**). On day 9 after adipogenic induction, overexpression of TTR strongly increased the mRNA (P < 0.05) and protein levels of PPARγ and FABP4 (**Figures 7B,E**). On day 11 post-adipogenic induction, significant increases in the mRNA and protein levels of PPARγ and FABP4 were likewise observed after overexpression of TTR (**Figures 7C,E**), suggesting that overexpression of TTR promotes bovine adipogenic differentiation. We also found that overexpression of TTR slightly promoted the formation of lipid droplets (**Supplementary Figure S4**). Hence, TTR appears to be involved in the control of bovine adipogenesis.

Tumor Necrosis Factor α (TNFα) is a cytokine that is associated with adipose tissue development (Arner et al., 2010). As shown in **Figure 7D**, increasing concentrations of TNFα significantly reduced the expression of TTR (P < 0.05), further corroborating an involvement of TTR in bovine adipogenesis.

#### DISCUSSION

Different types of adipose depots show distinct mechanisms of lipid accumulation (Fernyhough et al., 2005; Hausman et al., 2009). Most of an animal's storage lipids accumulate in the visceral and subcutaneous adipose tissue layers (Dodson et al., 2010).

Subcutaneous fat depots determine meat quality, as the degree to which fat is stored in the subcutaneous fat layer correlates negatively with the extent of meat marbling (i.e., intramuscular fat deposition) and because subcutaneous fat itself serves as a quality assessment criterion (Jeremiah, 1996). Previous research on the meat quality of beef cattle was motivated by the idea that reducing subcutaneous fat depots will increase intramuscular fat depots (Underwood et al., 2008). Accordingly, previous work to establish protocols for the isolation, culture, and induction of differentiation of primary adipocytes typically used subcutaneous fat tissue samples, which are also comparatively easy to collect (Lengi and Corl, 2010; Song et al., 2010). Our present study reports on the culture of preadipocytes isolated from subcutaneous fat tissue and their differentiation into mature adipocytes. In recent years, Ribo-Zero RNA-seq has been established as an efficient method to explore transcriptional characteristics, e.g., during developmental processes (Adiconis et al., 2013; Zhao et al., 2014; Sun et al., 2015). Ribo-Zero RNA-seq avoids rRNA interference and shows a high degree of technical reproducibility (Trapnell et al., 2012; Adiconis et al., 2013). In contrast to RNA-seq methods that prepare libraries by poly-A enrichment, Ribo-Zero RNA-seq captures mRNA with or without a poly-A tail, allowing a more complete view of transcriptomic changes during development (Adiconis et al., 2013; Zhao et al., 2014; Sun et al., 2015). Our study therefore provides a large amount of information for future studies on the regulatory mechanisms underlying adipogenesis in beef cattle.

Right two panels: preadipocytes. Red indicates high expression; blue indicates low expression.

AS is an essential mechanism of post-transcriptional regulation that increases protein diversity by generating multiple different mRNAs and downstream proteins from a single gene through the inclusion or exclusion of specific exons (Pan et al., 2008; Burgess, 2012). Previous reports indicated that RI was the most common event in numerous species (Keren et al., 2010; Barbosa-Morais et al., 2012; Li et al., 2013; Wang et al., 2016). In this study, however, we found that SE was the most common. This may be because AS patterns vary across species, tissue types, and developmental stages (Keren et al., 2010; Barbosa-Morais et al., 2012; Li et al., 2013; Wang et al., 2016). Furthermore, a large number of the genes found in this study underwent multiple types of AS that differed between preadipocytes and adipocytes, indicating that AS may be involved in bovine adipocyte differentiation. The key transcription factors and growth factors that function in adipocyte differentiation, such as PPARγ, CEBP, IGF-1, and TGF-β, did not undergo AS, which indicates that the effects of critical regulatory factors are highly conserved during bovine adipogenic differentiation; previous studies showed a similar result (Zhou et al., 2014; Zhang Y.Y. et al., 2017). Also, the expression levels of 817 AS isoforms differed between adipocytes and preadipocytes, which indicates a close relationship between cattle adipogenesis and AS. Therefore, AS may have an important role in bovine adipocyte differentiation.

FIGURE 5 | Protein interaction networks encoded by DEGs enriched in the top 3 KEGG pathways. Red and green points represent genes up-regulated and down-regulated, respectively.

Functional annotation of DEGs found a number of categories to be significantly enriched during bovine adipogenic differentiation. The PPAR signaling pathway, ECM–receptor interaction, and steroid biosynthesis were the top three KEGG pathways in which DEGs were involved. These signaling pathways were also significantly enriched in previous RNA-seq studies of bovine adipose tissue (Zhou et al., 2014; Zhang Y.Y. et al., 2017). In our present study, we constructed a protein interaction network using DEGs and found that ITGB3, COL1A1, COL11A1, COL3A1, and COL1A2 were the most prominent interaction partners, highlighting the central roles played by collagen, a major component of the ECM, during cattle adipogenesis. Recently, Ojima et al. applied protein sequencing during the differentiation of murine 3T3-L1 cells (Ojima et al., 2016). The authors reported that ECM components, along with a series of collagens, were the most abundant proteins secreted by differentiating adipocytes, which matches the results of our present study (Trapnell et al., 2012). In our present study, however, the expression levels of these collagens were low during early and middle differentiation, while a dramatic increase in the expression of most collagens was observed on the 13th day of differentiation, representing a late stage of differentiation. These contrasting results could be attributed to species-specific differences in regulatory cascades or adipose tissue formation and to the different time points at which samples were obtained. Finally, the protein interaction networks used in our present study were simplifications and are prone to overlook a number of potential interaction partners. Collagens contribute to fibril formation (Kofford et al., 1997; Gelse et al., 2003); however, future studies will be needed to provide deeper insights into the regulatory pathways related to bovine adipogenesis and the ECM by which the aforementioned proteins are linked to this process.

Among the 16 DEGs we selected for qPCR, TTR showed the greatest change in expression, with a nearly 20-fold decrease during adipogenic differentiation. TTR is a carrier protein, and it has been hypothesized that TTR could transfer active components from chylomicrons to adipocytes, thereby stimulating acylation stimulating protein, a potent stimulator of triacylglycerol storage in adipocytes (Scantlebury et al., 1998). Matsuura et al. also found that body fat mass correlates with serum TTR levels in maintenance hemodialysis patients (Matsuura et al., 2017). In our study, overexpression of TTR promoted PPARγ and FABP4 expression during bovine adipocyte differentiation and promoted the formation of lipid droplets. A previous study showed that TTR increased 10-fold after 24 h of FABP4 overexpression and decreased to nearly zero after 48 h of FABP4 overexpression during the adipogenic differentiation of bovine skeletal muscle stem cells (Zhang L. et al., 2017), which also points to a relationship between the expression of TTR and FABP4 during adipogenic differentiation. On the other hand, we found a significant decrease in TTR expression after mature adipocytes were treated with TNFα, a factor known to contribute to the development of adipose tissue. Altogether, our results suggest that TTR may function as a stimulator of genes that drive bovine adipocyte differentiation. In the RNA-seq data, we found no significant increase in PPARγ and CEBPα after adipogenic induction. PPARγ and CEBPα are two major transcription factors regulating adipogenic differentiation, and their expression increases dramatically after adipogenic induction (Chawla et al., 1994; Darlington et al., 1998). At a late differentiation stage, such as the 13th day, they have already reached their peak expression level and then decrease to a level without significant difference (Takenaka et al., 2013; Hu et al., 2015). Even so, their expression levels in adipocytes are still higher than those in preadipocytes. Moreover, qPCR and RNA-seq differ considerably as methods for detecting mRNA expression; as in our study, other reports have shown different expression patterns of the same genes between RNA-seq and qPCR (Marioni et al., 2008; Wagner et al., 2012).

#### CONCLUSION

In conclusion, using Ribo-Zero RNA-seq, our study provides an overview of transcriptome changes during adipogenesis, namely during the differentiation of preadipocytes into mature adipocytes. Hundreds of DEGs related to bovine adipogenesis were detected, only a few of which could be further characterized regarding their mechanistic involvement in adipogenesis. AS may have an important effect during bovine adipocyte differentiation. Based on the top three enriched KEGG pathways, collagens associated with the ECM might play central roles in cattle adipocyte differentiation. More importantly, a potential regulatory effect of TTR during bovine adipocyte differentiation is proposed. Our study raises an array of new questions related to the molecular mechanisms underlying the regulation of bovine adipocyte differentiation and provides primary information for further functional studies of TTR in bovine adipogenesis.

#### DATA AVAILABILITY

The raw transcriptome read data generated in this study have been deposited in the NCBI Short Read Archive (SRA) under accession number SRP067820.

#### AUTHOR CONTRIBUTIONS

HFC designed and performed the experiments and wrote the paper. ML designed the experiments and wrote the paper. XS analyzed the data and helped design the experiments. MP helped with the writing and language correction. CJL corrected the language and the paper design. XL discussed the experimental design. CZL helped design the experiments. YH helped analyze the data. YB provided the samples. XQ collected the samples. FL discussed the experimental design. HC helped with the experimental design and the writing of the paper.

#### FUNDING

This study was supported by the Program of National Beef Cattle Industrial Technology System (CARS-38), the National 863 Program of China (2013AA102505), Bio-breeding capacity-building and industry specific projects from the National Development and Reform Commission (2014-2573), Specific Projects of Science and Technology in Henan Province (141100110200), Science and Technology Co-ordinator Innovative engineering projects of Shaanxi Province (2015KTCL02-08), and the Project of breeding and application of Pinan Cattle.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fgene.2018.00463/full#supplementary-material

#### REFERENCES

Arner, E., Rydén, M., and Arner, P. (2010). Tumor necrosis factor α and regulation of adipose tissue. N. Engl. J. Med. 362, 1151–1153. doi: 10.1056/NEJMc0910718

Black, D. L. (2003). Mechanisms of alternative pre-messenger RNA splicing. Annu. Rev. Biochem. 72, 291–336. doi: 10.1146/annurev.biochem.72.121801.161720

Chawla, A., Schwarz, E. J., Dimaculangan, D. D., and Lazar, M. A. (1994). Peroxisome proliferator-activated receptor (PPAR) gamma: adipose-predominant expression and induction early in adipocyte differentiation. Endocrinology 135, 798–800. doi: 10.1210/en.135.2.798

with TopHat and Cufflinks. Nat. Protoc. 7, 562–578. doi: 10.1038/nprot.2012.016


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Cai, Li, Sun, Plath, Li, Lan, Lei, Huang, Bai, Qi, Lin and Chen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.


# eIF4E-Dependent Translational Control: A Central Mechanism for Regulation of Pain Plasticity

Sonali Uttam<sup>1</sup>, Calvin Wong<sup>1</sup>, Theodore J. Price<sup>2,3</sup> and Arkady Khoutorsky<sup>1,4</sup>\*

<sup>1</sup> Department of Anesthesia, McGill University, Montreal, QC, Canada, <sup>2</sup> School of Behavioral and Brain Sciences, The University of Texas at Dallas, Richardson, TX, United States, <sup>3</sup> Center for Advanced Pain Studies, The University of Texas at Dallas, Richardson, TX, United States, <sup>4</sup> Alan Edwards Centre for Research on Pain, McGill University, Montreal, QC, Canada

Translational control of gene expression has emerged as a key mechanism in regulating different forms of long-lasting neuronal plasticity. Maladaptive plastic reorganization of peripheral and spinal nociceptive circuits underlies many chronic pain states and relies on new gene expression. Accordingly, downregulation of mRNA translation in primary afferents and spinal dorsal horn neurons inhibits tissue injury-induced sensitization of nociceptive pathways, supporting a central role for translation dysregulation in the development of persistent pain. Translation is primarily regulated at the initiation stage via the coordinated activity of translation initiation factors. The mRNA cap-binding protein, eukaryotic translation initiation factor 4E (eIF4E), is involved in the recruitment of the ribosome to the mRNA cap structure, playing a central role in the regulation of translation initiation. eIF4E integrates inputs from the mTOR and ERK signaling pathways, both of which are activated in numerous painful conditions to regulate the translation of a subset of mRNAs. Many of these mRNAs are involved in the control of cell growth, proliferation, and neuroplasticity. However, the full repertoire of eIF4E-dependent mRNAs in the nervous system and their translation regulatory mechanisms remain largely unknown. In this review, we summarize the current evidence for the role of eIF4E-dependent translational control in the sensitization of pain circuits and present pharmacological approaches to target these mechanisms. Understanding eIF4E-dependent translational control mechanisms and their roles in aberrant plasticity of nociceptive circuits might reveal novel therapeutic targets to treat persistent pain states.

Keywords: eIF4E, mRNA translation, persistent pain, sensitization, treatment

## INTRODUCTION

Chronic pain is a debilitating condition affecting more than 20 percent of the population worldwide (Steglitz et al., 2012; de Souza et al., 2017). Chronic pain is most commonly triggered by tissue inflammation or nerve injury, which can be caused by metabolic diseases (e.g., diabetes), autoimmune diseases, viral infection (e.g., herpes zoster), cancer, chemotherapy drugs (e.g., platinums, taxanes, epothilones, and vinca alkaloids), and nerve entrapment or blunt trauma. Chronic pain, however, can also appear without any recognizable trigger, as in fibromyalgia, migraine, irritable bowel syndrome, and interstitial cystitis.

#### Edited by:

Maritza Jaramillo, University of Quebec, Canada

#### Reviewed by:

Toshifumi Inada, Tohoku University, Japan Yoshihiro Shimizu, RIKEN, Japan

\*Correspondence: Arkady Khoutorsky arkady.khoutorsky@mcgill.ca

#### Specialty section:

This article was submitted to RNA, a section of the journal Frontiers in Genetics

Received: 07 August 2018 Accepted: 24 September 2018 Published: 24 October 2018

#### Citation:

Uttam S, Wong C, Price TJ and Khoutorsky A (2018) eIF4E-Dependent Translational Control: A Central Mechanism for Regulation of Pain Plasticity. Front. Genet. 9:470. doi: 10.3389/fgene.2018.00470


In most cases, the pain is a result of increased sensitivity of peripheral or central nociceptive circuits to stimulation, causing a painful sensation in response to a normally innocuous stimulus. The increase in sensitivity, also referred to as sensitization, is mediated by a combination of mechanisms taking place at several levels along the pain pathway, including primary sensory neurons, the spinal cord, and higher brain areas (Todd, 2010; Yekkirala et al., 2017).

Long-lasting increases in the sensitivity and responsiveness of pain circuits are ultimately accompanied by changes in gene expression, which support biochemical and structural alterations in neuronal and non-neuronal cells involved in pain processing. Gene expression is a multi-step process that is tightly regulated at different levels. Regulation of the rate at which mRNA is translated into protein is called translational control (Sonenberg and Hinnebusch, 2009; Robichaud et al., 2018). Translational control has a strong impact on the abundance of proteins in the cell, and its dysregulation contributes to many pathologies in the nervous system, including developmental abnormalities, metabolic dysregulation, autism spectrum disorder (ASD), and epilepsy (Buffington et al., 2014; Tahmasebi et al., 2018). Tissue injury, metabolic diseases, and certain drugs (e.g., anticancer drugs and opioids) cause an upregulation of mRNA translation in pain-processing tissues such as the dorsal root ganglion (DRG) and the dorsal horn of the spinal cord (Melemedjian and Khoutorsky, 2015; Khoutorsky and Price, 2018). Inhibition of translational control signaling in these tissues reduces the sensitization of nociceptive circuits and alleviates pain, demonstrating a central role of translational upregulation in the development of persistent pain (Price et al., 2007; Jimenez-Diaz et al., 2008; Asante et al., 2009; Geranton et al., 2009; Price and Geranton, 2009; Melemedjian et al., 2010; Xu et al., 2011; Bogen et al., 2012; Ferrari et al., 2013; Obara and Hunt, 2014). The rate of mRNA translation is controlled via several mechanisms (Costa-Mattioli et al., 2009; Robichaud et al., 2018). The recruitment of the ribosome to the mRNA is a central step in translation initiation and the major site for regulation. A key mechanism to regulate this process is controlling the activity of the eukaryotic translation initiation factor 4E (eIF4E), which binds the mRNA "cap" structure (a 7-methylguanosine linked to the first nucleotide at the 5′ end of all nuclear-transcribed eukaryotic mRNAs) and initiates ribosome recruitment (Altmann et al., 1985; Sonenberg and Hinnebusch, 2009). In this review, we focus on the regulation of eIF4E-dependent mRNA translation initiation in nociceptive plasticity, highlighting a central role of this mechanism in the development of chronic pain.

#### TRANSLATIONAL CONTROL MECHANISMS

The process of translation can be divided into three phases: initiation, elongation, and termination. Most of the regulation of translation occurs at the initiation step (Sonenberg and Hinnebusch, 2009; Merrick and Pavitt, 2018). Initiation is regulated by a large number of translation initiation factors, which mediate the recruitment of the ribosome to the mRNA, followed by scanning of the 5<sup>0</sup> untranslated region (5<sup>0</sup> UTR) of the mRNA for the presence of an AUG start codon. A critical step in this process is the binding of eIF4E to the mRNA cap. Following binding to the cap, eIF4E binds a mRNA helicase, eIF4A, and a large scaffolding protein, eIF4G, to form a tri-subunit complex named eIF4F (**Figure 1**). eIF4F facilitates the recruitment of the 43S preinitiation complex (PIC) to the mRNA. The PIC is composed of a small 40S ribosomal subunit, translation factors eIF1, eIF1A, and eIF3, and a ternary complex (eIF2: GTP bound to initiator, Met-tRNAiMet). Recruitment of the PIC is followed by scanning of the mRNA 5<sup>0</sup> UTR and joining of a large ribosomal subunit (60S), upon encountering a start codon, to form an 80S ribosome that is competent to proceed to the elongation phase of translation. Importantly, the helicase activity of eIF4F (mediated by eIF4A) is required for unwinding the mRNAs 5<sup>0</sup> UTR secondary structure to allow the scanning process and translation to proceed (Parsyan et al., 2011).

Other major mechanisms involved in the regulation of translation initiation include regulation of ternary complex availability [via phosphorylation of the alpha subunit of the eukaryotic initiation factor 2 (eIF2α)] (Trinh and Klann, 2013); regulation of the length of the mRNA poly(A) tail, which promotes translation and protects mRNA from degradation (Gray et al., 2000; Kahvejian et al., 2001; Derry et al., 2006); and translation initiation via a cap-independent mechanism (mediated by an internal ribosome entry site, IRES) (Pelletier and Sonenberg, 1988; Macejak and Sarnow, 1991; Leppek et al., 2018). Since the expression levels of eIF4E are the lowest among all translation initiation factors, the formation of the eIF4F complex and, correspondingly, translation initiation are the rate-limiting steps for translation under most circumstances.

#### eIF4E IS A CENTRAL REGULATOR OF CAP-DEPENDENT TRANSLATION

Eukaryotic translation initiation factor 4E activity is tightly regulated via two mechanisms. The first involves the translational repressor 4E-binding protein (4E-BP), which binds eIF4E and prevents its association with eIF4G, and thus precludes the formation of the eIF4F complex (Gingras et al., 1999; Peter et al., 2015). In mammals, there are three 4E-BP isoforms (4E-BP1, 4E-BP2, and 4E-BP3), which have similar functions but exhibit differences in tissue distribution. The binding of 4E-BP to eIF4E depends on the 4E-BP phosphorylation state. Upon phosphorylation by the mechanistic target of rapamycin complex 1 (mTORC1), the affinity of 4E-BP for eIF4E is reduced, leading to its dissociation from eIF4E and allowing the formation of the eIF4F complex at the mRNA cap. This promotes the recruitment of the 43S PIC to the mRNA and stimulates translation (**Figure 1**).

Eukaryotic translation initiation factor 4E activity is required for translation initiation of all capped mRNAs. Complete loss of eIF4E, as in eIF4E<sup>−/−</sup> mice, is not compatible with life and leads to embryonic death before embryonic day 6.5 (Truitt et al., 2015). Partial loss of eIF4E does not have a strong impact on general translation, mostly because it induces a compensatory degradation of hypophosphorylated 4E-BP1 (Yanagiya et al., 2012). Even though all nuclear-transcribed eukaryotic mRNAs have a cap, not all cellular mRNAs are equally sensitive to eIF4E activity. The translation of "eIF4E-sensitive mRNAs" is preferentially stimulated by increased eIF4E activity. For example, housekeeping mRNAs such as GAPDH and β-actin are less sensitive to eIF4E than mRNAs involved in cell growth, proliferation, and immune responses [e.g., c-MYC, cyclins, BCL-2, MCL1, osteopontin, survivin, vascular endothelial growth factor (VEGF), fibroblast growth factors (FGF), and matrix metalloproteinase 9 (MMP-9)] (Rousseau et al., 1996; Sonenberg and Gingras, 1998; Bhat et al., 2015; Chu and Pelletier, 2018). The mRNA features conferring eIF4E sensitivity have typically been associated with 5′ UTRs enriched with high-complexity secondary structures (Pelletier and Sonenberg, 1985; Sonenberg and Gingras, 1998). It has been demonstrated that a long 5′ UTR favors the formation of stable secondary structures, and that the proximity of these structures to the cap obstructs eIF4F complex formation. On the other hand, hairpin structures with a greater free energy, located further away from the cap, restrict 5′ UTR scanning (the progression of the PIC toward the start codon) (Kozak, 1989; Pickering and Willis, 2005). However, translation of a subset of mRNAs without a long 5′ UTR can still be sensitive to eIF4E, indicating that other 5′ UTR signatures may also confer this sensitivity (Leppek et al., 2018). Potential mechanisms include the presence of 5′ terminal oligopyrimidine tracts (5′ TOPs) (Thoreen et al., 2012) and cis-regulatory elements (Wolfe et al., 2014; Truitt et al., 2015; Hinnebusch et al., 2016; Truitt and Ruggero, 2016; Leppek et al., 2018) in the 5′ UTR. For example, a cytosine-rich 15-nucleotide motif, termed Cytosine Enriched Regulator of Translation (CERT), was shown to be responsible for conferring eIF4E sensitivity under oncogenic transformation and oxidative stress (Truitt et al., 2015).
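As a rough illustration of how one of these 5′ UTR signatures might be screened for computationally, the sketch below flags a candidate 5′ TOP at the very start of a transcript. The pattern used here (a cytosine at position +1 followed by an uninterrupted run of at least four pyrimidines) is an assumption made for this example, not a definition taken from the studies cited above, and the function name and `min_pyrimidines` parameter are hypothetical. The sketch does not attempt to detect CERT or to assess secondary structure.

```python
import re


def has_5p_top(five_prime_utr: str, min_pyrimidines: int = 4) -> bool:
    """Return True if the sequence starts with C followed by at least
    `min_pyrimidines` consecutive pyrimidines (C or U).

    The threshold is an illustrative assumption and should be adjusted
    to whichever TOP definition a given analysis adopts. DNA-style
    input (T instead of U) is accepted.
    """
    seq = five_prime_utr.upper().replace("T", "U")
    pattern = re.compile(r"^C[CU]{%d,}" % min_pyrimidines)
    return pattern.match(seq) is not None


if __name__ == "__main__":
    for utr in ("CUUUCCUGAUG", "GCUUUCCU", "CUUAGG"):
        print(utr, has_5p_top(utr))
```

A screen of this kind only nominates candidates; whether a transcript is actually eIF4E-sensitive remains an experimental question.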

Although most studies have attributed the elevated translation of mRNAs with highly structured 5′ UTRs to the cap-binding ability of eIF4E and to its being the limiting component of the eIF4F complex, other studies found that cap binding did not completely explain eIF4E function and explored further mechanisms of eIF4E-mediated translation regulation. This led to the identification of an additional function of eIF4E: stimulation of eIF4A helicase activity, which is independent of its cap-binding ability (Feoktistova et al., 2013). Feoktistova et al. (2013) showed that the eIF4E binding site on eIF4G has an autoinhibitory function. Binding of eIF4E to eIF4G counteracts this autoinhibition and in turn enables eIF4G to stimulate eIF4A activity (the rate of duplex unwinding). Because this function is independent of cap binding, eIF4E can stimulate translation by two distinct mechanisms (Feoktistova et al., 2013).

In addition to regulation by mTORC1/4E-BP, eIF4E activity is also controlled via phosphorylation of its sole phosphorylation site, Ser209, by the mitogen-activated protein kinase (MAPK)-interacting protein kinases (MNKs) 1 and 2, downstream of the extracellular-signal-regulated kinase (ERK) and the p38 MAPK signaling cascades (**Figure 1**; Pyronnet et al., 1999; Waskiewicz et al., 1999). The phosphorylation of eIF4E is associated with altered translation of a subset of mRNAs, although the mechanisms underlying the effect of this phosphorylation event on translational efficiency and transcript specificity remain elusive.

Since eIF4E is a downstream effector of both mTORC1 (via 4E-BP-dependent repression) and ERK (via eIF4E phosphorylation), its activity can be modulated by a multitude of external and internal cues that activate these central cellular signaling pathways. Numerous membrane receptors activate mTORC1 and ERK signaling in neurons including tyrosine receptor kinase A (trkA) and trkB, receptors from the insulin receptor family (IR, IGF1R, EGFR), and metabotropic glutamate and NMDA receptors. In addition to the extracellular cues, these pathways integrate intracellular signals conveying information on the status of cellular energy (via AMPK), oxygen levels [via activation of AMPK and REDD1 (Regulated in DNA damage and development 1)], and DNA damage (via the induction of p53 target genes) (Saxton and Sabatini, 2017; **Figure 1**).

### eIF4E IN REGULATION OF PERIPHERAL NOCICEPTIVE PLASTICITY

Tissue injury induces profound changes in the phenotype of sensory neurons, increasing their excitability and changing the connectivity within peripheral tissues and spinal cord. These alterations are driven by pro-inflammatory molecules released from injured tissues, such as the neurotrophin nerve growth factor (NGF) and the cytokine interleukin 6 (IL-6), as well as by neuronal activity evoked by direct injury to the nerve. ERK and mTORC1, two central intracellular pathways, are stimulated by tissue inflammation and nerve injury, diabetes, cancer, and drug-induced neuropathies (Melemedjian and Khoutorsky, 2015; Khoutorsky and Price, 2018). In addition to the phosphorylation-mediated activation of mTOR downstream of the PI3K/AKT pathway, a recent study showed that nerve injury stimulates local axonal translation of mTOR mRNA (Terenzio et al., 2018). Translation profiling of DRG tissue from mice subjected to nerve injury showed that ERK is a key regulatory hub controlling both transcriptional and translational gene expression networks (Uttam et al., 2018).

Inhibition of ERK and mTORC1 signaling alleviates the development of pain hypersensitivity in a variety of pain models (Ji et al., 2009; Chen et al., 2018; Khoutorsky and Price, 2018). Since the ERK and mTORC1 pathways converge on eIF4E to control the rate of cap-dependent translation, it was suggested that eIF4E might play a central role in the sensitization of pain circuits by regulating the translation of specific mRNAs. The physiological significance of eIF4E phosphorylation was studied using mice lacking eIF4E phosphorylation (knock-in mutation of serine<sup>209</sup> to alanine, eIF4E<sup>S209A</sup>). These mice display greatly reduced mechanical and thermal hypersensitivity in response to intraplantar administration of IL-6, NGF, and carrageenan, as well as diminished hyperalgesic priming (Moy et al., 2017). Moreover, the increase in excitability of eIF4E<sup>S209A</sup> primary sensory neurons in response to IL-6 and NGF was reduced compared with wild-type (WT) controls. These findings were recapitulated in MNK1/2 knockout mice, which also lack eIF4E phosphorylation. In the spared nerve injury (SNI) model of neuropathic pain, the development of mechanical and cold hypersensitivity was reduced in both eIF4E<sup>S209A</sup> and MNK1/2 knockout mice. Notably, local intraplantar inhibition of MNK with cercosporamide reduced mechanical hypersensitivity in response to NGF and alleviated hyperalgesic priming (Moy et al., 2017). These findings support the notion that the stimulation of eIF4E phosphorylation is imperative for the phenotypic changes of sensory neurons, promoting the hyperalgesic state and contributing to the development of chronic pain, and that this likely occurs independently of effects on inflammation (Moy et al., 2018b). Experiments with local administration of cercosporamide also indicate that phosphorylation of eIF4E induced by pro-inflammatory mediators or tissue injury mediates sensitization of sensory neurons via local mRNA translation.

Advances in translational profiling techniques have provided important insights into the potential mechanisms by which eIF4E phosphorylation regulates neuronal functions. In the brain, eIF4E phosphorylation controls the translation of mRNAs involved in inflammatory responses, such as IκBα, a repressor of the transcription factor NF-κB, which regulates the expression of the cytokine tumor necrosis factor α (TNFα) (Aguilar-Valles et al., 2018). Genome-wide translational profiling of the brain from eIF4E<sup>S209A</sup> mice revealed that eIF4E phosphorylation controls translation of mRNAs involved in inflammation (IL-2 and TNFα), organization of the extracellular matrix (Prg2, Mmp9, Adamts16, Acan), and the serotonin pathway (Slc6a4) (Amorim et al., 2018).

In the DRG, phosphorylation of eIF4E stimulates translation of brain-derived neurotrophic factor (Bdnf) mRNA. eIF4E<sup>S209A</sup> mice show reduced protein levels of BDNF under baseline conditions and fail to translate Bdnf mRNA into protein in response to pro-inflammatory cytokines despite an increase in Bdnf mRNA levels (Moy et al., 2018a). BDNF is a key molecule mediating pain plasticity (Obata and Noguchi, 2006), and the identification of MNK/eIF4E signaling as a central regulator of Bdnf translation has important therapeutic implications (Moy et al., 2018a). Cell-type-specific translational profiling of nociceptors [using the translating ribosome affinity purification (TRAP) approach] (Heiman et al., 2014) in a mouse model of chemotherapy-induced neuropathic pain revealed that MNK-eIF4E signaling controls translation of RagA mRNA, which encodes a key regulator of mTORC1 (Megat et al., 2018). This finding suggests crosstalk between the ERK/MNK/eIF4E and mTORC1 signaling pathways in promoting pain hypersensitivity in chemotherapy-induced neuropathies.

In addition to phosphorylation, eIF4E in primary sensory neurons is also regulated via mTORC1/4E-BP. IL-6 and NGF activate mTORC1, which promotes 4E-BP1 phosphorylation, increased eIF4F complex formation, and nascent protein synthesis in cultured sensory neurons (Melemedjian et al., 2010). Intraplantar administration of IL-6 or NGF induced mechanical allodynia, which was blocked by subcutaneous administration of the mTORC1 inhibitor rapamycin, as well as by 4EGI-1, an inhibitor of eIF4F complex formation that disrupts the interaction between eIF4E and eIF4G. Intraplantar 4EGI-1 also blocked the establishment of the sensitization state in a hyperalgesic priming model in response to IL-6 and NGF injection (Asiedu et al., 2011).

These findings support a model in which local activation of mTORC1 stimulates eIF4F complex formation, promoting pain hypersensitivity via axonal mRNA translation. 4E-BP1 is the major isoform involved in the regulation of nociception, whereas in the brain 4E-BP2 is the dominant isoform. 4E-BP1 is highly expressed in nociceptors, and mice lacking 4E-BP1, but not 4E-BP2, exhibit enhanced mechanical hypersensitivity. Notably, eif4ebp1 knockout mice show no alterations in thermal sensitivity, suggesting a mechanical-specific effect of eIF4E activation via 4E-BP-dependent mechanisms (Khoutorsky et al., 2015).

A second major downstream effector of mTORC1, the p70 S6 ribosomal kinases (S6K1 and S6K2), may not play as significant a role in the regulation of nociceptive sensitization. Mice lacking S6K1/2 exhibit increased mechanical pain sensitivity but normal thermal thresholds, and an inhibitor of S6K1/2 recapitulates this phenotype (Melemedjian et al., 2013). This finding seems paradoxical; however, further analysis revealed that loss of S6K1/2 function engages a feedback loop that enhances ERK phosphorylation, driving mechanical sensitization (Melemedjian et al., 2013). Therefore, it is tempting to speculate that most of the pain-inhibitory effects of mTORC1 inhibition are mediated via the suppression of 4E-BP1/eIF4E-dependent protein synthesis. The role of other, translation-independent outputs of mTORC1, such as regulation of autophagy, lipogenesis, and mitochondrial function, remains unknown.

### eIF4E IN REGULATION OF SPINAL PLASTICITY

The spinal cord integrates peripheral somatosensory inputs to generate, after processing, an output that is conveyed to the brain where the perception of pain ultimately arises. Peripheral injury, disease, and certain drugs can cause an increase in the gain of spinal nociceptive circuits, resulting in disproportional amplification of somatosensory inputs, and therefore increased pain. These maladaptive plastic changes in the spinal cord, frequently referred to as central sensitization, significantly contribute to the development of pathological pain states. Central sensitization leads to a lowered threshold for the induction of pain (allodynia), an increase in the responsiveness to noxious stimuli (hyperalgesia), and an enlargement of the receptive field, resulting in pain sensation from non-injured areas (secondary hyperalgesia).

Long-lasting spinal plasticity critically relies on new protein synthesis to allow alterations in the cellular proteome and, consequently, sensitization of the pro-nociceptive circuits. Numerous studies have demonstrated the activation of ERK and mTORC1 signaling in the spinal cord following peripheral tissue injury, cancer, and opioid treatment (Geranton et al., 2009; Ji et al., 2009; Norsted Gregory et al., 2010; Xu et al., 2011, 2014; Shih et al., 2012; Jiang et al., 2013; Liang et al., 2013; Zhang et al., 2013). Intrathecal delivery of pharmacological inhibitors targeting these pathways efficiently alleviates pathological pain without affecting baseline mechanical and thermal sensitivity (Ji et al., 2009; Melemedjian and Khoutorsky, 2015; Martin et al., 2017). There is evidence that the beneficial effect of mTORC1 inhibition on pain in the spinal cord is largely mediated via mTORC1/4E-BP1-dependent regulation of eIF4E activity. Pain hypersensitivity produced by intrathecal injection of epiregulin (EREG), an endogenous agonist of the epidermal growth factor receptor (EGFR) upstream of mTORC1, is blocked by intrathecal injection of 4EGI-1 (Martin et al., 2017). Moreover, specific deletion of 4E-BP1 in the dorsal horn of the spinal cord causes mechanical hypersensitivity (Khoutorsky et al., 2015). Mice lacking 4E-BP1 show increased excitatory and inhibitory synaptic transmission in lamina II neurons as well as enhanced potentiation of spinal excitatory field potentials following sciatic nerve stimulation. Taken together, these results indicate that enhanced eIF4F complex formation in the spinal cord promotes spinal plasticity and contributes to the development of central sensitization.

#### THERAPEUTIC APPROACHES TO TARGET eIF4E-DEPENDENT MECHANISMS TO ALLEVIATE PAIN

Several lines of evidence suggest that targeting eIF4E is a potentially promising therapeutic strategy to inhibit aberrant pain plasticity. First, due to its low expression levels, eIF4E activity is a rate-limiting factor for translation initiation and a central node of regulation. eIF4E integrates signals from two major signaling pathways, ERK and mTORC1, both of which have important functions in the development of pain. Second, eIF4E does not strongly affect general translation, but mainly regulates the translation of a subset of mRNAs involved in cell growth, proliferation, immune responses, and neuronal plasticity. Mice with a partial reduction of eIF4E protein levels, such as eIF4E heterozygous mice (Truitt et al., 2015) or mice expressing short hairpin RNA against eIF4E (Lin et al., 2012), show no developmental abnormalities or changes in survival rate or body weight. Third, whereas acute inhibition of mTORC1 is effective in alleviating pain, long-term mTORC1 inhibition leads to the hyperactivation of ERK via an mTORC1-S6K1-IRS1 negative feedback loop (Veilleux et al., 2010; Melemedjian et al., 2013). Since ERK is a well-known sensitizer of neurons involved in pain transmission, both in the periphery and in the spinal cord, chronic mTORC1 inhibition leads to mechanical hypersensitivity and pain. Thus, long-term treatment with compounds targeting mTORC1 is unlikely to be clinically applicable. Conversely, chronic inhibition of eIF4E does not activate these compensatory mechanisms. Mice lacking eIF4E phosphorylation do not exhibit alterations in pain sensation at baseline, but show reduced nociceptive plasticity in response to pro-inflammatory and nerve injury stimuli (Moy et al., 2017). Finally, compelling preclinical studies have demonstrated beneficial effects of pharmacologically targeting eIF4E in alleviating persistent pain using 4EGI-1, an inhibitor of eIF4F complex formation, or cercosporamide, an inhibitor of MNK. Efforts to develop and test new translation inhibitors are fuelled by their potential use for the treatment of cancer (Stumpf and Ruggero, 2011), malaria (Baragana et al., 2015), and bacterial infection (Bhat et al., 2015). Here, we overview the existing and newly developed pharmacological approaches to target eIF4E-dependent translation.

#### MNK Inhibitors

CGP57380 and cercosporamide are two small-molecule inhibitors targeting MNK1 and MNK2 (Bhat et al., 2015). Cercosporamide, extracted from the fungus Cercosporidium henningsii, is an antifungal agent and a phytotoxin. It has antiproliferative and proapoptotic activities in cancer cells in preclinical animal models of lung and colon carcinomas (Konicek et al., 2011). It readily crosses the blood-brain barrier (BBB) and efficiently reduces p-eIF4E in the brain after peripheral administration (Gkogkas et al., 2013). However, both CGP57380 and cercosporamide have been shown to exhibit off-target effects (Bain et al., 2007; Bhat et al., 2015). More specific MNK inhibitors have recently been developed. eFT508 is a new-generation MNK1/2 inhibitor that is potent, selective, and orally bioavailable (Dreas et al., 2017). Its efficacy has been assessed in preclinical models of diffuse large B-cell lymphoma, and it causes a dose-dependent decrease in eIF4E phosphorylation (Reich et al., 2018). eFT508 is now in a phase II clinical trial for the treatment of colorectal cancer. A recent study showed that eFT508 efficiently reduces eIF4E phosphorylation in DRG without affecting other major signaling pathways (ERK, 4E-BP, and AKT) or general translation (Megat et al., 2018). eFT508 also alleviated paclitaxel-induced mechanical and thermal sensitivity, supporting its further testing in other chronic pain conditions. BAY 1143269 is another potent and selective, orally administered MNK1 inhibitor (Santag et al., 2017). Additional MNK inhibitors include 5-(2-(phenylamino)pyrimidin-4-yl)thiazol-2(3H)-one derivatives (Diab et al., 2014), resorcylic acid lactone analogs (Xu et al., 2013), and retinoic acid metabolism blocking agents (RAMBAs) (Ramalingam et al., 2014). These compounds need to be better characterized in both in vitro and in vivo studies.

#### Inhibitors of eIF4F Complex

Three inhibitors disrupting eIF4G:eIF4E interaction have been described: 4EGI-1 (Moerke et al., 2007), 4E1RCat, and 4E2RCat (Cencic et al., 2011). 4EGI-1 is a small molecule, which binds eIF4E at the site distal to the eIF4G-binding epitope, causing localized conformational changes and dissociation of eIF4G from eIF4E (Papadopoulos et al., 2014). 4EGI-1 also impairs mitochondrial functions (Yang et al., 2015). 4EGI-1 has been used in studies examining the role of eIF4F complex in memory (Hoeffer et al., 2011) and autism (Gkogkas et al., 2013; Santini et al., 2013), where it was delivered directly to the brain (intracerebroventricular injection) as it does not readily penetrate the BBB. Rigidified analogs of 4EGI-1 have been developed, showing improved potency in inhibition of eIF4E/eIF4G interaction (Mahalingam et al., 2014).

4E1RCat and 4E2RCat block the interaction of eIF4E with both eIF4G and 4E-BP1, and thereby prevent eIF4F complex formation (Cencic et al., 2011). These compounds have not yet been used in the nervous system in vivo. An antisense oligonucleotide (ASO) targeting eIF4E (LY2275796), with improved tissue stability and nuclease resistance, has been developed (Graff et al., 2007). Since eIF4E is overexpressed in many human cancers (by ∼3- to 10-fold) (Bhat et al., 2015), LY2275796 has been tested as an anti-cancer treatment. Administration of LY2275796 to patients resulted in a reduction of eIF4E mRNA and protein levels in tumor cells but caused dose-dependent toxicity (Hong et al., 2011). The antiviral drug ribavirin has been proposed to mimic the mRNA "cap" and thereby inhibit the eIF4E/mRNA interaction (Kentsis et al., 2004). This notion was later disputed, and ribavirin's biological effects were attributed to translation-independent activities (Westman et al., 2005; Yan et al., 2005).

#### eIF4A Inhibitors

eIF4A helicase activity is critically required for eIF4F complex formation and unwinding of the 5′ UTR to allow scanning to occur. Therefore, targeting eIF4A might be an additional approach to inhibit eIF4F-dependent translation initiation, particularly for mRNAs with highly structured 5′ UTRs. Pateamine A, hippuristanol, and rocaglate family members [e.g., silvestrol and rocaglamide A (RocA)] are the commonly known inhibitors of eIF4A, of which only pateamine A is known to cause irreversible inhibition (Pelletier et al., 2015). Hippuristanol is a member of the polyoxygenated steroid family; it blocks the helicase activity of eIF4A by binding to the C-terminal region of eIF4A and imposing allosteric hindrance, thus preventing eIF4A from binding RNA (Sun et al., 2014). On the other hand, pateamine A increases the sequence-non-specific RNA-binding activity of free eIF4A, thus preventing eIF4A from participating in the formation of the eIF4F complex (Bordeleau et al., 2006; Cencic et al., 2009). Of these eIF4A inhibitors, silvestrol has been most widely assessed in in vivo preclinical cancer models, owing to its high potency, bioavailability, and relatively low toxicity (Raynaud et al., 2007). Recently, RocA was identified as a sequence-selective inhibitor of translation that acts by stabilizing eIF4A binding on polypurine sequences, thus impeding 43S scanning and leading to premature translation initiation upstream (Iwasaki et al., 2016). The anticancer potential of rocaglates has been widely examined; however, the mechanisms underlying their cytotoxic and anti-proliferative effects have been studied only recently (Becker et al., 2016). The role of eIF4A inhibitors in pain has yet to be examined.

### CONCLUSION

A central role of eIF4E-dependent translational control in mediating maladaptive nociceptive plasticity provides an opportunity to develop new therapeutics to prevent the development of the hypersensitivity state or even reverse established pain states by weakening ongoing activity-dependent plasticity. Existing compounds targeting eIF4E (cercosporamide and 4EGI-1) lack specificity, and 4EGI-1 additionally has poor solubility and BBB permeability. Therefore, validation of other existing inhibitors for in vivo applications and development of more specific and efficacious inhibitors are required. Another important research direction is uncovering cell-type-specific translational landscapes (for example, using TRAP) in different pain conditions. This work might reveal mRNAs whose aberrant translation drives the pain phenotype and allow targeting of these transcripts or the encoded proteins to reverse the hypersensitivity. It is, however, conceivable that a complex pattern of translation drives the hypersensitivity, involving a combinatorial effect of several translationally activated and repressed mRNAs. In this scenario, targeting upstream regulatory mechanisms, such as formation of the eIF4F complex, might be a more feasible therapeutic approach. A combination of diverse inhibition strategies could be beneficial to achieve long-lasting effects on pain without triggering compensatory mechanisms.

In summary, a growing recognition of the importance of eIF4E-dependent translational control in the regulation of cellular functions in general, and neuronal plasticity in particular, has substantially accelerated studies in the field of pain and advanced our knowledge of how eIF4E-dependent translational dysregulation causes maladaptive plasticity and contributes to the sensitization of the pain pathway. Identification of new molecular targets and pharmacological compounds to target these mechanisms might constitute a basis for next-generation pain therapeutics.

#### AUTHOR CONTRIBUTIONS

All authors participated in writing the manuscript.

#### FUNDING

This work was supported by QPRN grant (AK) and NIH grant R01NS065926 (TP).

#### REFERENCES

rapid protein synthesis at the spinal level. Mol. Pain 5:27. doi: 10.1186/1744-8069-5-27



eIF4E expression reduces tumor growth without toxicity. J. Clin. Invest. 117, 2638–2648. doi: 10.1172/JCI32044


and their in vitro characterization as inhibitors of protein-protein interaction. J. Med. Chem. 57, 5094–5111. doi: 10.1021/jm401733v


Melemedjian, O. K., Khoutorsky, A., Sorge, R. E., Yan, J., Asiedu, M. N., Valdez, A., et al. (2013). mTORC1 inhibition induces pain via IRS-1-dependent feedback activation of ERK. Pain 154, 1080–1091. doi: 10.1016/j.pain.2013.03.021



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Uttam, Wong, Price and Khoutorsky. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Exploring the Impact of Single-Nucleotide Polymorphisms on Translation

Francis Robert<sup>1</sup> and Jerry Pelletier<sup>1,2,3</sup>\*

<sup>1</sup> Department of Biochemistry, McGill University, Montreal, QC, Canada, <sup>2</sup> Department of Oncology, McGill University, Montreal, QC, Canada, <sup>3</sup> Rosalind and Morris Goodman Cancer Research Centre, McGill University, Montreal, QC, Canada

Over the past 15 years, sequencing of the human genome and The Cancer Genome Atlas (TCGA) project have led to comprehensive lists of single-nucleotide polymorphisms (SNPs) and gene mutations across a large number of human samples. However, our ability to predict the functional impact of SNPs and mutations on gene expression is still in its infancy. Here, we provide key examples to help understand how mutations present in genes can affect translational output.

Keywords: translation initiation, eIF4F, ribosome recruitment, SNP, genetic variant

#### Edited by:

Maritza Jaramillo, University of Quebec, Canada

#### Reviewed by:

Jun Yasuda, Miyagi Cancer Center, Japan Eric Londin, Thomas Jefferson University, United States Tommy Alain, University of Ottawa, Canada

#### \*Correspondence:

Jerry Pelletier jerry.pelletier@mcgill.ca

#### Specialty section:

This article was submitted to RNA, a section of the journal Frontiers in Genetics

Received: 13 August 2018 Accepted: 10 October 2018 Published: 30 October 2018

#### Citation:

Robert F and Pelletier J (2018) Exploring the Impact of Single-Nucleotide Polymorphisms on Translation. Front. Genet. 9:507. doi: 10.3389/fgene.2018.00507

#### SEQUENCE VARIATION AND GENE EXPRESSION

In the last two decades, advances in genome sequencing have provided unprecedented access to the human genome landscape and enabled documentation of sequence variations among individuals. Humans share 99.5% identity at the genomic sequence level, implying that the resulting phenotypic diversity stems from the remaining 0.5% difference as well as from epigenetic modifications. Sequence differences arise due to the presence of short and variable number tandem repeats, insertion or deletion polymorphisms, and single-nucleotide polymorphisms (SNPs) (Mccarroll et al., 2006; Orr and Chanock, 2008). Among SNPs, transitions (A ↔ G or C ↔ T) are more prevalent than transversions (A ↔ C or T; and G ↔ C or T). There are at least 10 million SNPs within the genome, occurring approximately every 100–300 base pairs and with an allele frequency greater than 1%, making these by far the most common variant type in the human genome (Risch, 2000; Lander et al., 2001; Orr and Chanock, 2008). Recently, there has been a surge in genome-wide association studies (GWAS), in which the prevalence of specific SNPs is linked to phenotypes or disease (Srinivasan et al., 2016). In addition, The Cancer Genome Atlas (TCGA) has identified sequence variations between tumor and normal cells, and the current challenge is distinguishing mutations that exert effects on gene expression to drive tumor evolution from irrelevant passenger mutations.
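The distinction between transitions and transversions drawn above can be expressed as a small sketch: a substitution is a transition when both alleles are purines (A, G) or both are pyrimidines (C, T), and a transversion otherwise. The function name is hypothetical, and the example assumes single-base DNA alleles only.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Classify a single-nucleotide substitution as 'transition'
    (purine<->purine or pyrimidine<->pyrimidine) or 'transversion'
    (purine<->pyrimidine)."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("alleles must be one of A, C, G, T")
    if ref == alt:
        raise ValueError("ref and alt are identical; not a substitution")
    pair = {ref, alt}
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"


if __name__ == "__main__":
    for ref, alt in (("A", "G"), ("C", "T"), ("A", "C"), ("G", "T")):
        print(f"{ref}>{alt}: {classify_substitution(ref, alt)}")
```

Of the twelve possible ordered substitutions, only four are transitions under this definition, so the higher prevalence of transitions noted in the text is not what equal substitution rates would produce.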

Mutations have the potential to alter all steps of gene expression depending on their genomic location. When present within transcriptional regulatory elements, they can affect mRNA expression. When arising in genes, SNPs can affect mRNA splicing, nucleo-cytoplasmic export, stability, and translation. When present within a coding sequence and leading to an amino acid change (referred to as a non-synonymous SNP or mutation), they can modify the protein's activity. If the mutation is synonymous (i.e., does not change the identity of the amino acid), then translation rates or mRNA half-life may be affected. If the mutation causes a premature stop codon, this can lead to the production of a truncated protein product or a near-null phenotype due to nonsense-mediated decay (Mendell and Dietz, 2001; Nicholson et al., 2010). The Encyclopedia of DNA Elements (ENCODE)<sup>1</sup> project aims to identify and catalog functional elements in the human genome and has been quite useful in understanding the potential impact that sequence variations exert on gene expression (Consortium, 2012). However, the functional consequences of sequence variants that occur within the mRNA 5′ leader [i.e., the region upstream of the initiator codon of the main open reading frame (ORF)] and 3′ untranslated regions (UTRs) (i.e., the region downstream of the major ORF stop codon) are not always immediately obvious and often not characterized. Here, we provide some thoughts on how such variants could affect mRNA translation efficiency. We highlight the individual steps of translation that sequence variants can affect, citing selected examples where appropriate; these are summarized in **Table 1**.
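The coding-sequence categories described above (synonymous, non-synonymous, and premature stop) can be illustrated with a short sketch that compares a reference and a variant codon using the standard genetic code. The function name and the category labels are hypothetical choices for this example; the sketch assumes an in-frame single codon and says nothing about effects on splicing, mRNA stability, or translation rate.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_coding_variant(ref_codon: str, alt_codon: str) -> str:
    """Classify a codon change as 'synonymous', 'premature stop', or
    'non-synonymous'. RNA-style input (U instead of T) is accepted."""
    ref = ref_codon.upper().replace("U", "T")
    alt = alt_codon.upper().replace("U", "T")
    ref_aa, alt_aa = CODON_TABLE[ref], CODON_TABLE[alt]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "premature stop"
    return "non-synonymous"


if __name__ == "__main__":
    for ref, alt in (("GAA", "GAG"), ("GAG", "GTG"), ("TGG", "TGA")):
        print(ref, "->", alt, ":", classify_coding_variant(ref, alt))
```

A "synonymous" result here only means the encoded amino acid is unchanged; as the text notes, such variants may still affect translation rates or mRNA half-life.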

### AN OVERVIEW OF EIF4F-DEPENDENT RIBOSOME RECRUITMENT

Mammalian protein synthesis is predominantly regulated at the step of translation initiation, with the rate-limiting step being the recruitment of ribosomes to mRNAs (**Figure 1A**; Sonenberg and Hinnebusch, 2009). The key mediator of this step is the eIF4F complex. The eIF4E subunit binds to the mRNA cap structure present on all eukaryotic cytoplasmic mRNAs. The eIF4G component has RNA binding domains and stabilizes the eIF4E:cap interaction (Marcotrigiano et al., 2001; Yanagiya et al., 2009). RNA structural elements are resolved by the eIF4A DEAD-box RNA helicase in conjunction with the RNA binding proteins eIF4B and/or eIF4H (**Figure 1A**). The 43S pre-initiation complex (PIC; the 40S ribosomal subunit and associated factors) is then recruited to the mRNA template via bridging interactions between eIF4G and ribosome-bound eIF3 (Sonenberg and Hinnebusch, 2009). This mode of initiation is referred to as cap- or eIF4E-dependent. The requirement for eIF4F differs among mRNAs and appears to scale with the degree of 5′ leader secondary structure (Pickering and Willis, 2005; Bitterman and Polunovsky, 2015; Hinnebusch et al., 2016). Since eIF4E is thought to be limiting for translation, mRNAs must compete for access to eIF4F, and those with structural barriers in their 5′ leader region are at a disadvantage (Pelletier and Sonenberg, 1985; Babendure et al., 2006). Hence, altering the secondary structure landscape within the mRNA 5′ leader region can significantly affect translational efficiency by changing ribosome recruitment rates (Pelletier et al., 2015). Once bound, the 43S PIC scans the mRNA 5′ leader region in search of an initiation codon.

A second mechanism by which the 43S PIC can be recruited to mRNA templates is through direct internal binding within the 5′ leader region to an internal ribosome entry site (IRES), obviating the requirement for the cap structure. Best characterized among these are the viral IRESes, which have been stratified into four classes based on structural similarities and on initiation factor and/or IRES trans-acting factor requirements (Mailliot and Martin, 2018). Some exceptional IRESes, such as the cricket paralysis virus IRES, bypass the need for any initiation factors and can directly bind to the ribosome.

The discovery and characterization of IRESes in cellular mRNAs is of significant interest since they have been implicated in allowing translation to proceed under conditions in which cap-dependent translation is impaired, such as stress, apoptosis, nutrient limitation, and mitosis (Komar and Hatzoglou, 2015). Cellular IRES function is therefore thought to be important for rapid adaptation to a quickly changing environment, with selective translational effects being the outcome. The influence of SNPs on cellular IRES activity could affect the response to stresses such as hypoxia, heat shock, toxins, or drugs (chemotherapy). In addition, mutations in cellular IRESes could lead to aberrant translational responses that drive a number of pathological disorders, including autoimmune disease, neurodegeneration, and cancer (Holcik and Sonenberg, 2005).

### CHANGES IN SECONDARY STRUCTURE AFFECTING TRANSLATIONAL OUTPUT

By impeding eIF4F-cap interactions or ribosome scanning, structural features (e.g., stem-loops, RNA-protein complexes, G-quadruplexes) can act as barriers to translation initiation and negatively affect translational efficiency (Pelletier and Sonenberg, 1985; Babendure et al., 2006). A study by Shen et al. (1999) was one of the first to document the extensive impact that SNPs can have on mRNA secondary structure (**Table 1**). Analysis of two SNPs within the coding regions of mRNAs encoding alanyl tRNA synthetase and replication protein A uncovered allele-specific structural features that affected sequence accessibility. Evidence that such changes can affect translational output was provided by a study assessing the influence of G-quadruplex structures present in 5′ leader regions on translation (Beaudoin and Perreault, 2010). A SNP (G to C change) was identified at position 7 of a G-quadruplex (a critical region for G-quadruplex formation) within the 5′ leader of the AASDHPPT (aminoadipate-semialdehyde dehydrogenase-phosphopantetheinyl transferase) mRNA (Beaudoin and Perreault, 2010). Biophysical experiments showed that G-quadruplex formation was disrupted, and this was associated with a 1.5-fold increase in protein levels in cells, with no effect on mRNA levels (Beaudoin and Perreault, 2010). These experiments indicate that point mutations in 5′ leader regions that alter secondary structure can affect translational output.
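To illustrate how a single G to C change can remove a potential G-quadruplex, the sketch below screens a sequence with a commonly used motif heuristic (four runs of at least three guanines separated by loops of 1–7 nucleotides). This heuristic is an assumption made here and is not necessarily the criterion used by Beaudoin and Perreault (2010); the two sequences are invented for illustration and are not the AASDHPPT leader.

```python
import re

# Heuristic G-quadruplex motif: G3+ N1-7 G3+ N1-7 G3+ N1-7 G3+ (assumption).
G4_PATTERN = re.compile(
    r"G{3,}[ACGU]{1,7}G{3,}[ACGU]{1,7}G{3,}[ACGU]{1,7}G{3,}"
)


def has_g4_motif(rna: str) -> bool:
    """Return True if the sequence contains a candidate G-quadruplex motif."""
    return G4_PATTERN.search(rna.upper().replace("T", "U")) is not None


# Hypothetical 5' leader fragments differing by one G to C change in a G-run.
ref_leader = "CAGGGAUGGGCUGGGAAGGGUC"
alt_leader = "CAGGGAUGCGCUGGGAAGGGUC"

print(has_g4_motif(ref_leader))  # True
print(has_g4_motif(alt_leader))  # False
```

A motif match only flags a candidate; whether a G-quadruplex actually forms still has to be confirmed biophysically, as was done in the study cited above.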

Secondary structure immediately downstream of the AUG can also affect translational output. For example, an A to G synonymous SNP at a Leu codon, present in the coding region of the catechol-O-methyltransferase (COMT) mRNA, was identified in subjects with high pain sensitivity and at greater risk of developing temporomandibular joint disorder (Diatchenko et al., 2005; Nackley et al., 2006). The COMT protein is responsible for catecholamine degradation and is a regulator of pain perception. In humans, three major haplotypes are formed by four SNPs at the COMT locus: one located in the promoter and three in the coding region [two synonymous at his62his (C/T) and leu136leu (C/G) and one non-synonymous val158met (A/G)] (Nackley et al., 2006). It was reported that the major COMT haplotypes varied with respect to local mRNA secondary structure with the most stable structure associated with the lowest levels of protein production (Nackley et al., 2006). Site-directed mutagenesis disrupting the structural element caused an increase in protein production. The authors did not, however, directly assess the effect of the different haplotype sequences on mRNA translation rates.

<sup>1</sup>https://www.encodeproject.org/

#### TABLE 1 | Summary of the SNPs described in this study.

Conversely, secondary structure can also act to increase start codon recognition when appropriately located downstream of initiation codons, an effect presumably due to the slowing of scanning ribosomes and increased codon sampling time (Kozak, 1991). Hence, sequence changes that increase the formation of structure in the AUG downstream proximal region could increase AUG utilization rates.

### 5′ LEADER SEQUENCE VARIATION AND START SITE SELECTION

#### SNPs Affect Start Codon Recognition

The mechanism by which 43S PICs locate an initiation codon has consequences on how SNPs that generate new, or remove existing, start codons affect translation initiation. The sequence context of an initiation codon dictates the efficiency by which it is recognized by scanning 43S PICs (**Figure 1B**). The optimal context is A/GxxAUGg, referred to as the Kozak consensus sequence, with the −3 purine (relative to the A of the AUG) being the most important determinant (Kozak, 1986, 1987a). Mutations that change this context are predicted to affect start site selection efficiency.
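The two context positions discussed above (the −3 purine and the +4 G) can be checked programmatically. The sketch below uses a three-level labeling (strong, adequate, weak) that is a common convention but is an assumption here, not a classification taken from the studies cited; the function name `kozak_strength` and the example sequences are hypothetical.

```python
def kozak_strength(mrna: str, aug_index: int) -> str:
    """Label the context of the AUG starting at aug_index (0-based).

    Positions follow the usual numbering: the A of the AUG is +1, so the
    -3 base sits three nucleotides upstream and +4 directly follows the G.
    """
    seq = mrna.upper().replace("T", "U")
    if seq[aug_index:aug_index + 3] != "AUG":
        raise ValueError("no AUG at the given index")
    minus3 = seq[aug_index - 3] if aug_index >= 3 else ""
    plus4 = seq[aug_index + 3] if aug_index + 3 < len(seq) else ""
    purine_at_minus3 = minus3 in ("A", "G")
    g_at_plus4 = plus4 == "G"
    if purine_at_minus3 and g_at_plus4:
        return "strong"
    if purine_at_minus3 or g_at_plus4:
        return "adequate"
    return "weak"


# Hypothetical sequences: the second differs by an A to C change at -3.
print(kozak_strength("GCCGCCACCAUGGCG", 9))  # strong
print(kozak_strength("GCCGCCCCCAUGGCG", 9))  # adequate
```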

There are many examples of mutations that alter the AUG sequence context and thereby affect translational efficiency. One such example is a mutation within the BRCA1 gene, identified in a 35-year-old patient, that converts a G to C at the −3 position relative to the BRCA1 AUG initiation codon (Signori et al., 2001). This mutation changes an optimal purine to a less favorable pyrimidine and has been linked to sporadic breast and ovarian cancers (Hall et al., 1990; Szabo and King, 1995). In vitro and in vivo expression studies of reporter mRNAs harboring the C allele showed a 30–50% decrease in protein expression compared to control mRNAs harboring the G allele. In addition, the transcript carrying the G allele was associated with heavier polysomes (and hence elevated translation rates) compared to the C allele containing mRNA.

The NCBI SNP database has been mined for the presence of variants spanning AUG initiation codons, with a focus on the −3 and +4 positions (Xu et al., 2010). This study identified SNPs in >45 genes that occurred at one of these two critical positions and could thus potentially affect AUG utilization. The variants of two genes were tested by transfection of reporter constructs into cells, which revealed that mRNAs harboring SNPs that created "weaker" or "stronger" Kozak consensus sequences produced reduced or elevated protein levels, respectively (Xu et al., 2010). No differences in mRNA levels were noted.

#### SNPs Creating an In-Frame uAUG

Mutations that generate start codons upstream, and in frame with, the major initiation codon of an open reading frame will "catch" some scanning 43S PICs and redirect protein synthesis to the new start site to produce N-terminal extended protein products (**Figure 1B**, see "In-frame uAUG"). The efficiency with which this is achieved is dictated, in part, by the context surrounding the new initiation codon.

One bioinformatics tool that has been developed for categorizing the effects of variants on genome function is SnpEff (Cingolani et al., 2012). This tool annotates variants based on genomic location, with categories that include intronic, untranslated region, splice site, intergenic, and non-synonymous coding, among others (Cingolani et al., 2012). Among the effects listed by SnpEff are changes in initiation codons (AUG and the less common CUG and UUG codons) that occur in the 5′ leader region. Of 297 SNPs that generated a new translation initiation codon in the 5′ leader region when comparing two Drosophila melanogaster strains, ∼25% were in the same reading frame as the major ORF (Cingolani et al., 2012) and would produce N-terminally extended polypeptides.

#### SNPs Creating an uORF Out-of-Frame With the Major ORF

If a mutation generates a new start codon out-of-frame with the major ORF AUG, some 43S PICs may initiate at the new upstream (u) ORF and bypass the major ORF, resulting in a decrease in protein production from the major ORF-encoded product (**Figure 1B** and **Table 1**). The extent of re-routing will depend, in part, on the context of the novel initiation codon as well as AUG proximal secondary structure (Kozak, 1991; Barbosa et al., 2013).
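Whether an upstream AUG is in-frame or out-of-frame with the major ORF follows directly from its distance to the main AUG. The sketch below is a minimal illustration; the function name `classify_uaugs`, the tuple layout it returns, and the example sequences are assumptions made here and are not drawn from SnpEff or any cited study.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def classify_uaugs(mrna: str, main_aug_index: int):
    """List each upstream AUG as (position, frame, stop_before_main).

    frame is 'in-frame' when the distance to the main AUG is a multiple of
    three. stop_before_main is True when a stop codon in the uAUG's reading
    frame ends before the main AUG, i.e. the uAUG opens a self-contained uORF.
    """
    seq = mrna.upper().replace("T", "U")
    if seq[main_aug_index:main_aug_index + 3] != "AUG":
        raise ValueError("no AUG at main_aug_index")
    results = []
    for i in range(main_aug_index):
        if seq[i:i + 3] != "AUG":
            continue
        frame = "in-frame" if (main_aug_index - i) % 3 == 0 else "out-of-frame"
        # Walk codon by codon from the uAUG, stopping short of the main AUG.
        stop_before_main = any(
            seq[j:j + 3] in STOP_CODONS
            for j in range(i, main_aug_index - 2, 3)
        )
        results.append((i, frame, stop_before_main))
    return results


# Hypothetical mRNAs; the second has one extra C that shifts the frame.
print(classify_uaugs("GCAUGGCUCCGCCACCAUGGCG", 16))   # [(2, 'out-of-frame', False)]
print(classify_uaugs("GCAUGGCUCCGCCCACCAUGGCG", 17))  # [(2, 'in-frame', False)]
```

An in-frame uAUG with no intervening stop codon corresponds to the N-terminal extension case, whereas an out-of-frame uAUG with no stop before the main AUG corresponds to an uORF that overlaps the major ORF.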

As an example of such a scenario, a germline mutation in the β-globin gene 26 nt upstream of the initiator AUG codon leads to the creation of a new, out-of-frame uAUG (Oner et al., 1991; Cai et al., 1992). This uAUG is in a favorable sequence context (A in the −3 position) and initiation at this uAUG shunts ribosomes past the authentic AUG, reducing β-globin production and leading to β-thalassemia. Whether a particular 5′ leader mutation also affects mRNA stability, and whether this contributes to the phenotype, needs to be carefully assessed.

A similar scenario has been documented in the GTP cyclohydrolase 1 gene (GCH1), in which heterozygous mutations are associated with Dopa-responsive dystonia (DRD) (Armata et al., 2013). Here, a germline C to T transition 22 nt upstream of the translation start site generates a novel start codon that is out-of-frame with the downstream GCH1 AUG codon and results in reduced GCH1 production (Armata et al., 2013). It will be important to extend these results to: (i) formally demonstrate that the C to T transition leads to translation of the newly created uORF (an assessment that can be made by ribosome footprinting) and (ii) demonstrate that the C to T alteration leads to changes in endogenous GCH1 protein output.

The impact that this class of mutations can have on tumor biology is significant and is exemplified by the identification of a germline mutation in the CDKN2A tumor suppressor gene mapping 34 nucleotides upstream of the normal start site (Liu et al., 1999; Orlow et al., 2007). In this case, a G to T mutation creates a novel initiation codon residing in a favorable context but out-of-frame with the CDKN2A AUG. The T allele thus generates an mRNA that produces less CDKN2A and substantially increases the risk of melanoma in carriers (Liu et al., 1999; Orlow et al., 2007).

#### SNPs Creating an uORF Upstream of the Major ORF

If the presence of a SNP leads to creation of a new uORF, this may affect gene expression by: (i) affecting re-initiation efficiency at the downstream major ORF, and (ii) generating a novel peptide encoded by the uORF (Barbosa et al., 2013). The precise mechanism by which 40S ribosomes are able to resume protein synthesis after having translated an uORF is not well defined but is related to the length of the uORF (the longer the uORF, the less efficient the re-initiation process) as well as the presence of structural barriers in the uORF (which reduce re-initiation potential) (**Figure 1B**; Kozak, 1987b; Abastado et al., 1991; Poyry et al., 2004). It is thought that initiation factors critical for re-initiation remain ribosome-bound for some time following commencement of elongation, but at some point are lost or ejected from the translating ribosome. If termination of translation occurs before these factors are lost, then that ribosome maintains its ability to re-initiate (Poyry et al., 2004). An analysis of 11,649 matched mRNA and protein measurements from four published mammalian studies has indicated that the presence of uORFs within the 5′ leader region is generally associated with reduced expression from the major ORF (Calvo et al., 2009).

FIGURE 1 | Overview of ribosome recruitment and scanning. (A) Cap-dependent translation initiation. The eIF4F complex, in conjunction with eIF4B and eIF4H, serves to prepare the mRNA for 43S ribosomal complex recruitment. (B) Impact of uAUGs and uORFs on ribosome scanning. When bound to the mRNA, the 43S PIC (in light blue) scans the mRNA in search of an initiator AUG. An AUG codon in a favorable context is efficiently recognized by the scanning 40S subunit, at which point a 60S subunit will join and elongation begins. Mutations creating novel uAUGs or uORFs will influence the frequency of ribosomes that initiate at the major ORF AUG codon. The position of an uORF relative to the major AUG codon is important in determining major AUG utilization, since the distance between the uORF stop codon and the major AUG dictates the time available for a ribosome to re-acquire an eIF2•GTP•Met-tRNA ternary complex. (C) A G/A SNP in the ERCC5 mRNA 5′ leader region controls expression and response to stress. The A allele containing mRNA has an additional uORF which allows for more efficient ERCC5 main ORF translation under situations when eIF2α is phosphorylated. See text for details.

The repressive nature of a newly created uORF can, in part, stem from the reduced efficiency associated with translation re-initiation compared to de novo, cap-dependent translation initiation. Calvo et al. (2009) undertook a search for ORF-altering nucleotide variants within 12 million SNPs present in dbSNP<sup>2</sup>. They found a number of novel and previously described polymorphisms predicted to create new, or remove existing, uORFs. For example, mutations within the 5′ leader region of the SRY (Poulat et al., 1998) and SPINK1 (Witt et al., 2000) mRNAs introduced novel uORFs upstream of the major ORF. Testing of reporters with different 5′ leaders showed that those harboring uORFs produce less major ORF protein compared to reporters expressing control, wild-type 5′ leader sequences. In general, the occurrence of a new uORF is associated with a 30–80% decrease in protein synthesis from the major ORF (Calvo et al., 2009).

Mutations that lead to the loss of an uORF can increase translation output. Analysis of 404 uORFs present in the 5′ leaders of mRNAs encoding 83 tyrosine kinases and 49 other proto-oncogenes in 308 human malignancies uncovered uORF mutations in the EPHB1 and MAP2K6 genes (Schulz et al., 2018). In the case of EPHB1, a mutation changed the only uAUG found in the 5′ leader to a GUG codon, while the sole uAUG of MAP2K6 was modified to an ACG codon. Both of the identified mutations led to an increase in translational output from their respective mRNAs. This was complemented by a computational analysis of whole exome sequencing data from 464 colon cancers, which revealed somatic mutations leading to the loss of 22 uORF initiation and 31 uORF termination codons (Schulz et al., 2018).
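The gain or loss of an upstream start codon by a substitution, as in the examples above, can be detected by comparing the uAUG positions of the two alleles. The following is a minimal sketch that handles single-nucleotide substitutions only; the function names and the short leader sequences are hypothetical and do not reproduce the EPHB1 or MAP2K6 leaders.

```python
def upstream_augs(leader: str) -> set:
    """Return the 0-based positions of all AUG triplets in a 5' leader."""
    seq = leader.upper().replace("T", "U")
    return {i for i in range(len(seq) - 2) if seq[i:i + 3] == "AUG"}


def compare_alleles(ref_leader: str, alt_leader: str) -> dict:
    """Report uAUGs gained or lost between two leaders of equal length."""
    if len(ref_leader) != len(alt_leader):
        raise ValueError("this sketch only handles substitutions")
    ref_sites = upstream_augs(ref_leader)
    alt_sites = upstream_augs(alt_leader)
    return {
        "gained": sorted(alt_sites - ref_sites),
        "lost": sorted(ref_sites - alt_sites),
    }


# Hypothetical leaders: an AUG to GUG change removes the only uAUG.
print(compare_alleles("CCGAUGCCGUAAGCC", "CCGGUGCCGUAAGCC"))
# {'gained': [], 'lost': [3]}
```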

The presence of an uORF has also been shown to confer resistance to cisplatin exposure by facilitating translation of the major ORF encoded polypeptide under stress conditions (Somers et al., 2015; **Figure 1C**). The ERCC5 gene encodes an endonuclease that cleaves 3′ of DNA adducts and is required for nucleotide excision repair. The mRNA 5′ leader region harbors an uORF. There is a G/A polymorphism, rs751402, located upstream of this uORF where the A allele containing mRNA has a novel uORF, but the G allele containing mRNA does not. Treatment of cells with cisplatin leads to induction of a stress response, resulting in phosphorylation of eIF2α and a longer persistence in translation of the A allele mRNA (Somers et al., 2015). Whereas eIF2α phosphorylation is generally associated with a global shut down of protein synthesis, the translational output from some mRNAs is paradoxically increased due to the uORF configuration within their 5′ leader regions.

eIF2 is required for ternary complex formation (with initiator Met-tRNA and GTP). When the eIF2α subunit becomes phosphorylated, ternary complex formation becomes rate-limiting, resulting in a global shut down of general translation. Ribosomes that re-initiate following the translation of an uORF must recruit ternary complexes de novo, and increasing the distance from the uORF to the next downstream AUG codon allows more time for that event to take place (**Figure 1C**). In the case of the ERCC5 A allele-containing mRNA, the creation of an uORF makes it such that, under stress, most ribosomes that have completed translation of the A-encoded uORF will not re-acquire another ternary complex before having scanned past uORF2 (and hence uORF2 will not be translated), but will do so before reaching the ERCC5 ORF (**Figure 1C**). The creation of new uORFs and their location within the 5′ leader region can thus alter how translation of specific mRNAs responds to signaling cues.

#### SNPs Affecting an uORF Coding Region

Mutations arising within the coding region of uORFs have the potential to exert two types of effects on translation – by affecting the nature of an encoded regulatory peptide and by altering elongation rates.

If they perturb the function of a regulatory peptide involved in dictating ribosome re-initiation rates, then they can affect the output from the major ORF. Such might be the case for a G/A SNP in the 5′ leader of the transforming growth factor β3 (TGFβ3) mRNA that is present in several members of a large pedigree with arrhythmogenic right ventricular cardiomyopathy type 1 (Beffagna et al., 2005). The TGFβ3 mRNA 5′ leader contains 11 AUGs potentially encoding 11 polypeptides (Arrick et al., 1991). The G/A SNP does not alter uORF configuration but rather causes an Arg to His substitution at codon 36 of an 88 amino acid uORF that is out of frame and overlaps with the sequence of the TGFβ3 main AUG. When tested in the context of a luciferase reporter assay in transfected C2C12 myoblast cells, the presence of the A variant led to a 2.5-fold increase in luciferase production (Beffagna et al., 2005). A similar situation was reported for the serotonin receptor gene, HTR3A, where a C/T SNP located in the second uORF caused a Pro to Ser change in individuals with bipolar affective disorder (Niesler et al., 2001). The authors tested the consequences of this SNP in a luciferase-based transfection assay and found that the T allele caused a 2.5- to 2.9-fold increase in luciferase expression without altering mRNA levels. One interpretation of these results is that the uORF-encoded peptide plays an inhibitory role in translation and that the amino acid substitution impairs the activity of this small polypeptide. In both of the aforementioned studies, potential effects of the SNP on splicing need to be assessed to rule out other possible explanations for the observed effects.

Alternatively, if variants influence uORF elongation rates, then they can influence the potential for re-initiation at downstream AUG codons (Jackson et al., 2012; Gunisova et al., 2018). Slowing down elongation rates of ribosomes transiting the uORF is thought to increase the probability that initiation factors associated with elongating ribosomes, and necessary for re-initiation, will be released before completion of uORF translation. This would then lead to decreased re-initiation at downstream AUG codons. Conversely, if elongation rates within the uORF are increased, this might lead to increased re-initiation rates and protein production from downstream ORFs.

<sup>2</sup>https://www.ncbi.nlm.nih.gov/projects/SNP/

### 5′ SNPs AND IRES ACTIVITY

Another manner by which sequence changes within 5′ leader regions have been documented to alter translation is by affecting IRES activity. The c-Myc (MYC) proto-oncogene has been reported to harbor an IRES which may contribute to translation mis-regulation of MYC during tumorigenesis (Stoneley et al., 1998). Interestingly, a point mutation within the MYC 5′ leader region leading to a C to T transition was identified in a multiple myeloma cell line and associated with elevated MYC protein levels (Paulin et al., 1998). The 5′ leader harboring the T allele showed enhanced binding of several RNA binding proteins, as revealed by Northwestern blotting and UV crosslinking approaches. The same C to T mutation was found in 42% of primary multiple myeloma samples and generated an IRES variant that appeared to be more active (Chappell et al., 2000). The underlying mechanism for how the C/T change can lead to alterations in IRES activity awaits further definition.

### 5′ SNPs, TRANSCRIPTION INITIATION SITE SELECTION, AND ALTERNATIVE SPLICING

Single-nucleotide polymorphisms present within gene regulatory regions can affect transcription factor as well as RNA Pol II binding (Kasowski et al., 2010). If RNA Pol II binding is redirected to a newly formed site, this could lead to usage of alternative transcription initiation sites, generating mRNAs with differing 5′ leader sequences that could affect translation initiation rates.

Sequence variation in the 5′ leader region can also occur through alternative mRNA splicing to produce isoforms with different translational efficiency. The presence of SNPs that affect alternative splicing can change the levels and nature of the resulting mRNA isoforms. For example, thrombopoietin (TPO) is a master regulator of megakaryopoiesis and platelet production and is under tight translational control. Its 5′ leader has seven uORFs (Ghilardi et al., 1998). A SNP has been identified that increases TPO serum levels in patients with hereditary thrombocythaemia (Wiestner et al., 1998), a genetic disorder characterized by elevated platelet levels due to sustained proliferation of megakaryocytes (Murphy et al., 1997). Specifically, a G → C transversion at the splice donor site of TPO intron 3 is responsible for generating a shortened 5′ leader where uORF 7, as well as the main AUG, is lost (Wiestner et al., 1998). Translation initiation thus occurs at the next downstream AUG and leads to a fully functional, although truncated, TPO protein product that is produced at much higher levels than from the normal mRNA. This effect appears to be the result of increased translation, presumably through effects on the re-initiation process, since the SNP did not affect RNA levels (Wiestner et al., 1998). Whether SNPs affect splicing or transcription can only be assessed through sequence characterization of mRNA 5′ leader regions, an analysis that is all too frequently omitted.

### 5′ SNPs AND RNA BINDING PROTEIN TARGET SITES?

Altering RNA binding protein target sites is another way in which SNPs could affect translation. By measuring the ratio of polysome- to monosome-bound mRNAs in immortalized lymphoblastoid cell lines, a genome-wide search for SNPs affecting translational efficiency was undertaken (Li et al., 2013). This study found that a SNP within the 5′ leader region of the small ribosomal protein S26 mRNA (rs1131017: C/G located 22 nucleotides upstream of the initiator AUG codon) was associated with altered protein production. Reporter mRNAs harboring the G variant produced more protein than mRNAs having the C variant. This SNP is in high linkage disequilibrium with the 12q13 locus for susceptibility to type 1 diabetes. It interrupts a polypyrimidine tract (…<sup>−28</sup>TCTCCT[C/G]TCTCC<sup>−17</sup>…) upstream of the rpS26 AUG codon. Whether this alters the binding of an RNA binding protein, such as polypyrimidine tract-binding protein (which has been implicated in translation initiation), remains to be determined (Kaminski and Jackson, 1998).
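The way the G allele interrupts the polypyrimidine tract can be seen by measuring the longest uninterrupted pyrimidine run in each allele of the 12-nucleotide window quoted above. This is a simple sketch; the function name `longest_pyrimidine_run` is introduced here for illustration, and run length is only a rough proxy that does not predict protein binding.

```python
def longest_pyrimidine_run(seq: str) -> int:
    """Return the length of the longest uninterrupted run of C/T/U."""
    longest = current = 0
    for base in seq.upper():
        if base in "CTU":
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest


# The -28 to -17 window of the rpS26 leader as written in the text.
c_allele = "TCTCCT" + "C" + "TCTCC"
g_allele = "TCTCCT" + "G" + "TCTCC"

print(longest_pyrimidine_run(c_allele))  # 12
print(longest_pyrimidine_run(g_allele))  # 6
```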

### SNPs AND ELONGATION RATES

The information contained within mRNAs that encode the proteome is specified by 61 possible sense codons. Codons encoding the same amino acid are decoded by cognate tRNAs, which are not equally expressed in cells. It is generally thought that codon decoding rates can vary as a function of tRNA abundance and that this can have dramatic effects on elongation rates (Cannarozzi et al., 2010; Hanson and Coller, 2018). This has been borne out by ribosome footprinting data and by experiments where translational output has been increased simply by replacing rare codons with more frequent ones (Gardin et al., 2014; Lareau et al., 2014; Hussmann et al., 2015; Weinberg et al., 2016). However, rare codons are thought to play important roles in cellular homeostasis since stretches of rare codons induce ribosome pausing during elongation, and this provides time for proper protein folding (Hanson and Coller, 2018). Thus, a SNP changing a rare codon to a more common one could, in principle, increase protein output but decrease the proportion of functional (i.e., correctly folded) polypeptide synthesized.

An example of how codon usage could affect protein activity is provided by a study assessing the impact of a synonymous SNP (C3435T) present within the multidrug resistance 1 (MDR1) coding region on protein function (Kimchi-Sarfaty et al., 2007). The MDR1 gene encodes an ATP-driven drug efflux pump that contributes to drug resistance in tumor cells. The C3435T SNP had been previously associated with reduced MDR1 expression and function in human cells (Hoffmeyer et al., 2000; Drescher et al., 2002). This SNP changes the most common Ile codon (AUC) to a less prevalent one (AUU). Reporter constructs harboring the C or T variants show similar protein expression levels, but produce products with different activity (Kimchi-Sarfaty et al., 2007). Trypsin digestion experiments revealed that the MDR1 products from the two different haplotypes differ in their protease sensitivities, indicating distinct conformations. Conversion of the Ile codon to an even rarer one, AUA, generated an mRNA that produced MDR1 protein with even less drug transport activity.

A similar phenomenon was observed for the cystic fibrosis transmembrane conductance regulator (CFTR) gene, in which a T2562G synonymous SNP in the coding region was found to reduce protein levels by 30% without affecting mRNA levels or splicing (Kirchner et al., 2017). This SNP changed a threonine codon from the highly prevalent ACU sequence to the rarer ACG codon. A CFTR expression vector bearing the G allele showed reduced single-channel Cl<sup>−</sup> conductance function compared to a T allele expressing vector (Kirchner et al., 2017). The authors concluded that a slower synthesis rate from the G allele encoded mRNA resulted in improper protein folding that targeted CFTR for degradation by the quality-control machinery (Kirchner et al., 2017). The reduced protein levels from the G allele mRNA were rescued by transfection of an expression vector driving synthesis of the SNP-corresponding cognate tRNA (Kirchner et al., 2017).
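Both examples above involve swapping one codon for a synonymous alternative. The sketch below, which assumes the standard genetic code, lists the synonymous alternatives for a given codon; the function name `synonymous_codons` is hypothetical. It deliberately contains no codon usage frequencies, since those are organism- and tissue-specific and would have to come from a measured usage table.

```python
from itertools import product

# Standard genetic code in RNA letters, codons enumerated in U, C, A, G order.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons(codon: str) -> list:
    """Return the other codons that encode the same amino acid."""
    codon = codon.upper().replace("T", "U")
    amino_acid = CODON_TABLE[codon]
    return sorted(
        c for c, aa in CODON_TABLE.items() if aa == amino_acid and c != codon
    )


print(synonymous_codons("AUC"))  # ['AUA', 'AUU']  (Ile, as in the MDR1 example)
print(synonymous_codons("ACU"))  # ['ACA', 'ACC', 'ACG']  (Thr, as in CFTR)

# Number of sense (non-stop) codons in the standard code.
print(sum(aa != "*" for aa in CODON_TABLE.values()))  # 61
```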

### SEQUENCE VARIATION IN 3′ UTRs AFFECTING TRANSLATION

With the exception of histone mRNAs, cellular mRNAs have poly(A) tails at their 3′ end. The poly(A) tail is important for translation initiation and its function is mediated by the poly(A) binding protein, PABPC1. PABPC1 also interacts with eIF4G at the 5′ end of the mRNA to create an mRNA closed loop that is thought to stimulate translation by: (i) stabilizing the association of eIF4F with the cap, (ii) stimulating 60S ribosomal subunit binding, and (iii) increasing the effective concentration of terminating ribosomes in proximity of the cap structure. SNPs that mutate the polyadenylation signal can lead to the generation of isoforms with longer 3′ ends due to usage of downstream polyadenylation sites (Thomas and Saetrom, 2012). If the extended sequence results in the acquisition of novel microRNA (miRNA) binding sites, then regulation of the new mRNA isoform can be quite different from that of the wild-type mRNA (Sandberg et al., 2008).

In addition, mutations that occur within miRNA target sites and alter miRNA recognition can exert effects on mRNA expression through reduced translation initiation and increased mRNA degradation (Mohr and Mott, 2015). The last decade has seen an extensive list of SNPs that map to miRNAs or their putative binding sites within mRNAs and that could potentially affect miRNA response; these have been comprehensively reviewed (Detassis et al., 2017; Moszynska et al., 2017). For example, a G to A SNP has been described in miR-1269, a miRNA linked to increased risks of hepatocellular carcinoma (Min et al., 2017). SPATS2L and LRP6 encode pro-oncogenic activities and are both targets of miR-1269. This study showed that when the miR-1269 A variant is expressed in cells, the repressive effect on SPATS2L and LRP6 is diminished, compared to the miR-1269 G variant. SNPs in microRNA target sites on mRNAs have also been documented. For example, an A to G SNP in the 3′ UTR of TOMM20 mRNA was found to be associated with greater risks of colorectal cancer (Lee et al., 2016). The microRNA miR-4273-5p was identified as being responsible for controlling the levels of TOMM20.
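A first-pass check of whether a 3′ UTR SNP disrupts a miRNA target site is to test for a perfect match to the miRNA seed (taken here as nucleotides 2–8, a common convention and an assumption of this sketch). The miRNA and UTR sequences below are illustrative only and are not those of miR-1269, miR-4273-5p, or TOMM20; dedicated tools such as those cited in the Conclusion use more elaborate models.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed_site(mirna: str) -> str:
    """Return the mRNA sequence complementary to miRNA nucleotides 2-8."""
    seed = mirna.upper().replace("T", "U")[1:8]
    # The target site is the reverse complement of the seed.
    return seed.translate(COMPLEMENT)[::-1]


def has_seed_match(utr: str, mirna: str) -> bool:
    """Return True if the UTR contains a perfect seed-complementary site."""
    return seed_site(mirna) in utr.upper().replace("T", "U")


# Illustrative miRNA and two hypothetical UTR alleles differing by one base.
mirna = "UGAGGUAGUAGGUUGUAUAGUU"
ref_utr = "AAGCUACCUCAGG"
alt_utr = "AAGCUACGUCAGG"

print(seed_site(mirna))                # CUACCUC
print(has_seed_match(ref_utr, mirna))  # True
print(has_seed_match(alt_utr, mirna))  # False
```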

There are several examples of 3′ UTR RNA binding proteins that can affect mRNA translation, both at the initiation and elongation steps (Szostak and Gebauer, 2013; Yamashita and Takeuchi, 2017). The best example is 4EHP (also known as eIF4E2), a cap binding protein known to also interact with specific mRNA-bound proteins present within the mRNA 3′ UTR. 4EHP thus forms a closed-loop structure and, since it does not interact with eIF4G, this prevents ribosomes from being recruited to the cap structure and exerts mRNA-selective inhibition of translation (Morita et al., 2012; Szostak and Gebauer, 2013; Chapat et al., 2017; Yamashita and Takeuchi, 2017). SNPs affecting RNA binding proteins that interact with 4EHP could lead to alterations in expression of a specific set of 4EHP-responsive mRNAs.

### CONCLUSION

Whereas significant effort has been placed on finding and annotating SNPs that can affect protein function using programs such as SIFT (Kumar et al., 2009) and PolyPhen (Adzhubei et al., 2010; Li and Wei, 2015), there is a recognized need for bioinformatics tools that can predict potential functional consequences of SNPs in mRNA 5′ leaders and 3′ UTRs (Kumar et al., 2014). Advances have been made regarding (i) software that predicts the effects of SNPs on miRNA targets, with programs such as microSNiPer (Barenboim et al., 2010) and mrSNP (Deveci et al., 2014), (ii) identification of translation initiation sites using ATGpr, and (iii) ORF prediction software such as ORF Finder. What is now needed are tools like SnpEff that could link changes in 5′ leader and 3′ UTR sequences to predictions on major ORF expression. A better understanding of the variables involved in determining mRNA translation efficiency will help design algorithms with more quantitative predictive power.

Much has been learnt from the functional analysis of genetic variants within mRNA 5′ leaders and their effects on translation. The majority of these were identified because they were associated with an observable phenotype. The lesions whose effects are easiest to predict are those affecting initiation codon context or leading to the generation of novel uAUGs. However, it is those whose effects remain unexplained that will likely lead to the uncovering of new biological mechanisms. For example, during a search for oncogenic changes associated with prostate cancer, Wang et al. (2009) identified a G to A somatic mutation that mapped within the δ-catenin 5′ leader region, nine nucleotides upstream of the AUG codon. The presence of the A allele in reporter mRNAs resulted in a threefold to sevenfold increase in protein expression relative to mRNAs harboring the G allele, with no effect on mRNA levels noted. The mechanism underlying this translational stimulation is unknown but points to some very interesting biology. It also underscores the need to carefully consider the functional consequence of 5′ leader mutations uncovered by large scale cancer genome sequencing projects and their potential role in affecting translational output. This is currently difficult to do systematically due to deficiencies in our ability to predict RNA structural complexity, as well as a lack of knowledge on the RNA binding protein (RBP) landscape in vivo. Genome-wide RNA structure probing approaches, as well as efforts aiming to define the RBP interactome, are being undertaken to fill this void (Castello et al., 2013; Tenzer et al., 2013; Bevilacqua et al., 2016; Bisogno and Keene, 2018).

### AUTHOR CONTRIBUTIONS

All authors drafted and wrote the review.

### FUNDING

Research in JP's laboratory is supported by the Canadian Institutes of Health Research (CIHR FDN#148366).






**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Robert and Pelletier. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The Impact of Post-transcriptional Control: Better Living Through RNA Regulons

#### Biljana Culjkovic-Kraljacic\* and Katherine L. B. Borden\*

Institute for Research in Immunology and Cancer, Department of Pathology and Cell Biology, University of Montreal, Montreal, QC, Canada

Traditionally, cancer is viewed as a disease driven by genetic mutations and/or epigenetic and transcriptional dysregulation. While these are undoubtedly important drivers, many recent studies highlight the disconnect between the proteome and the genome or transcriptome. At least in part, this disconnect arises as a result of dysregulated RNA metabolism, which underpins the altered proteomic landscape observed. Thus, it is important to understand the basic mechanisms governing post-transcriptional control and how these processes can be co-opted to drive cancer cell phenotypes. In some cases, groups of mRNAs that encode proteins involved in specific oncogenic processes can be co-regulated at multiple processing levels in order to turn on entire biochemical pathways. Indeed, the RNA regulon model was postulated as a means to understand how cells coordinately regulate transcripts encoding proteins in the same biochemical pathways. In this review, we describe some of the basic mRNA processes that are dysregulated in cancer and the biological impact this has on the cell. This dysregulation can affect networks of RNAs simultaneously, thereby underpinning the oncogenic phenotypes observed.

#### Keywords: RNA regulon, USER code, RBP, cancer, eIF4E, SRSF3, UNR

#### Edited by:

Chiara Gamberi, Concordia University, Canada

#### Reviewed by:

Fátima Gebauer, Centre de Regulació Genòmica (CRG), Spain Scott A. Tenenbaum, University at Albany, United States Jack D. Keene, Duke University, United States

#### \*Correspondence:

Biljana Culjkovic-Kraljacic biljana.culjkovic@umontreal.ca Katherine L. B. Borden katherine.borden@umontreal.ca

#### Specialty section:

This article was submitted to RNA, a section of the journal Frontiers in Genetics

Received: 15 July 2018 Accepted: 12 October 2018 Published: 05 November 2018

#### Citation:

Culjkovic-Kraljacic B and Borden KLB (2018) The Impact of Post-transcriptional Control: Better Living Through RNA Regulons. Front. Genet. 9:512. doi: 10.3389/fgene.2018.00512

## OVERVIEW

High-throughput studies revealed that the transcriptome does not always predict the proteome (Lu et al., 2006; Vogel et al., 2010; Zhang et al., 2014), highlighting the need for a better understanding of post-transcriptional regulation in order to explain this discrepancy. Post-transcriptional regulation is comprised of a complex and diverse set of processes that represent various maturation steps and regulatory modalities for mRNAs including (but not limited to): splicing, mRNA export, stability, polyadenylation, and translation (Keene, 2007, 2010).

This complexity gives rise to the question: How does the cell coordinate metabolism and regulation of mRNAs encoding proteins in the same biological process so that the proteins can be coordinately produced? In answer to this question, Keene and colleagues proposed the RNA regulon model (Keene and Tenenbaum, 2002; Keene and Lager, 2005; Keene, 2007), where mRNAs encoding functionally related proteins (i.e., involved in the same biochemical processes) contain the same RNA elements, known as USER codes (Untranslated Sequence Elements for Regulation). USER codes can be based on primary, secondary or tertiary elements in the RNA. These USER codes are recognized by RNA binding proteins (RBPs) or regulatory RNAs (such as microRNAs, siRNAs, or snRNAs) which can recruit mRNAs to various machineries for appropriate types of processing (Imig et al., 2012; Blackinton and Keene, 2014; Wurth and Gebauer, 2015). Typically, a given mRNA contains multiple USER codes which would enable coordinated and combinatorial regulation. The combinatorial effect of the USER codes and the context (the sequence context which can influence folding of neighboring USER codes and availability of RBPs and regulatory RNAs) will ultimately affect which kind of machinery will be recruited to a particular mRNA. In this way, the RNA regulon serves as an elegant model to understand how groups of mRNAs can be co-regulated in combination as they flux through the various RNA metabolism steps ultimately allowing coordinated production of their physiologically active forms, proteins.
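The bookkeeping behind the regulon model can be sketched in a few lines of Python. This is a toy illustration only: the mRNA and element names are hypothetical placeholders, and the sketch ignores sequence context and RBP availability, which the model treats as decisive. It simply groups transcripts by the USER codes they carry, so that a transcript with several USER codes belongs to several regulons at once.

```python
from collections import defaultdict

# Hypothetical annotation: mRNA name -> set of USER codes found in it.
user_codes = {
    "mRNA_A": {"element_1", "element_2"},
    "mRNA_B": {"element_1"},
    "mRNA_C": {"element_2", "element_3"},
}


def group_into_regulons(annotation):
    """Map each USER code to the set of mRNAs that carry it."""
    regulons = defaultdict(set)
    for mrna, codes in annotation.items():
        for code in codes:
            regulons[code].add(mrna)
    return dict(regulons)


regulons = group_into_regulons(user_codes)
# mRNA_A carries two USER codes, so it appears in two regulons.
for code in sorted(regulons):
    print(code, sorted(regulons[code]))
```

In this simplified picture, changing the activity of the factor that reads `element_1` would affect mRNA_A and mRNA_B together, while mRNA_A would additionally respond to whatever reads `element_2`.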

RNA regulons are inherently dynamic, and enable cells to adapt to environmental stresses and cues in a rapid and effective manner. Operation and control of regulons are mediated through targeting RBPs which act as nodes or center-points for these networks. Factors that modulate the localization or activity of these RBPs or that modify the USER codes (such as RNA methylation) ultimately influence the activity of a given regulon. A key control step is the interaction between specific RBPs and their cognate USER codes in the groups of RNAs to be regulated. We suggest that some transcripts may require a two-tier system of USER codes that allows their correct channeling to the appropriate machinery, and we provide examples of single- and multi-tier systems as a launch point for this notion.

Havoc ensues when RNA regulons become dysregulated, contributing to a variety of diseases including cancer. Dysregulation of regulons can occur because of dysregulation of RBPs or mutation in the USER codes. Consistent with this, RBPs involved in all levels of mRNA metabolism were found dysregulated or mutated in cancers (Kechavarzi and Janga, 2014; Dvinge et al., 2016; Carey and Wickramasinghe, 2018; Seiler et al., 2018; Urbanski et al., 2018; Wang et al., 2018). Further, many oncogenic pathways involved in malignant transformation, metastasis and drug resistance are regulated by various RNA regulons (Corbo et al., 2013; Blackinton and Keene, 2014; Ye and Blelloch, 2014; Wurth and Gebauer, 2015; Bisogno and Keene, 2018; Tan et al., 2018). In this review, we focus on the eukaryotic translation initiation factor eIF4E, the splicing factor SRSF3 and the Upstream of N-Ras protein (UNR), as examples of RNA regulons which contribute to malignancy. Further, these provide examples of different modalities in terms of the employment of regulatory factors and USER codes, single or multi-tier USER code systems and the diverse levels of mRNA metabolism that can be affected.

#### THE EUKARYOTIC TRANSLATION INITIATION FACTOR eIF4E

eIF4E is traditionally defined as a factor key to global translation initiation. eIF4E binds the 5′ 7-methylguanosine (m7G) cap on RNAs to recruit these to the translation machinery, thereby increasing the number of polysomes per transcript, i.e., their translation efficiency. Over time it has become clear that eIF4E regulates the translation of only a subset of capped transcripts (Clemens and Bommer, 1999; De Benedetti and Graff, 2004; Truitt et al., 2015). For instance, eIF4E overexpression increases the translation of ornithine decarboxylase (Odc1) and Myc mRNAs but not that of Gapdh or Cyclin D1 (Rousseau et al., 1996); conversely, eIF4E reduction suppresses translation of Odc1, Myc, Bcl-2, Edn1 (Endothelin-1), and Fth1 (Ferritin heavy chain) but not of β-Actin or Gapdh (Graff et al., 1997; De Benedetti and Graff, 2004; Truitt et al., 2015). In addition, 25 years ago eIF4E was found to localize to the nucleus as well as the cytoplasm; in the nucleus, it plays a role in the export of selected transcripts (Lejbkowicz et al., 1992; Rousseau et al., 1996). In this way, eIF4E can increase the levels of transcripts available to the translation machinery, and thus protein levels, in the absence of increased translation efficiency or increased RNA levels. More recently, ∼10 years ago, eIF4E was found in cytoplasmic P-bodies, which appear to be involved in protecting RNAs from turnover (Andrei et al., 2005; Ferraiuolo et al., 2005). Not all mRNAs are targeted by these pathways and, further, being an eIF4E target for one level of regulation does not a priori confer sensitivity to the other processes. While eIF4E associates with mRNAs through binding the common m7G cap structure, other USER codes act in recruiting necessary co-factors to dispatch mRNAs to the specific export, translation and/or stability machinery. Thus, eIF4E serves as an excellent example of a two-tier (or perhaps multi-tier) USER code system, as described below.

There are multiple USER codes defined for export and translation to date. The ∼50 nucleotide eIF4E sensitivity element (4ESE) in the 3′UTR, which is required for export of eIF4E target transcripts, is one of the best understood eIF4E USER codes. The 4ESE is defined by its secondary structure, comprised of paired stem loops as determined by nuclease mapping experiments, and is necessary for export. For instance, lacZ-4ESE chimeric mRNAs are sensitive to eIF4E-dependent mRNA export while lacZ is not (Culjkovic et al., 2005, 2006). At the translation level, USER codes are less well defined but can be found in either the 5′ or 3′UTRs of mRNAs. The 5′UTRs of mRNAs that are eIF4E-sensitive at the translational level tend to be long and GC-rich, i.e., with complex tertiary structure, and this comprises the translation USER codes (Hoover et al., 1997; Clemens and Bommer, 1999; Larsson et al., 2006). Other sequences have been identified, such as the CERT (Cytosine-Enriched Regulator of Translation) (Truitt et al., 2015), but further studies are needed to determine if this is sufficient to drive translation. Importantly, for both mRNA export and translation, eIF4E targets must also retain the m7G cap. Thus, there is a two-tier USER code system, with the m7G cap for eIF4E:mRNA binding and a 4ESE or translation USER code which directs mRNAs to their particular post-transcriptional machineries (**Figure 1A**).
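The two-tier logic described above can be caricatured in Python as follows. This is a deliberately simplified sketch, not a predictor. The `Transcript` class, the function names and the two thresholds are hypothetical, since the text gives no numerical cut-offs for "long" or "GC-rich". The sketch treats the m7G cap as the first tier and either a 4ESE or a long, GC-rich 5′UTR as the second tier.

```python
from dataclasses import dataclass


@dataclass
class Transcript:
    name: str
    capped: bool    # tier 1: m7G cap present
    has_4ese: bool  # tier 2 for export: 4ESE in the 3'UTR
    utr5: str       # 5'UTR sequence, used as a proxy for tier 2 in translation


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in seq (0.0 for an empty sequence)."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq) if seq else 0.0


# Arbitrary placeholder thresholds, chosen only so the example runs.
MIN_UTR5_LEN = 20
MIN_UTR5_GC = 0.6


def predicted_levels(t: Transcript) -> set:
    """Return the levels of eIF4E control suggested by the two-tier rule."""
    if not t.capped:  # without the first tier, no second tier is considered
        return set()
    levels = set()
    if t.has_4ese:
        levels.add("export")
    if len(t.utr5) >= MIN_UTR5_LEN and gc_fraction(t.utr5) >= MIN_UTR5_GC:
        levels.add("translation")
    return levels


demo = Transcript("demo", capped=True, has_4ese=True, utr5="GC" * 15)
print(sorted(predicted_levels(demo)))  # ['export', 'translation']
```

The point of the sketch is that the two outputs are decided independently once the cap is present, which mirrors the observation that being an eIF4E target at one level does not imply sensitivity at another.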

Biochemical studies of the eIF4E-mRNA export complex elucidated the mechanisms by which the 4ESE directs mRNAs to this level of control (Volpon et al., 2017). Here, the Leucine-rich Pentatricopeptide Repeat Protein (LRPPRC) simultaneously binds both the 4ESE USER code in the 3′UTR of mRNA and eIF4E bound to the mRNA through the cap. Then, the nuclear export receptor CRM1 binds this complex through direct interactions with LRPPRC. In this way, the USER code recruits the export machinery to the given mRNA directing it through this non-canonical export pathway. In the cytoplasm, eIF4E interacts with an alternative set of proteins to act in either translation or recruitment of mRNAs to P-bodies; whether there is a USER code for P-bodies is not yet known (Andrei et al., 2005; Shatsky et al., 2014).

**FIGURE 1 |** … mRNAs. In the cytoplasm, long, highly structured, typically GC-rich regions in the 5′UTR of target mRNAs serve as USER codes for translation and are recognized by co-factors which enhance recruitment of the eIF4F complex and initiation of translation. There are other elements, such as CERT, which can also be USER codes for translation. (B) eIF4E coordinately enhances mRNA export and/or translation of many oncogenic mRNAs involved in biological processes implicated in cancer development and metastases. Circles indicate the level of regulation these RNAs are subject to: mRNA export (pink) and/or translation (blue). Note that sensitivity of targets can change depending on cell type.

Through these activities eIF4E can elicit biological responses (**Figure 1B**). For instance, RIP-Seq analysis in lymphoma cells indicated that nuclear eIF4E binds over 3000 mRNAs that encode proteins acting in lymphoma-sustaining pathways such as B-cell receptor signaling (Bcl2, Bcl6) and DNA methylation/epigenetic regulation (DNMT1, DNMT3A, HDAC1) (Culjkovic-Kraljacic et al., 2016). In AML and osteosarcoma cells, eIF4E coordinately increases the export of transcripts encoding all the proteins involved in hyaluronan synthesis (Zahreddine et al., 2017). Hyaluronan (HA) is a large polysaccharide with traditional roles in building the extracellular matrix, and more recently was found to encapsulate some tumor cells (Setala et al., 1999; Auvinen et al., 2000; Kemppainen et al., 2005). HA production was found to be required for the metastatic and invasive properties associated with eIF4E, and this serves as the first case where the HA coat was shown to contribute to the oncogenic phenotype (Zahreddine et al., 2017). Inhibition of this regulon with RNAi to eIF4E or treatment with the cap competitor ribavirin impaired the export of the RNAs encoding the HA machinery, reduced HA production and decreased the invasive and metastatic activities of these cells. Moreover, RNAi knockdown of Has3 (hyaluronan synthase 3) mRNA in eIF4E-overexpressing cells similarly reduced invasion and metastatic potential, indicating that the HA pathway is critical for these eIF4E-driven activities (Zahreddine et al., 2017).

eIF4E can also reprogramme the cellular machinery to enhance its mRNA export activity and its nuclear import, both of which are associated with an increase in its oncogenic potential. For instance, eIF4E alters the composition of the nuclear pore complex, allowing it to facilitate export of its target mRNAs (Culjkovic-Kraljacic et al., 2012b). Specifically, eIF4E overexpression leads to downregulation and relocalization of Nup358/RanBP2, redistribution of Nup214 from the nuclear rim and increased levels of RanBP1 through elevated mRNA export of RanBP1 transcripts. Reduction in RanBP2 with concomitant elevation of RanBP1 likely enhances efficiency of mRNA cargo release on the cytoplasmic side, thereby enhancing eIF4E mRNA export efficiency. The effects of eIF4E on RanBP2 are required for its oncogenic activities in vitro. eIF4E also enhances the mRNA export of Gle-1 and DDX19 mRNAs, which encode proteins acting in the release of bulk mRNA cargoes (Kendirgi et al., 2003; Culjkovic-Kraljacic et al., 2012a). Interestingly, even these workhorses of the bulk mRNA export pathway have additional functions in stress granule formation and translation (Aditi et al., 2015; Aryanpur et al., 2017; Mikhailova et al., 2017). Further, beyond common mRNA targets, these export regulators have their own distinct target transcripts, which results in differing cellular phenotypes observed upon their depletion (Okamura et al., 2018). In all, this provides an example of how eIF4E can rewire the nuclear pore to enhance export of its target transcripts while simultaneously modulating the machinery for bulk mRNA export.

One obvious way to alter the activity of a regulon is to alter the localization of its key components. eIF4E modulates its own subcellular localization through its interaction with and effects on Importin 8. Importin 8 directly binds and imports eIF4E into the nucleus, enabling eIF4E to be quickly recycled after each round of mRNA export (Volpon et al., 2016). Importin 8 only associates with eIF4E when eIF4E is not bound to capped mRNAs, providing an interesting surveillance mechanism to inhibit import of actively translating eIF4E or of eIF4E which has not yet released its mRNA cargo from an export cycle. Depletion of Importin 8 impairs nuclear entry of eIF4E, eIF4E-dependent mRNA export and oncogenic activities. eIF4E nuclear entry can also be impaired by addition of m7G cap analogs or ribavirin triphosphate (RTP). In this case, the cap or ribavirin analogs prevent association of eIF4E with Importin 8, correlating with reduced nuclear entry of eIF4E, reduced mRNA export and reduced oncogenic activity. Interestingly, Importin 8 also provides evidence of a feedback mechanism whereby eIF4E promotes the export of Importin 8 mRNAs to increase production of this protein and thus its own nuclear entry (Volpon et al., 2016). Thus, like its effects on the nuclear pore, eIF4E can modulate a variety of its control points and the machinery it engages.

eIF4E expression is also controlled by HuR/ELAVL1, a factor involved in many levels of RNA metabolism, the best described being mRNA stability. HuR increases the stability of eIF4E transcripts, thereby interconnecting the HuR/ELAVL1 and eIF4E regulons (Topisirovic et al., 2009). HuR is amongst the first RNA regulons to be described, and the eIF4E-HuR overlap provides a case whereby regulons intersect (Tenenbaum et al., 2000; Keene and Tenenbaum, 2002). Indeed, many mRNA stability targets of HuR, such as cyclin D1, are also mRNA export targets of eIF4E (Rousseau et al., 1996; Tenenbaum et al., 2000).

It is also interesting to note that eIF4E can directly contact RNAs beyond the m7G cap (Borden, 2016). As described above, the sequence context can alter the activity of a USER code. For instance, a 4ESE-like element found in the coding region of histone mRNAs recruited eIF4E in a cap-independent manner (Martin et al., 2011). Although the affinity of eIF4E for the 4ESE is lower than for the m7G cap, this interaction is important for translation of non-replicative histone H4. In the nucleus, it seems that the ability of eIF4E to bind the 4ESE in the 3′UTR might be used to inhibit export of uncapped mRNAs, and in this way acts as a surveillance mechanism (Volpon et al., 2017). Another type of USER code is the Cap-Independent Translational Elements (CITEs) found in the 3′UTR of plant viruses, such as the translation enhancers (PTE) of Panicum mosaic virus and Pea enation mosaic virus 2, and the I-shaped structures (ISS) from Maize necrotic spot and Melon necrotic spot viruses (Miras et al., 2017). The PTE directly binds eIF4E and initiates translation without using the m7G cap (Miras et al., 2017). In all, there are multiple USER codes to engage eIF4E and, further, the same USER code in different contexts can have alternative functions.

Coordinated regulation implies that nodes in RNA regulons could also be valuable therapeutic targets as well as important control points for regulation of normal cellular physiology. eIF4E expression is elevated in a wide variety of cancers (De Benedetti and Graff, 2004; Borden and Culjkovic-Kraljacic, 2010). The first clinical studies targeting eIF4E in humans used ribavirin, a cap competitor of eIF4E, and thus an inhibitor of all of eIF4E's cap-dependent activities (Kentsis et al., 2004, 2005; Volpon et al., 2013). These studies led to clinical responses including remissions in refractory and relapsed AML patients (Assouline et al., 2009, 2015), patients with prostate cancer (Kosaka et al., 2017), lymphoma (Rutherford et al., 2018), and head and neck cancers (Dunn et al., 2018). Consistent with these clinical observations, eIF4E activity was impaired and levels of eIF4E target proteins were reduced in responding AML patients (Assouline et al., 2009, 2015). Indeed, AML patients have highly elevated nuclear levels of eIF4E, consistent with elevated Importin 8 levels (Volpon et al., 2016). In AML patients, ribavirin therapy was associated with reduced nuclear levels of eIF4E and impaired RNA export during response; and at relapse, eIF4E nuclear levels increased as did its mRNA export activity (Assouline et al., 2009). In this way, reprogramming the eIF4E regulon by preventing nuclear entry led to therapeutic benefit at least in this context.

### THE SERINE AND ARGININE RICH SPLICING FACTOR 3 SRSF3

SRSF3 (also known as SRp20) provides another example of a protein which turns out to function beyond its traditional roles. SRSF3 associates with the spliceosome and was thought to act in the splicing of all intron-containing RNAs (Corbo et al., 2013). However, recent identification of SRSF3 targets using iCLIP-seq (individual-nucleotide resolution crosslinking and immunoprecipitation sequencing) suggests that specific transcripts are targeted by this factor rather than all intron-containing mRNAs (Ratnadiwakara et al., 2018). SRSF3 controls establishment and maintenance of pluripotency through its functions in alternative splicing and 3′ end mRNA processing, mRNA export and mRNA stability (Ohta et al., 2013; Cieply et al., 2016; Ratnadiwakara et al., 2018). For instance, SRSF3 increases the export of Nanog mRNA, which encodes one of the master regulators of pluripotency maintenance.

According to iCLIP studies, SRSF3 binds a consensus pentanucleotide element found in RNA segments including exons and introns of both coding and non-coding transcripts. Many pre-mRNAs encoding pluripotency factors contain SRSF3 binding-sites, including Nanog, Sox2, Klf4, and Myc, and their levels were downregulated in SRSF3-depleted cells (Ratnadiwakara et al., 2018). SRSF3 also binds mRNAs encoding various RBPs with previously established roles in pluripotency and reprogramming, including the MBNL2 splicing factor (Han et al., 2013) and the polyadenylation factor FIP1 (Lackford et al., 2014). Indeed, RNAi knockdown of SRSF3 led to failure to induce pluripotency in OKSM MEFs (OCT4, KLF4, SOX2, and Myc overexpressing Mouse Embryonic Fibroblasts) as well as loss of pluripotency and differentiation in iPSC (induced pluripotent stem cells), indicating that this regulon is important for cell reprogramming and maintenance of pluripotency.

Beyond the role of SRSF3 in splicing, ∼400 transcripts were predicted to be SRSF3 nuclear export targets, including Nanog mRNA, which encodes a key factor in stem cell pluripotency (Muller-McNicoll et al., 2016). This export activity of SRSF3 occurred even in intronless Nanog constructs, indicating that this was a splicing-independent activity of SRSF3. Further, deletion of the SRSF3 binding sites impaired the ability of the bulk mRNA export factor NXF1 to bind Nanog mRNA, suggesting that SRSF3 association is required to form this export complex (Ratnadiwakara et al., 2018). Consistent with this notion, NXF1/TAP directly binds SRSF3 proteins (Huang et al., 2003; Muller-McNicoll et al., 2016).

SRSF3 affects alternative splicing of many RNAs, including its own, and its depletion increases exon skipping and intron retention (Anko, 2014; Ratnadiwakara et al., 2018). Interestingly, a significant proportion of SRSF3 consensus binding-sites were found in introns of target mRNAs, including detained introns (DI). Indeed, SRSF3 is involved in retention of Nxf1 intron 10, affecting isoform expression and potentially impacting on the export of many mRNAs (Li et al., 2016; Muller-McNicoll et al., 2016; Ratnadiwakara et al., 2018). DIs with SRSF3 consensus sequences were found in mRNAs encoding other RBPs, including Fip1l1 and Mbnl2. Further, nearly half of NMD-regulated transcripts contained SRSF3-binding sites, suggesting that this factor could also play a role in mRNA stability (Ratnadiwakara et al., 2018). However, further studies are needed as its effects may be limited to distinct NMD-sensitive transcript variants.

Only a single USER code, or one tier system, has been reported for SRSF3 despite the fact it recruits mRNAs to different machineries. The features that allow recruitment to the appropriate machinery are not yet known, so it is possible that a second USER code(s) is required. More studies into the minimal domains required to imbue SRSF3 sensitivity are important to understand how this USER code enables recruitment of different complexes to act in splicing, export and/or stability (**Figure 2**).

Through its role as a center-point in an RNA regulon, SRSF3 has been implicated in cellular senescence, cell adhesion and migration, proliferation, resistance to apoptosis, as well as establishment and maintenance of pluripotency (**Figure 2**). For instance, Nanog, Sox2, Klf4, and Myc are SRSF3 targets (Ratnadiwakara et al., 2018). SRSF3 regulates the global chromatin state of pluripotent cells by controlling mRNAs coding chromatin modifiers such as components of Polycomb repressive complex 2 (PRC2), Ezh2, and Epop (Zhang et al., 2011) and DNA methyl-transferase 3A (Dnmt3a), also involved in gene silencing (Ratnadiwakara et al., 2018). Additionally, by regulating other RBPs (FIP1, MBNL2, NXF1) and its own mRNA, SRSF3 is part of an interconnected network which coordinately regulates the pluripotency gene expression program. SRSF3 also regulates FoxM1 transcripts (Forkhead box transcription factor M1, a transcriptional regulator involved in regulation of cell cycle and proliferation), and the transcriptional targets of FOXM1 including Cdc25B (member of the CDC25 family of phosphatases, required for mitosis) and PLK1 (Polo like kinase 1, highly expressed during mitosis, and frequently elevated in cancers) to control cell cycle progression and proliferation. Depletion of SRSF3 in cancer cells induced G2/M arrest, growth inhibition and apoptosis, while SRSF3 overexpression in rodent fibroblasts induced cell transformation and tumor formation and growth in nude mice (Jia et al., 2010). Additionally, through regulation of TP53 alternative splicing, SRSF3 is implicated in cellular senescence. Indeed, downregulation of SRSF3 induced cellular senescence in human fibroblasts (Tang et al., 2013). All these activities can contribute to human diseases including cancer. Given its effects on cell physiology, it is not surprising that SRSF3 protein expression is elevated in a variety of cancers (Jia et al., 2010), while its mRNA levels are downregulated in de novo diagnosed AML patients (Liu et al., 2012), suggesting that SRSF3 levels could be crucial for maintaining normal cellular homeostasis in that context.

**FIGURE 2 |** … information about the first-tier motif, and not a priori provide information about the second tier involved in the recognition process. Through its effects on different levels of RNA metabolism, SRSF3 impacts cellular reprogramming and oncogenesis.

### UPSTREAM OF N-RAS UNR

Upstream of N-Ras, also known as CSDE1 in mammals, is an RBP comprised of five cold-shock domains which bind single-stranded RNAs (Mihailovich et al., 2010). Global studies using iCLIP-Seq, RNA-Seq and ribosome profiling revealed that many target mRNAs and a wide variety of RNA processes are potentially impacted by UNR (Wurth et al., 2016). A majority of the 1532 RNAs found by iCLIP were mature mRNAs, with the UNR consensus binding-site most often located in the CDS or 3′UTR. Bioinformatic analysis suggested that UNR has a preference for unstructured and/or single-stranded RNAs. UNR binds its own mRNA at the 5′UTR, consistent with previously reported translational inhibition from its own IRES (Schepens et al., 2007). A comparison of the iCLIP and RNA-Seq data after UNR depletion indicated that there are ∼100 direct targets regulated by UNR at the stability level, with many of the mRNAs whose levels changed being indirect targets of UNR. While UNR does not affect global translation, ribosome profiling experiments revealed that UNR regulates specific transcripts preferentially (451 genes), with 127 of these being direct targets of UNR (Wurth et al., 2016). A subgroup of mRNAs regulated by UNR at the level of translational initiation showed preferential UNR binding in the 5′UTR, possibly representing novel IRESs given previously reported roles for UNR in IRES translation (Evans et al., 2003; Mitchell et al., 2003; Schepens et al., 2007). However, these studies suggested possible roles for UNR in elongation and termination of translation for the majority of these transcripts, with other stages of RNA metabolism possibly affected (Wurth et al., 2016).

Like SRSF3, UNR seems to use a single-tier strategy to associate with RNAs and modulate disparate steps in RNA processing. Interestingly, it can have opposing effects on the same processes, e.g., UNR inhibits translation of its own IRES (Schepens et al., 2007), but stimulates IRES translation for cMyc and Apaf-1 mRNAs (Evans et al., 2003; Mitchell et al., 2003). This suggests some context-specific features are also at play; whether these are RNA elements or protein co-factors is not yet known. Further, even with the same partner proteins like PABP, UNR can have disparate effects, such as c-fos mRNA decay (Chang et al., 2004), and translational repression of pabp mRNA (Patel et al., 2005). Studies in Drosophila showed that UNR binds its targets either alone, e.g., roX2 lncRNA (Militti et al., 2014), or with cofactors, as in the case of msl-2 mRNA where USER code recognition is achieved by cooperative complex formation with SXL proteins (Hennig et al., 2014; **Figure 3A**). Bioinformatic analyses suggest that there may be different binding modes for UNR depending on the location of the consensus motif within the transcript (Wurth et al., 2016). This suggests that UNR either binds several types of motifs or needs additional RBPs to aid in binding to mRNAs which do not contain UNR consensus binding sites (**Figure 3B**). Thus, UNR may well have a multi-tier system, at least for some mRNAs, to dispatch them to their appropriate pathway.

As expected of an oncogenic RNA regulon protein, UNR controls a series of RNAs involved in metastasis and invasion, particularly in melanoma (Wurth et al., 2016). UNR protein levels are elevated in a high percentage of primary and metastatic melanoma specimens and cell lines, and its depletion reduced the oncogenic potential of melanoma cells in vitro and in mice (Wurth et al., 2016). Overall, UNR is a major node in a melanoma regulon, where it is thought to regulate over 60% of the transcripts considered to be involved in development of this malignancy. Additionally, UNR is highly expressed in human embryonic stem cells where it coordinately regulates multiple nodes of networks essential for maintaining pluripotency (Ju Lee et al., 2017). UNR stimulates the translation of RAC1 (Ras-related C3 botulinum toxin substrate 1, guanosine triphosphatase belonging to the Ras superfamily), VIM (Vimentin, component of intermediate filaments important for mechanical integrity of cells during invasion, and also marker of epithelial-to-mesenchymal transition) and TRIO (Rho guanine nucleotide exchange factor which activates RAC1, implicated in uveal melanoma), and increases the stability of SDC4 (trans-membrane receptor which activates RAC1 to transduce signals from extracellular matrix to the cytoskeleton and modulate adhesion and migration), TNC (extracellular matrix protein which interacts with SDC4 and is involved in regulation of cell adhesion) and CTTN (Cortactin, actin binding protein, implicated in tumor cell invasion and metastasis). Overexpression of VIM and RAC1 can overcome UNR depletion and fully restore colony growth of melanoma cells (Wurth et al., 2016). UNR regulates the stability of the tumor suppressor PTEN and the inflammatory factor CCL2 transcripts, which are downstream effectors of c-Jun, a proto-oncogene hyperactivated in malignant melanoma. Thus, through its combinatorial effects on the melanoma pathway, UNR contributes to this oncogenic phenotype (**Figure 3B**).

**FIGURE 3 |** … pluripotency, in melanoma cells UNR enhances translation of the same mRNA without altering its steady-state levels. This is an example of the different effects of UNR on the same target depending on the context, where different sets of RBPs are most probably involved.

#### CONCLUSION

Here, we focussed on eIF4E, SRSF3, and UNR as examples of RNA regulons involved in cancer progression. There are clearly many other physiologically important regulons, such as those centered upon HuR and ARE elements (Tenenbaum et al., 2000; Keene and Tenenbaum, 2002; Mazan-Mamczarz et al., 2003; Tiruchinapalli et al., 2008; Bisogno and Keene, 2018), IFN response and GAIT elements (Anderson, 2010; Arif et al., 2018), and others which we could not cover due to space restrictions. The described regulons highlight not only their biological relevance, but also the utility of exploiting them therapeutically. RBPs acting in these regulons are mutated and/or aberrantly expressed in a variety of cancers (Xu and Powers, 2009; Culjkovic-Kraljacic and Borden, 2013; Hautbergue, 2017; Carey and Wickramasinghe, 2018; Urbanski et al., 2018). Disrupted RBP activity has been reported for nearly every step of mRNA metabolism, including splicing (such as U2AF1, SRSF2, ZRSR2, SF3B1, SRSF3), export (including THO, ALYREF, Luzp4, GANP, CRM1, eIF4E, SRSF3, UNR), the nuclear pore (e.g., Nup88, Nup96/98, Nup214, TPR), and translation (eIF4E, UNR, eIF4A, eIF3). Interestingly, mutations in spliceosome factors are frequent in hematological malignancies but rare in solid tumors (Dvinge et al., 2016; Carey and Wickramasinghe, 2018), highlighting their contextual importance in driving specific pathways in malignant transformation. Clearly, versatile modes of molecular recognition by RBPs are highly dependent on the context, where RNA structure complexity, available partner RBPs and cofactors, as well as potential inhibitors or modulators of binding (regulatory RNAs, signalling molecules, etc.), all contribute to the biological outcome. Indeed, depending on the cell type, CSDE1/UNR may promote or inhibit differentiation and apoptosis (Dormoy-Raclet et al., 2007; Elatmani et al., 2011; Horos et al., 2012). Thus, deeper insight into the workings of regulon networks in healthy and malignant cells could provide information on critical nodes that can be exploited in cancer.

From the RNA biology perspective, the utilization of the same USER codes and their reader RBPs in multiple complexes suggests that RBPs become escorts for mRNAs with certain USER code(s). In this way, RBPs can act in multiple steps in RNA metabolism by virtue of their function as defined by the recognition of specific RNA binding motifs. RBPs may thus be much broader actors in RNA metabolism, thereby facilitating the wiring of RNA regulons in the cell. Further, given the RNA world theory, while it has been posited that RNA regulons can recapitulate transcriptional programs, it is possible that RNA regulons came first. Interestingly, analysis of ancestral stem cells revealed that RBPs are more evolutionarily conserved than transcription factors, suggesting that RNA regulons have played a key role in animal stem cell biology for millions of years, even playing roles in sponges and premetazoans (Alie et al., 2015). Indeed, RNA regulons are employed by single-celled organisms such as yeast and across kingdoms, being present in plants as well as animals (Keene and Tenenbaum, 2002; Chinnusamy et al., 2008). Further dissection of the regulons themselves and their intricate feedback systems will undoubtedly be central in developing our understanding of oncogenesis.

#### AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

#### FUNDING

This work was supported by grants from LLS Canada, LLS USA, CIHR, and NIH to KLBB. She holds a Canada Research Chair.






**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Culjkovic-Kraljacic and Borden. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Sorting mRNA Molecules for Cytoplasmic Transport and Localization

#### Nathalie Neriec<sup>1</sup> and Piergiorgio Percipalle<sup>1,2</sup>\*

<sup>1</sup> Biology Department, New York University Abu Dhabi, Abu Dhabi, United Arab Emirates, <sup>2</sup> Department of Molecular Biosciences, The Wenner-Gren Institute, Stockholm University, Stockholm, Sweden

In eukaryotic cells, gene expression is highly regulated at many layers. Nascent RNA molecules are assembled into ribonucleoprotein complexes that are then released into the nucleoplasmic milieu and transferred to the nuclear pore complex for nuclear export. RNAs are then either translated or transported to the cellular periphery. Emerging evidence indicates that RNA-binding proteins play an essential role throughout RNA biogenesis, from the gene to polyribosomes. However, the sorting mechanisms that regulate whether an RNA molecule is immediately translated or sent to specialized locations for translation are unclear. This question is highly relevant during development and differentiation, when cells acquire a specific identity. Here, we focus on the RNA-binding properties of heterogeneous nuclear ribonucleoproteins (hnRNPs) and how these properties are believed to play an essential role in RNA trafficking in polarized cells. Further, by focusing on the specific hnRNP protein CBF-A/hnRNPab and its naturally occurring isoforms, we propose a model of how hnRNP proteins are capable of regulating gene expression both spatially and temporally throughout the RNA biogenesis pathway, impacting both healthy and diseased cells.

Keywords: mRNA transport and localization, hnRNP proteins, protein-RNA binding, G4 quadruplex, oligodendrocytes, neurons, spermatogenic cells

#### INTRODUCTION

A fascinating question in gene expression regulation is how, from the onset of transcription, cells direct mRNA molecules toward degradation, localization, storage, and/or translation. Several decades of mRNA biology have shown that regulation primarily happens at the level of ribonucleoprotein (RNP) particles, composed of RNA molecules and RNA-Binding Proteins (RBPs) (Dreyfuss et al., 2002). Within RNP particles, the protein composition evolves as the RNA is synthesized and matured. Different sets of RBPs join nascent RNP particles at specific steps of mRNA synthesis and maturation, such as splicing or nuclear export, whereas others accompany the mRNA from the onset of transcription all the way to translation. One of the most intriguing aspects is, therefore, to understand how and why protein-RNA interactions are established from gene to polyribosomes (or polysomes), and whether and how they lead to specific fates for the mRNA.

In this mini-review, we concentrate on two key steps in mRNA regulation by focusing on a representative of a large family of RBPs, the heterogeneous nuclear ribonucleoprotein ab (hnRNPab), also referred to as CBF-A (CArG box-binding factor A). After a brief review of the different stages of mRNA biogenesis, we will address the role of hnRNPab in the formation and integrity of RNP particles, and in the regulation of the translatability of the carried mRNA. Finally, we will discuss the relevance of those mechanisms in cell specification and development.

#### Edited by:

Pascal Chartrand, Université de Montréal, Canada

#### Reviewed by:

Alexander F. Palazzo, University of Toronto, Canada Yaron Shav-Tal, Bar-Ilan University, Israel

#### \*Correspondence:

Piergiorgio Percipalle pp69@nyu.edu

#### Specialty section:

This article was submitted to RNA, a section of the journal Frontiers in Genetics

Received: 19 June 2018 Accepted: 12 October 2018 Published: 06 November 2018

#### Citation:

Neriec N and Percipalle P (2018) Sorting mRNA Molecules for Cytoplasmic Transport and Localization. Front. Genet. 9:510. doi: 10.3389/fgene.2018.00510

### mRNA BIOGENESIS FROM THE GENE TO POLYSOMES

### Nascent Transcripts and Nuclear Organization

mRNA biogenesis is fundamentally affected by the organization of the cell nucleus. During differentiation, tissue-specific promoters are switched on or off to consolidate specific cellular identities, and this coincides with changes in the localization of genes within the nucleus. Actively transcribed genes are believed to be located in a chromosome domain that borders the interchromosomal space, the perichromatin region. In that region, gene-rich chromosome loops, characterized by decondensed chromatin, project into the DNA-depleted interchromosomal space (Cremer et al., 2015; **Figure 1**). Although not all studies agree with the existence of the interchromatin space (Branco and Pombo, 2006), work on the polytene chromosomes of the dipteran insect Chironomus tentans has provided in situ evidence of RNP particles decorating chromosome loops and being released into the interchromatin space after maturation (Daneholt, 1997; Daneholt, 2001; Percipalle et al., 2001). Recently, Cremer et al. (2015), reviewing the literature from imaging to electron microscopy, proposed a formalized nomenclature for the architectural organization of the nucleus. In the model, there are two coaligned three-dimensional networks termed Active and Inactive Nuclear Compartments (ANC and INC, respectively) (Cremer et al., 2015; Hübner et al., 2015). The INC contains the silenced chromatin, whereas the ANC, divided into the perichromatin region and the interchromosomal space, contains the active DNA regions. In this model, the nucleus is represented as a sponge-like structure where the INC is perforated with channels of interchromosomal space connecting adjacent nuclear pores. The linings of those channels constitute the perichromatin regions, where the contents of the interchromosomal space (including transcription factors and RBPs) can interact with the active unpacked DNA (Cremer et al., 2015; Hübner et al., 2015; **Figure 1**).

In the above model, the perichromatin region becomes its own nuclear subcompartment where transcription and cotranscriptional events take place (**Figure 1B**), acting as a hub for chromatin remodelers and histone-modifying enzymes to maintain the open chromatin state required for transcription. At the onset of transcription, nascent transcripts exiting the RNA polymerase machinery promote recruitment of RBPs. hnRNP proteins are believed to be among the first RBPs to bind the nascent transcript, protecting it from degradation and facilitating cotranscriptional RNP assembly. The protein composition of an RNP particle depends on the specific mRNA, cell type, and stage, and is remodeled throughout mRNA capping, splicing, cleavage, and polyadenylation (**Figure 1B**; for review see Singh et al., 2015). At the end of transcription, newly formed RNP particles are released into the interchromatin space. The initial steps in the biogenesis of RNP particles, in particular cotranscriptional RNP particle assembly, are therefore exquisitely integrated into the architecture of the cell nucleus. However, how this integration is maintained within the perichromatin region while particles move on the chromatin loop is unclear. Most likely, RNP particles are somehow connected to the chromatin as the mRNA is transcribed, to protect it from being pulled into the interchromosomal space. The mechanisms by which such flexible anchoring could happen are unknown. Although their existence is not fully proven, transcription factories (where polymerases remain anchored and the DNA moves through the factory itself) may play an important role in maintaining nascent RNP particles connected to the chromatin, but in this case the RNP particle would be a relatively static entity (Sutherland and Bickmore, 2009).

### From the Gene to Polysomes, Sorting Transcripts for Localized Translation

In the interchromatin space, mature RNP particles are believed to migrate by passive diffusion toward the nuclear envelope (Singh et al., 1999; Shav-Tal et al., 2004). Once at the nuclear pore complex (NPC), RNP particles are exported, a process that is considerably more rapid than the passive diffusion across the nucleoplasm (Bjork and Wieslander, 2017). As the RNP particle is routed toward the NPC, its composition changes, with certain proteins being dynamically added to or shed from the transcript (Dreyfuss et al., 2002; Oeffinger and Montpetit, 2015). This fundamentally affects the intrinsic properties of the RNP particle. For instance, electron microscopy work in C. tentans demonstrated that RNP particles unfold as they pass through the NPC, exposing the 5′ end for immediate translation on the polysomes (Daneholt, 1997, 2001). In mammals, probably not all RNP particles completely unfold during passage through the NPC. However, RNP particles clearly transition from a highly compact macromolecular assembly to a more loosely organized entity, demonstrating a considerable degree of intrinsic plasticity. Although the mechanisms are not fully understood, remodeling of the mRNA molecule by RNA helicases and changes in the polymerization state of actin have been suggested to be the driving forces (**Figure 1B**; Percipalle et al., 2001; Percipalle, 2014).

Not all RNP particles are immediately translated as they exit the nucleus. A subset of RNP particles is transported to cellular compartments where they are either stored in a translationally inactive form or locally translated. Examples of sites where transcripts are stored are provided by transport granules in neurites (reviewed in Lee et al., 2016) and chromatoid bodies in spermatogenic cells (Kotaja and Sassone-Corsi, 2007). To reach specialized sites for local translation, transcripts are rapidly transported. In polarized cells such as neurons and oligodendrocytes, there are several well-studied examples of transcripts being transported to dendrites and the myelin compartment, respectively (Martin and Ephrussi, 2009). Although the mechanisms are not fully understood, prior to transport to such specialized locations, RNP particles are believed to assemble into large granules that probably contain many copies of the same transcript and are actively transported via the microtubule system (**Figure 1A**; for reviews see Carson and Barbarese, 2005; Kindler et al., 2005). Cytoplasmic RNA transport requires specific cis-acting elements within the mRNA, termed zip codes, that are presented to cellular trans-acting factors such as RBPs. These interactions are likely to stabilize transport-competent RNP particles and possibly the formation of granules that are then transported to their final cytoplasmic destinations, where the mRNA is released and localized for translation. All these mechanisms require several coordinated steps that are not fully understood.

### THE INVOLVEMENT OF CBF-A/hnRNPab IN RNPs ASSEMBLY, TRANSPORT, AND LOCALIZATION

A central question is at what stage and which cis-acting elements are targeted by specific cellular trans-acting factors to regulate the different layers of RNA biogenesis. A good example is provided by the hnRNP protein CBF-A/hnRNPab, which is known to interact with several RNAs through a cis-acting element termed RNA trafficking sequence (RTS) or A2 Response Element (A2RE), in order to regulate cytoplasmic mRNA transport (Raju et al., 2008, 2011; Fukuda et al., 2013). CBF-A/hnRNPab was identified as a single-stranded DNA-binding protein interacting with CArG boxes, CC(A/T-rich)<sub>6</sub>GG, present in α-Smooth Muscle Actin (Kamada and Miwa, 1992) and several other sequences, including the apoVLDII and RSV CArG boxes (Smidt et al., 1995), the Ig κ promoter (Bemark et al., 1998), and Arginine Vasopressin (AVP) (Murgatroyd et al., 2004). We showed that CBF-A/hnRNPab also binds to poly(A) mRNA in vitro and in living cells (Percipalle et al., 2002). From an evolutionary point of view, CBF-A/hnRNPab belongs to the conserved hnRNP subfamily of the "2∗RNA Binding Domain (RBDs) and Glycine-rich auxiliary domain" (2∗RBD-Gly) proteins (Aranburu et al., 2006; **Figure 2**). Like all 2∗RBD-Gly proteins, CBF-A/hnRNPab is composed of a unique nonconserved N-terminal region, a highly conserved central region that contains two RNA-binding domains (RBDs), and a conserved C-terminal Glycine-rich region (Dreyfuss et al., 1993; Smidt et al., 1995; Lau et al., 1997; Rushlow et al., 1999, 2000; Weisman-Shomer et al., 2002; Khateb et al., 2004; Aranburu et al., 2006). The closest homolog to CBF-A/hnRNPab, hnRNPD, is also a member of the 2∗RBD-Gly family, together with hnRNPA0 to A3 and Musashi (Aranburu et al., 2006). CBF-A/hnRNPab and the other members of the 2∗RBD-Gly family undergo remarkably similar alternative splicing, which generates different proteins differing by just a few kilodaltons (Dean et al., 2002; Kroll et al., 2009; Gueroussov et al., 2017). Conserved among mammals, CBF-A/hnRNPab has two isoforms, p37 (284 amino acids) and p42 (331 amino acids) (Khan et al., 1991; Lau et al., 1997; Yabuki et al., 2001). The two isoforms have been shown to have different RNA- and DNA-binding properties; they bind to different proteins and appear to have different roles in the cell (Yabuki et al., 2001; Fomenkov et al., 2003; Fukuda et al., 2013). Both isoforms have been located in the nucleus and in the cytoplasm within RNA granules, and they appear to be functionally different in the context of RNA regulation. For instance, the p42 isoform, but not p37, is involved in alternative splicing via binding of the specific α sterile motif of the p53 family member p63α. This interaction regulates alternative splicing of the Fgfr2 mRNA from a mesenchymal form to an epithelial form. Suppression of the CBF-A/hnRNPab-p63α interaction has been suggested to be the cause of craniofacial disorders such as the Hay-Wells syndrome (Fomenkov et al., 2003). Such protein-protein interactions are also known to lead to the production of the dominant negative mRNA isoforms α and β of the Tert telomerase (Vorovich and Ratovitski, 2008). However, the molecular mechanisms by which the hnRNPab-p63α interaction affects mRNA splicing are not understood. Furthermore, there is evidence that CBF-A/hnRNPab is involved in ApoB editing by recruiting APOBEC1 and possibly disrupts the secondary structure of ApoB mRNA. Whether both isoforms are similarly engaged in the process remains to be elucidated (Lau et al., 1997).
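The CArG box consensus quoted above, CC(A/T-rich)6GG, lends itself to a simple pattern search. The following Python sketch is illustrative only and is not taken from the studies cited here: it assumes a strict reading of the consensus in which all six central positions are A or T (real "A/T-rich" cores may tolerate other bases), and the function name `find_carg_boxes` and the example sequence are hypothetical.

```python
import re

# Strict reading of the CC(A/T-rich)6GG consensus: six central A/T positions.
# This is an assumption; a looser definition would need a different pattern.
# The lookahead lets overlapping sites be reported as well.
CARG_BOX = re.compile(r"(?=(CC[AT]{6}GG))")


def find_carg_boxes(seq):
    """Return (start, motif) pairs for every CArG-like site in a DNA sequence."""
    seq = seq.upper()
    return [(m.start(), m.group(1)) for m in CARG_BOX.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical test sequence containing two CArG-like sites.
    example = "GGACCTTATATGGTTCCAAATTTGGA"
    for start, motif in find_carg_boxes(example):
        print(start, motif)
```

Start positions are 0-based. A hit from such a search only marks a candidate site; whether CBF-A/hnRNPab actually binds it has to be established experimentally.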

Here, we will focus on how the two CBF-A/hnRNPab isoforms have been suggested to be involved in the regulation of RNP particles, from the onset of transcription in the nucleus to mRNA translatability upon transport.

#### Sorting Transport-Competent RNA Occurs During Nuclear Preparatory Events

CBF-A/hnRNPab is among the RBPs that seem to interact with the transcript at an early stage during the RNA biogenesis pathway. In fact, in thin sections of adult mouse brain, antibodies to CBF-A/hnRNPab decorated electron-dense structures located in the interchromosomal space and in the perichromatin area (Raju et al., 2011), where active transcription takes place (Fakan and Puvion, 1980). In contrast, the same antibodies to CBF-A/hnRNPab did not stain patches of dense chromatin. Based on location and morphology, CBF-A/hnRNPab seems to be excluded from the INC, while it is enriched in the ANC, associating with (pre)-mRNP complexes at sites of transcription and in the interchromosomal space. In the same study, CBF-A/hnRNPab was also found to be associated with electron-dense structures, presumably mRNP particles, passing through the nuclear pores and in transit to the cytoplasm (Raju et al., 2011; Fukuda et al., 2013). Therefore, seeing that CBF-A/hnRNPab binds to poly(A) mRNA, it seems conceivable that CBF-A/hnRNPab cotranscriptionally associates with the transcripts and accompanies them to the cytoplasm. We speculate that binding of specific RBPs to nascent transcripts is a way of sorting them for specialized functions, and CBF-A/hnRNPab may perform this task through its specific interaction with the RTS sequence.

Insights into sequence-specific recognition of single-stranded nucleic acids by hnRNP proteins came from the crystal structure of the two RNA recognition motifs (RRMs) of hnRNP A1 in complex with single-stranded guanine-rich telomeric DNA. Guanine-rich DNA and RNA sequences have a tendency to form tetrahelical G4-quadruplex structures in vitro and in vivo, which appear to be stabilized by the cooperative interactions of the two hnRNP A1 molecules (Ding et al., 1999). Although not all RNAs form a G4-quadruplex, this mode of binding may explain how specific hnRNP-RNA interactions are established. For instance, similarly to single-stranded guanine-rich telomeric DNA, the hnRNPab sequence target, the CArG box, contains clusters of adjacent guanine residues. Furthermore, CBF-A/hnRNPab interacts with and can either disrupt the DNA quadruplex structure, as in the case of the d(CGG)n repeats in the Fmrp1 3′ UTR region, or destabilize quadruplexes formed by the sequence [d(TTAGGG)n] at telomeres (Sarig et al., 1997a,b; Weisman-Shomer et al., 2002). These binding properties are conserved among the members of the 2∗RBD-Gly hnRNP protein family. In fact, hnRNPA2, similarly to CBF-A/hnRNPab, also interacts with Fmrp1 DNA quadruplexes and destabilizes the quadruplex structure. In addition, hnRNPA2 and CBF-A/hnRNPab bind r(CGG) quadruplexes. However, while hnRNPA2 efficiently disrupts such a structure, CBF-A/hnRNPab has the opposite effect and stabilizes the RNA G4-quadruplexes (Weisman-Shomer et al., 2000, 2002; Khateb et al., 2004). While the role of RNA quadruplexes is still unclear, more and more proteins involved in their recognition, folding, and unfolding are being isolated. Conserved quadruplex-forming sequences have been shown to be enriched at telomeres, origins of replication, promoter regions, and within RNA transcripts at the 3′ and 5′ UTRs as well as in spliced introns (Rhodes and Lipps, 2015). Not only are DNA and RNA G4 quadruplexes believed to be involved in the regulation of transcription and RNA processing (Rhodes and Lipps, 2015), but more and more studies suggest that RNA G4 quadruplexes could have an essential role in the control of translation (Song et al., 2016).
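To make the notion of a guanine-rich, quadruplex-prone sequence more concrete, the sketch below scans a sequence with a widely used heuristic for putative G4-forming motifs: four runs of at least three guanines separated by loops of one to seven nucleotides. This rule is an assumption brought in for illustration and is not defined in the works cited above. A match indicates a candidate only, not a folded structure, and the rule misses non-canonical cases such as (CGG)n repeats, whose G-runs are only two residues long. The name `find_g4_candidates` and the test sequences are hypothetical.

```python
import re

# Heuristic motif (an assumption, not taken from the cited studies):
# four G-runs of >= 3 guanines separated by three loops of 1-7 nucleotides.
# U is allowed so that RNA sequences can be scanned as well.
G4_MOTIF = re.compile(r"G{3,}(?:[ACGTU]{1,7}G{3,}){3}")


def find_g4_candidates(seq):
    """Return (start, end, motif) for non-overlapping putative G4 motifs."""
    seq = seq.upper()
    return [(m.start(), m.end(), m.group(0)) for m in G4_MOTIF.finditer(seq)]


if __name__ == "__main__":
    telomeric = "TTAGGG" * 4   # four telomeric repeats, as in d(TTAGGG)n
    cgg_repeat = "CGG" * 4     # (CGG)n repeat, not captured by this rule
    print(find_g4_candidates(telomeric))
    print(find_g4_candidates(cgg_repeat))
```

Coordinates are 0-based and half-open. The choice of run length and loop size is a convention; relaxing it (for example to runs of two guanines) would be needed to flag CGG-type repeats, at the cost of many more false positives.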

With this in mind, CBF-A/hnRNPab may cotranscriptionally target cis-acting elements within nascent RNA and stabilize the formation of RNA G4 quadruplexes to sort transcripts that are not translationally active and can therefore be transported to the cellular periphery. Indeed, CBF-A/hnRNPab binds to the RTS located in the 3′ UTR of several transcripts, including the Myelin Basic Protein (MBP), β-actin, Arc, BDNF, CAMKIIα, and Protamine 2 mRNAs (Ainger et al., 1997; Czaplinski et al., 2005; Czaplinski and Mattaj, 2006; Raju et al., 2008, 2011; Kroll et al., 2009; Fukuda et al., 2013; Andreou et al., 2014). RTS binding by CBF-A/hnRNPab is required for transport and localization of all of the above transcripts to the cellular periphery, where they are translated, and depletion studies by siRNA or gene knockout have demonstrated impaired RTS-dependent mRNA transport to specific cellular locations in oligodendrocytes, neurons, and spermatogenic cells (reviewed in Percipalle, 2014). The RTS element is recognized by other members of the 2∗RBD-Gly family, such as hnRNPA2 (Hoek et al., 1998) and hnRNPA3 (Ma et al., 2002). CBF-A/hnRNPab, however, seems to exhibit a higher RTS-binding affinity, at least in vitro (Fukuda et al., 2013). Given that RTSs are guanine-rich sequences, we speculate that RTS binding primarily by CBF-A/hnRNPab may result in a stable RNA secondary structure reminiscent of RNA quadruplexes that may require synergy with other RTS-binding hnRNP proteins. We hypothesize that this stabilization leads to a translationally repressed form of the transcript. CBF-A/hnRNPab, by interacting with the RTS of nascent RNA molecules, may regulate their translatability at a cotranscriptional stage and contribute to sorting transcripts for cytoplasmic transport and localization at an early stage during the gene expression process (see **Figure 3**).

### Cytoplasmic Transport Granules and Their Final Destinations

As mentioned above, upon nuclear export, translationally repressed RNPs are further assembled into larger granules to be transported to specific cellular locations for storage or for translation. Although poorly understood, assembly of RNP particles into transport granules has been proposed to be mediated by homo-dimerization of RNA-bound hnRNPs and actin polymerization from within individual RNP particles (Kanai et al., 2004; Carson and Barbarese, 2005; Percipalle, 2014). The homo-dimerization model is in line with the idea that granules contain only one type of mRNA and a specific set of RBPs (Sinnamon and Czaplinski, 2011). CBF-A/hnRNPab, bound to the RTS element, may be important for granule formation, as there is evidence that it preferentially homo-dimerizes in the cytoplasm and directly interacts with actin within the RNP particle (Percipalle et al., 2002; Aranburu et al., 2006). In addition, CBF-A/hnRNPab, similarly to hnRNP D, is present in several RNA granules, including Stau2 and Btz (Fritzsche et al., 2013), kif5a (Kanai et al., 2004; Elvira et al., 2006), imp (Jonson et al., 2007), IMP1 (Weidensdorfer et al., 2009), and hmm A3G (Chiu et al., 2006) RNA granules, but not in the RNP granules of Stau1 (Brendel et al., 2004) or of Ago1 and Ago2 (Hock et al., 2007).

How RNA transcripts become available to the translation machinery remains a major question for future studies. Insights recently came from evidence of different roles performed by the CBF-A/hnRNPab isoforms in regulating translatability of the Protamine 2 mRNA (**Figure 3**). While both isoforms interact with the same RTS sequence, in vitro p37 shows a higher affinity than p42 and hnRNPA2 for the same RTS target (Fukuda et al., 2013). p42 and hnRNPA2 both interact with the RTS and the 5′ Cap-binding complex. In contrast, although p37 tightly binds to the RTS, it does not interact with the 5′ Cap-binding complex (Fukuda et al., 2013; Tcherkezian et al., 2014). Furthermore, both isoforms are in the RNP (Kumar et al., 1987; Percipalle et al., 2002; Czaplinski et al., 2005; Fukuda et al., 2013), but only p42 interacts with the Protamine 2 mRNA when it is associated with translating polysomes (Fukuda et al., 2013). Altogether, these observations suggest that p37 and p42 binding to the RTS facilitates remodeling or structural disruption of the RNP particle/granule, exposing the transcript to different molecular machinery and leading to translation (Fukuda et al., 2013). A similar "switch" between RBPs has been shown to happen on the Cox2 mRNA in macrophages, where the RBP Tristetraprolin (TTP) is replaced by HuR once at destination (Tiedje et al., 2012). In addition, in oligodendrocytes, hnRNPA2 phosphorylation leads to the replacement of the translation repressor hnRNPE1 with the activator hnRNPK (Muller et al., 2013; Torvund-Jensen et al., 2014). Further studies will possibly address whether switch mechanisms are general or transcript-specific and whether other hnRNPs such as hnRNPD cooperate with the p37/p42 isoforms.

One of the open questions is how differential RTS binding of the two isoforms is achieved. Recently, hnRNPA2 has been shown to be involved in miRNA processing and alternative splicing by recognizing methylation on adenosine residues (Alarcón et al., 2015). Since both CBF-A/hnRNPab and hnRNPA2 bind the same RTS site (Fukuda et al., 2013), it is intriguing to speculate that the methylation state of the RTS sequence may influence the binding affinity of CBF-A/hnRNPab, promoting binding of p37 or p42 together with other hnRNP proteins. Whether and how all of the above mechanisms lead, in a coordinated manner, to optimal RNP particle remodeling at different stages of RNA biogenesis remains, however, to be understood. Some of these questions may become clearer once we understand the full spectrum of protein modifications involved and whether RNA methylation plays a role in regulating differential RNA-binding affinities.

### CBF-A/hnRNPab Regulation of Cell Specification and Development

The mechanisms of RNA trafficking are important to ensure spatial and temporal regulation of gene expression, which is, in turn, required during development and differentiation. Understanding how the CBF-A/hnRNPab isoforms promote efficient mRNA trafficking might, therefore, provide an interesting paradigm to study cell specification and neuronal development. Indeed, high levels of CBF-A/hnRNPab expression can be found in neuronal cells, in the developing neural tissues, and in neurogenic regions of the brain (Rushlow et al., 1999; Gong et al., 2003). In Xenopus laevis, depletion of CBF-A/hnRNPab orthologs led to a decrease in eye size due to a general increase in apoptosis, as well as a decrease in proliferative neural tissues, with cranial neurons not being properly formed, motor neurons missing, and defects in migration. Depleted neurons also show a thinner and disorganized tubulin network (Andreou et al., 2014). Neurospheres produced from CBF-A/hnRNPab<sup>−/−</sup> knock-out mice have reduced expression of the stem cell marker Nestin and increased expression of the differentiation marker dcx. This suggests that CBF-A/hnRNPab is involved in the regulation of stem cell maintenance and neuronal precursor differentiation. Furthermore, CBF-A/hnRNPab<sup>−/−</sup> neurons in vivo have neurite length increased by 40%, while their longest neurite is 32% longer than in the wild-type condition (Sinnamon et al., 2012). Finally, nerve growth stimulation resulted in increased CBF-A/hnRNPab expression (Rushlow et al., 1999, 2000). How CBF-A/hnRNPab is involved in neuronal development is not known. In both neuronal cells and spermatogenic cells, CBF-A/hnRNPab was found to interact with the 5′ Cap-binding complex, facilitating translation (Fukuda et al., 2013; Tcherkezian et al., 2014). An interesting possibility is that the p37–p42 relay mechanism might be important to translationally repress and/or derepress transcripts that are important for neuronal development.

The above mechanisms are likely to also occur in the adult brain, since CBF-A/hnRNPab is expressed in mature neurons, oligodendrocytes, and astrocytes (Rushlow et al., 1999). Lack of CBF-A/hnRNPab in cultured oligodendrocytes results in impaired transport and localization of MBP mRNA at the myelin compartment (Raju et al., 2008). In primary neurons, localization of CBF-A/hnRNPab at postsynaptic compartments is enhanced by treatment with NMDA and AMPA (Raju et al., 2011), suggesting an activity-dependent role for CBF-A/hnRNPab. Furthermore, CBF-A/hnRNPab has been shown to repress in vivo excitotoxicity, a phenomenon that is a direct consequence of overstimulation of glutamatergic neurons and that can lead to cell stress and neuronal cell death (Sinnamon et al., 2012). Interestingly, hypersensitivity to excitotoxicity, as revealed in hnRNPab<sup>−/−</sup> glutamatergic neurons, has been proposed as a mechanism involved in neurodegenerative disorders (Lau and Tymianski, 2010). Consistently, emerging evidence suggests that depletion of members of the 2∗RBD-Gly family in vitro mimics Alzheimer's disease phenotypes at the cellular level (Berson et al., 2012). In light of these observations, a fascinating avenue to be explored in the future would be to find out whether there is a connection between suppression of specific 2∗RBD-Gly family members and the onset of neurodegenerative disorders.

#### CONCLUSION

We suggest that the mechanisms described above, based on the ubiquitously expressed CBF-A/hnRNPab isoforms, contribute overall to the general translatability of mRNA transcripts. This is initially achieved in the cell nucleus; we propose that at this stage nascent transcripts are translationally repressed and consequently sorted for cytoplasmic transport and localization (**Figure 3**). The contribution of chromatin at this stage remains to be understood. Once they reach their final cytoplasmic location, however, transcripts are translationally derepressed. Control of translatability may be achieved by a relay mechanism based on the different RNA-binding affinities of the two CBF-A/hnRNPab isoforms. Different binding affinities may be a consequence of RNA methylation. Although much remains to be uncovered, these sorting mechanisms are likely to be important for cell development and differentiation. The knockout mouse model for CBF-A/hnRNPab already displays brain developmental issues (Sinnamon et al., 2012). Further, in the same knockout mouse model spermatogenesis is impaired (Fukuda et al., 2013). We predict that tissue-specific factors that differentially interact with the CBF-A/hnRNPab isoforms and RNA sites proximal to the trafficking elements play an important role, contributing to the specific function of CBF-A/hnRNPab in cell specification and development. This is a possible general scenario that, in principle, is applicable to how other hnRNP proteins perform specialized tasks in mRNA trafficking.

Future work will need to proceed toward controlled cell differentiation systems in combination with genome-wide analyses to understand how cell development is controlled by CBF-A/hnRNPab. As more molecular mechanisms are being revealed that regulate mRNA biogenesis from the gene to polysomes, systems biology will provide a powerful approach to understand the importance of specific hnRNP proteins in tissue development and differentiation of complex multicellular organisms and how these mechanisms are potentially impaired in human diseases.

#### AUTHOR CONTRIBUTIONS

All authors wrote the paper, read, and approved the final manuscript.

#### FUNDING

This work was partly supported by grants from New York University Abu Dhabi, the Swedish Research Council (Vetenskapsrådet), and the Swedish Cancer Society (Cancerfonden) to PP. We are grateful for the computational platform provided by the NYUAD HPC team at New York University Abu Dhabi.




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Neriec and Percipalle. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Translating the Game: Ribosomes as Active Players

Piera Calamita<sup>1,2</sup>\*, Guido Gatti<sup>1,2</sup>, Annarita Miluzio<sup>1</sup>, Alessandra Scagliola<sup>1,2</sup> and Stefano Biffo<sup>1,2</sup>\*

<sup>1</sup> INGM, National Institute of Molecular Genetics, "Romeo ed Enrica Invernizzi", Milan, Italy, <sup>2</sup> Dipartimento di Bioscienze, Università Degli Studi Di Milano, Milan, Italy

Ribosomes have long been considered as executors of the translational program. The fact that ribosomes can control the translation of specific mRNAs or entire cellular programs is often neglected. Ribosomopathies, inherited diseases with mutations in ribosomal factors, show tissue-specific defects and cancer predisposition. Studies of ribosomopathies have paved the way to the concept that ribosomes may control translation of specific mRNAs. Studies in Drosophila and mice support the existence of heterogeneous ribosomes that differentially translate mRNAs to coordinate cellular programs. Recent studies have now shown that ribosomal activity is a critical regulator not only of growth but also of metabolism. For instance, glycolysis and mitochondrial function have been found to be affected by ribosomal availability. Also, ATP levels drop in models of ribosomopathies. We discuss findings highlighting the relevance of ribosome heterogeneity in physiological and pathological conditions, as well as the possibility that in rate-limiting situations, ribosomes may favor some translational programs. We discuss the effects of ribosome heterogeneity on cellular metabolism, tumorigenesis and aging. We speculate on a scenario in which ribosomes are not only executors of a metabolic program but also act as modulators.

#### Edited by:

Chiara Gamberi, Concordia University, Canada

#### Reviewed by:

Kim De Keersmaecker, KU Leuven, Belgium Gary Loughran, University College Cork, Ireland

#### \*Correspondence:

Piera Calamita calamita@ingm.org Stefano Biffo biffo@ingm.org

#### Specialty section:

This article was submitted to RNA, a section of the journal Frontiers in Genetics

Received: 09 August 2018 Accepted: 22 October 2018 Published: 15 November 2018

#### Citation:

Calamita P, Gatti G, Miluzio A, Scagliola A and Biffo S (2018) Translating the Game: Ribosomes as Active Players. Front. Genet. 9:533. doi: 10.3389/fgene.2018.00533

Keywords: ribosomal proteins, ribosomopathies, ribosome heterogeneity, metabolism, Shwachman-Diamond syndrome, eIF6, RACK1

### INTRODUCTION

Translation is the process by which mRNAs are translated into proteins by ribosomes. Eukaryotic ribosomes are evolutionarily conserved ribozymes constituted by ribosomal proteins (RPs) and rRNAs, whose structure has been spectacularly resolved (Ben-Shem et al., 2011; Klinge et al., 2011; Khatter et al., 2015). Ribosome biogenesis is a massive process occurring in the nucleolus of all cells. Recent progress, combining biochemical techniques with structural and genetic evidence, has shown that ribosome synthesis is catalyzed and coordinated by more than 200 biogenesis factors. Ribosome biogenesis, therefore, proceeds through precise assembly steps that include several quality checkpoints, both in the nucleus and in the cytoplasm (Kressler et al., 2017; Pena et al., 2017). Furthermore, impairment of these checkpoints leads to defects in maturation that are associated with disease (Narla and Ebert, 2010; Ruggero and Shimamura, 2014).

In the cytoplasm, ribosomes are thought to constitute the hardware of the protein synthesis machinery, which fulfills its activity through four main phases: initiation, elongation, termination, and recycling. The initiation step is one of the most important steps of translation regulation, involving initiation factors, mRNAs, tRNAs, and ribosomes. Briefly, 40S subunits directly bind mRNAs in a way that is dependent on initiation factors and on mRNA structure and, after mRNA binding and scanning to an appropriate start codon, 60S subunits are recruited. Several studies elucidated how translation initiation is affected by alterations in mRNA-binding factors (Loreni et al., 2014; Chu et al., 2016; Truitt and Ruggero, 2016) and by different features of mRNA structure, such as untranslated regions (UTRs). Recently, tRNA has also been linked to selective translation and to the reprogramming of metabolism, since codon reprogramming leads to HIF1α synthesis and an increase of glycolytic factors (Rapino et al., 2018).

Evidence that ribosomes exist in different forms in different cell types or during different stages of development (Milne et al., 1975; Bortoluzzi et al., 2001; Volarevic and Thomas, 2001) has suggested the presence of ribosome heterogeneity. It has been recently demonstrated that mutations in some RPs result in selective translation (Shi and Barna, 2015), and that mutations in proteins that impair ribosome maturation and function, as in the case of ribosomopathies, show a specific mRNA translation signature (Brina et al., 2015; In et al., 2016). In conclusion, in recent years there has been growing evidence that translation is driven by ribosome heterogeneity, manifested as ribosome populations differing in ribosomal components. In this review we discuss ribosome heterogeneity in physiological and pathological conditions, highlighting the role of the translation machinery in driving the last step of the central dogma of molecular biology, which makes ribosomes players in the translation of specific mRNAs.

### Ribosome Heterogeneity in Physiological Conditions May Account for Differential Translation

This topic has been recently discussed (Genuth and Barna, 2018a,b) and we will give a simple summary of some perspectives. Ribosomes are constituted by approximately 80 RPs. It has been known for many years that the relative abundance of different RPs may vary in different tissues or in different growth conditions (Milne et al., 1975; Bortoluzzi et al., 2001; Volarevic and Thomas, 2001). This is a sine qua non condition for ribosomal heterogeneity. An obvious alternative explanation for an imbalance of the stoichiometry of RPs within a cell is that RPs perform ribosome-independent functions. An experimental complexity is, therefore, to define whether differential translation is due to the direct action of heterogeneous ribosomes or to regulatory pathways affected by free RPs. This is the case for RACK1, which was originally isolated as a PKC receptor (Ron et al., 1994; Gallo and Manfrini, 2015). RACK1 is a structural protein of 40S subunits (Gerbasi et al., 2004), involved in several extraribosomal functions (Mamidipudi et al., 2004; Robles et al., 2010; Wehner et al., 2011; Gandin et al., 2013; Fei et al., 2017). RACK1 may affect the efficiency of ribosomes directly (Ceci et al., 2003; Shor et al., 2003; Guo J. et al., 2011; Dobrikov et al., 2018a,b) or indirectly through signaling pathways (Gandin et al., 2013; Volta et al., 2013). In conclusion, data demonstrate that in physiological conditions, ribosomal networks may be more complex than expected and perform choices in translational regulation.

Ribosomal heterogeneity exists in physiological conditions. Accurate proteomics studies have identified sub-stoichiometric relationships within translating polysomes (Shi et al., 2017), showing that ribosomes may preferentially translate specific mRNAs. An experimental validation shows that ribosomes devoid of either RPS25 (eS25) or RPL10A (uL1), in vivo, translate specific mRNAs. Mechanistically, this study shows that the 60S subunits may affect mRNA recruitment through the binding of RPL10A (uL1) to IRES (Internal Ribosome Entry Site) sequences in the 5′UTR (Shi et al., 2017). In monocytes, interferon gamma-driven phosphorylation results in RPL13A (uL13) detachment, but here it is still unknown whether ribosomes devoid of RPL13A (uL13) are able to translate selectively (Jia et al., 2012). Furthermore, RPL10 (uL16) R98S mutant leukemia cells are able to survive high oxidative stress levels by increasing IRES-dependent BCL-2 translation (Kampen et al., 2018).

Thus, the concept of a monolithic ribosome (Moore et al., 1968; Yusupova and Yusupov, 2017) may be accompanied by the existence of a more flexible ribosomal platform that performs further tuning on gene expression (Shi and Barna, 2015).

### Ribosome Heterogeneity in Pathological Conditions Affects Translation and Gene Expression

Ribosomopathies are inherited diseases caused by the loss of ribosomal component functionality. Some examples of ribosomopathies include Diamond-Blackfan Anemia syndrome (DBA), Shwachman-Diamond syndrome (SDS), Treacher Collins syndrome, 5q-myelodysplastic syndrome, and Dyskeratosis Congenita (DKC). Notably, all of these syndromes are characterized by variably penetrant phenotypes in which specific tissue deficits are found (Narla and Ebert, 2010). Early on, it was shown that DKC1 mutations reduce pseudouridylation and impair IRES-mediated translation (Yoon et al., 2006).

As a case study, we will focus our discussion on SDS. Signs of SDS include a peculiar exocrine pancreatic insufficiency, along with neutropenia and variable abnormalities in the skeleton and other organs. In addition, SDS is characterized by a reduction in growth, accompanied by an increased incidence of acute myeloid leukemia (AML; Dror, 2008). At the ribosomal level, SDS is characterized by the partial loss of free 60S ribosomal subunits due to, in most cases, mutations in the SBDS gene that is necessary for 60S maturation (Boocock et al., 2003; Wong et al., 2011). In a minority of cases, mutations of EFL1p, which acts in synergy with SBDS, have been found (Stepensky et al., 2017; Tan et al., 2018). Overall, the reduced functionality of 60S ribosomes is a common theme for SDS (Warren, 2018). Altogether, these findings raise three questions: (a) how the loss of functionality of ubiquitous 60S ribosomes can generate tissue-specific defects, (b) how specific translational programs can be affected by the lack of 60S subunits, and (c) how increased tumor incidence can be reconciled with reduced growth.

Addressing this last question helps to put the other two in the right context. We have recently demonstrated in our lab that cells with mutant Sbds have reduced colony formation ability and are transformed less efficiently by oncogenes (Calamita et al., 2017). In this context, we demonstrated that Sbds deficiency directly acts by reducing the maximal oncogenic and translational capability of cells (Calamita et al., 2017). The paradox of reduced growth associated with tumor predisposition may not necessarily be associated with specific translation in tumor cells, but with a general impairment of tissue homeostasis that favors the appearance of mutant clones. For instance, increased tumor formation is observed in immunocompromised individuals (Verhoeven et al., 2018). In support of this interpretation, the relationship between neutropenia and AML has been described by different groups (Freedman et al., 2000; Link et al., 2007; Touw and Beekman, 2013). In conclusion, different cell types can be differentially affected by the reduction of RPs, i.e., thresholds can be different depending on the specific cellular demand of ribosomes for translation.

The question of the mechanism by which defects in 60S ribosomes lead to differential translation is more challenging since, to our knowledge, mRNA selection is driven by 40S subunits, prior to 60S engagement. However, the effects of 60S levels on specific translation are pervasive, and, as described before, IRES mRNA binding can be affected by RPL10 (uL16). In the case of Sbds depletion, characterized by reduced free 60S, two studies have addressed the question of preferential translation by performing either microarray (Nihrane et al., 2009) or RNA-Seq on polysomes (Calamita et al., 2017). In addition, a reporter-based study has addressed the effect of SBDS depletion on reinitiation (In et al., 2016). Together, these studies support a model in which SBDS deficiency reduces free 60S levels, diminishing the maximal translational capability and simultaneously changing translational selectivity. In this context, mRNAs that are intrinsically poorly translated because of uORFs (upstream Open Reading Frames) that require reinitiation are particularly disfavored. Similarly, mouse models have underscored that the reduction of 60S RPs affects the translational program of IRES-containing mRNAs (Barna et al., 2008; Kondrashov et al., 2011; Xue et al., 2015).

Finally, mathematical modeling of translation suggests that a quantitative reduction in the translational output may result in strong alterations of specific mRNA translation due to stochastic events (Heinrich and Rapoport, 1980; Mills and Green, 2017). We conclude that some mRNAs can be particularly sensitive to ribosomal availability, and we speculate that this property has been evolutionarily exploited to connect ribosomes with other cellular events. What we still lack is an understanding of the precise mechanisms.
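
The idea that some mRNAs are more sensitive than others to ribosomal availability can be illustrated with a deliberately simple, deterministic toy calculation. The sketch below is not the model of Heinrich and Rapoport (1980) or of Mills and Green (2017); it assumes, purely for illustration, that the initiation rate of each mRNA saturates with ribosome availability. The function names, the half-saturation constants and the 50% reduction in ribosomes are all hypothetical values chosen for the example.

```python
def relative_output(ribosomes: float, half_saturation: float) -> float:
    """Saturating initiation rate (arbitrary units).

    Assumption: output rises with ribosome availability and levels off;
    `half_saturation` is the ribosome level giving half-maximal output.
    """
    return ribosomes / (half_saturation + ribosomes)


def fold_change(half_saturation: float,
                normal: float = 1.0,
                reduced: float = 0.5) -> float:
    """Output at reduced ribosome availability relative to the normal level."""
    return (relative_output(reduced, half_saturation)
            / relative_output(normal, half_saturation))


# Hypothetical mRNA that initiates efficiently (saturates at low ribosome levels).
strong = fold_change(half_saturation=0.05)
# Hypothetical mRNA that initiates poorly (far from saturation).
weak = fold_change(half_saturation=5.0)

print(f"efficiently initiating mRNA: {strong:.2f} of normal output")
print(f"poorly initiating mRNA:      {weak:.2f} of normal output")
```

Under these assumptions, halving ribosome availability leaves the efficiently initiating mRNA at roughly 95% of its normal output, whereas the poorly initiating mRNA falls to roughly 55%. A uniform reduction in ribosomes can therefore translate into a non-uniform change in the translational program, without any change in ribosome composition.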

### A Common Theme for the Regulatory Function of Ribosomes?

Metabolic pathways are necessary for converting essential nutrients into energy and macromolecules that sustain cell growth and proliferation. Nutrients and metabolic pathways control all facets of cellular functions. Nutrients and growth factors converge on the translational machinery through signaling pathways that, in turn, regulate the synthesis of ribosomes and the activity of translation factors (Roux and Topisirovic, 2018). Translation factors then crosstalk with metabolic choices (Biffo et al., 2018). Some well-established observations are the following. mTORC1 controls mitochondrial activity and biogenesis by selectively promoting translation of nucleus-encoded mitochondria-related mRNAs, via inhibition of the eukaryotic translation initiation factor 4E (eIF4E)-binding proteins (4E-BPs; Morita et al., 2013). ROS generation is also controlled partly at the translational level through eIF4E (Truitt et al., 2015). Glutamine metabolism is controlled by eIF4B-mediated translation downstream of the mTORC1 pathway (Csibi et al., 2014). The eIF3 complex mediates energy metabolism (Shah et al., 2016). Rate-limiting initiation factors that link 60S ribosome biogenesis to translation, such as eIF6, hierarchically control lipid synthesis and metabolism through uORF and G/C-rich 5′UTR sequences (Brina et al., 2015). eIF5A2 accelerates lipogenesis in hepatocellular carcinoma (Cao et al., 2017). In general, translation and metabolism are dysregulated in a coordinated fashion (Leibovitch and Topisirovic, 2018), and initiation factors may act upstream of metabolic reprogramming (Biffo et al., 2018). The next question is whether ribosomes also control metabolic pathways.

In zebrafish, rpl11 mutation decreased the glycolytic rate, and the lower activity of glycolytic enzymes is rescued by p53 inhibition (Danilova et al., 2011). Moreover, defects, mutations or imbalance of RPs stabilized p53 and changed metabolic flux, specifically by decreasing glycolysis and enhancing aerobic respiration (Deisenroth and Zhang, 2011). Although these data do not support a direct crosstalk between ribosome activity and metabolism, they suggest overall that when the translation machinery is perturbed, coordinated pathways involved in cell homeostasis and metabolism are also altered.

… E-MTAB-5089, and analyzed in our previous work (Calamita et al., 2017).

Recently, it has been shown that SDS cells display an impairment in Complex IV activity, which causes an oxidative phosphorylation metabolic defect, with a consequent decrease in ATP production (Ravera et al., 2016). The authors suggest an indirect effect of SBDS mutation on energy production levels, indicating a possible role of calcium homeostasis in altering complex IV activity. In our lab we characterized a cellular model for SDS by immortalizing Mouse Embryonic Fibroblasts (MEFs; Calamita et al., 2017) derived from an SDS mouse model carrying the R126T mutation in homozygosity (Sbds<sup>R126T/R126T</sup> MEFs) (Tourlakis et al., 2012). Briefly, we established a model for studying SBDS function by retransducing Sbds<sup>R126T/R126T</sup> MEFs with either wild-type Sbds (Sbds<sup>RESCUE</sup>) or mock control (Sbds<sup>MOCK</sup>) vectors. In this way, we can separate direct events due to a lack of SBDS from indirect effects. We confirmed a decrease in ATP levels associated with Sbds mutation. In addition, our RNA-Seq analysis revealed that genes belonging to complex IV were less expressed when Sbds was mutated (**Figure 1**). This downregulation could explain an impairment in cytochrome c oxidase activity and a consequent defect in ATP production. Moreover, there is a defect in oxygen consumption rate in SDS cells (Ravera et al., 2016; Calamita et al., 2017), as well as a reduction in the lactate/pyruvate ratio (Calamita et al., 2017). The mechanistic connection between ribosome function and the metabolic effects of its impairment is still to be clarified. Overall, a reduction in ribosomal efficiency seems to be associated with a reduction in energy levels and lipid biosynthesis. We suggest that ribosomal capability has coevolved with other cellular functions and, specifically, that ribosomes are intimately linked to nutrient levels and cellular growth.

The connection between ribosomes and growth is indeed strong and well-known. In Drosophila melanogaster, the haploinsufficiency of RPs results in the minute phenotype, which includes short and thin bristles and smaller flies (Lambertsson, 1998; Marygold et al., 2007). Moreover, as shown by a myriad of papers, depletion of RPs causes a delay/arrest in cell cycle progression. In several cases, the regulation of growth is associated with ribosome-independent functions of RPs (Dai and Lu, 2004; Mamidipudi et al., 2004; Dutt et al., 2011; Yao et al., 2016). In other cases, the inhibition of growth has been directly linked to translational control driven by ribosomes (Barna et al., 2008). Depletion of different RPs may result in different types of inhibition of cell cycle progression, in line with the concept of heterogeneity in ribosomes (Badhai et al., 2009). Conversely, nucleolar enlargement broadly reflects an increased production of ribosomes and is observed in many cancers (Montanaro et al., 2008). In many models, some heterozygous deletions of RPs reduce tumor growth (Barna et al., 2008; Chen et al., 2014; Wilson-Edell et al., 2014), while others are associated with cancer development, as demonstrated for the first time in 2004 in zebrafish mutants for RPs (Amsterdam et al., 2004). In recent years, several somatic mutations have been linked to tumor progression; they affect RPs of both the 60S subunit, such as RPL5 (uL18) and RPL10 (uL16) (De Keersmaecker et al., 2013), RPL11 (uL5) (Tzoneva et al., 2013; Fancello et al., 2017), RPL22 (eL22) (Rao et al., 2012) and RPL23 (uL23) (Fancello et al., 2017), and the 40S subunit, such as RPS15 (uS19) (Landau et al., 2015; Ljungstrom et al., 2016), RPS27 (eS27) (Dutton-Regester et al., 2014) and RPSA (uS2) (Fancello et al., 2017). RP overexpression has also been identified in cancer progression (Artero-Castro et al., 2011; Guo X. et al., 2011; Yang et al., 2016). Several recent reviews provide a comprehensive discussion on how, in some cases, loss of RPs contributes to cancer (Sulima et al., 2017; Genuth and Barna, 2018a; Pelletier et al., 2018).

The ribosomal apparatus also appears to affect longevity. Alterations in ribosomal protein expression result in an extension of eukaryotic lifespan (Hansen et al., 2007; Steffen et al., 2008).

In short, the persistent link between ribosomal function in growth and metabolism makes us speculate that there may be a yet-to-be-unveiled mechanistic connection. We favor a model in which mRNAs important for cell cycle progression or for key metabolic pathways contain UTRs that have coevolved with the translational machinery in order to be preferentially translated in conditions of optimal ribosomal capability. In this context, ribosomal heterogeneity may further tune the cell's translational capabilities.

#### Mitochondrial Ribosomes

Several mitochondrial ribosomal proteins (mt-RPs) are also involved in different cellular processes, such as cell cycle, apoptosis and the regulation of mitochondrial homeostasis. Mutations in mt-RP genes are associated with mitochondrial dysfunctions and disorders (Saada et al., 2007; Smits et al., 2011; Serre et al., 2013; Menezes et al., 2015; Richman et al., 2015). For instance, mutant MRPS16 (bS16m) causes mitochondrial respiratory chain disorders (Miller et al., 2004) and loss of MRPL10 (uL10m) diminished mitochondrial respiration and intracellular ATP levels (Li et al., 2016). In addition, a recent study reports the regulation of cytoplasmic protein homeostasis by mitochondrial translation (Suhm et al., 2018). These studies suggest that crosstalk between the cytoplasmic and the mitochondrial ribosomal machinery may exist.

### CONCLUSION

Ribosomes have long been considered as monolithic structures ensuring mRNA translation in a passive way. It is now well established that ribosomes can affect not only mRNA selection but also other fundamental processes such as cell growth and, as shown more recently, cell homeostasis and metabolism (**Figure 2**). An increasing number of studies provide evidence that the inter-correlation between ribosomes and metabolic pathways leads to a common cellular phenotype. Since ribosomes are a rate-limiting component of the translational program, further studies are needed to elucidate the specific molecular mechanisms by which ribosome heterogeneity, supported by the translational apparatus, sustains cell growth and metabolic homeostasis.

## AUTHOR CONTRIBUTIONS

SB and PC reviewed and edited the manuscript. SB, PC, GG, and AS reviewed the literature. SB, PC, and AM wrote the manuscript. GG conceived and prepared figures, and edited the manuscript. All authors contributed, read, and approved the manuscript.

## FUNDING

This work was supported by Grant ERC TRANSLATE 338999 and IG 2014 AIRC to SB. PC was supported by Fondazione Umberto Veronesi.

## ACKNOWLEDGMENTS

We apologize to the authors of the excellent works that could not be cited due to space constraints.




### REFERENCES

Shi, Z., Fujii, K., Kovary, K. M., Genuth, N. R., Rost, H. L., Teruel, M. N., et al. (2017). Heterogeneous ribosomes preferentially translate distinct subpools of mRNAs genome-wide. Mol. Cell 67, 71–83.e7. doi: 10.1016/j.molcel.2017.05.021

Shor, B., Calaycay, J., Rushbrook, J., and McLeod, M. (2003). Cpc2/RACK1 is a ribosome-associated protein that promotes efficient translation in Schizosaccharomyces pombe. J. Biol. Chem. 278, 49119–49128. doi: 10.1074/jbc.M303968200

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Calamita, Gatti, Miluzio, Scagliola and Biffo. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The Multiples Fates of the Flavivirus RNA Genome During Pathogenesis

#### Clément Mazeaud†, Wesley Freppel† and Laurent Chatel-Chaix\*

Institut National de la Recherche Scientifique, Centre INRS-Institut Armand-Frappier, Laval, QC, Canada

The Flavivirus genus comprises many viruses (including dengue, Zika, West Nile and yellow fever viruses) which constitute important public health concerns worldwide. For several of these pathogens, neither antivirals nor vaccines are currently available. In addition to this unmet medical need, flaviviruses are of particular interest since they constitute an excellent model for the study of spatiotemporal regulation of RNA metabolism. Indeed, with no DNA intermediate or nuclear step, the flaviviral life cycle entirely relies on the cytoplasmic fate of a single RNA species, namely the genomic viral RNA (vRNA) which contains all the genetic information necessary for optimal viral replication. From a single open reading frame, the vRNA encodes a polyprotein which is processed to generate the mature viral proteins. In addition to coding for the viral polyprotein, the vRNA serves as a template for RNA synthesis and is also selectively packaged into newly assembled viral particles. Notably, vRNA translation, replication and encapsidation must be tightly coordinated in time and space via a fine-tuned equilibrium as these processes cannot occur simultaneously and hence, are mutually exclusive. As such, these dynamic processes involve several vRNA secondary and tertiary structures as well as RNA modifications. Finally, the vRNA can be detected as a foreign molecule by cytosolic sensors which, upon activation, trigger antiviral signaling pathways and the production of antiviral factors such as interferons and interferon-stimulated genes. However, to create an environment favorable to infection, flaviviruses have evolved mechanisms to dampen these antiviral processes, notably through the production of a specific vRNA degradation product termed subgenomic flavivirus RNA (sfRNA). In this review, we discuss the current understanding of the fates of flavivirus vRNA and how this is regulated at the molecular level to achieve optimal replication within infected cells.

Keywords: flavivirus, dengue virus, Zika virus, West Nile virus, viral RNA replication, translation, RNA encapsidation, innate immunity

#### Edited by:

Chiara Gamberi, Concordia University, Canada

#### Reviewed by:

Susann Friedrich, Martin Luther University of Halle-Wittenberg, Germany Fatah Kashanchi, George Mason University, United States

#### \*Correspondence:

Laurent Chatel-Chaix Laurent.Chatel-Chaix@iaf.inrs.ca

†These authors have contributed equally to this work

#### Specialty section:

This article was submitted to RNA, a section of the journal Frontiers in Genetics

Received: 10 August 2018 Accepted: 15 November 2018 Published: 04 December 2018

#### Citation:

Mazeaud C, Freppel W and Chatel-Chaix L (2018) The Multiples Fates of the Flavivirus RNA Genome During Pathogenesis. Front. Genet. 9:595. doi: 10.3389/fgene.2018.00595

## INTRODUCTION

Infections with flaviviruses constitute a major public health concern worldwide since they cause several human diseases with a wide range of symptoms that can potentially lead to lifelong impairment or even death. The genus Flavivirus within the Flaviviridae virus family comprises almost 70 reported species including the most studied yellow fever virus (YFV), dengue virus (DENV), Zika virus (ZIKV), West Nile virus (WNV), Japanese encephalitis virus (JEV), and tick-borne encephalitis virus (TBEV). The vast majority of flaviviral infections in humans occur through bites by arthropods such as Aedes-type mosquitoes (mostly Aedes aegypti and Aedes albopictus) in the case of YFV, DENV, and ZIKV or Culex pipiens mosquitoes in the case of WNV. Vaccines do exist for YFV, DENV and TBEV. However, in the case of DENV, the cause of the most prevalent arthropod-borne viral disease, the only available vaccine shows limited efficacy against all DENV serotypes and safety concerns have recently arisen in the Philippines in vaccinated children (Dyer, 2017). Importantly, no antivirals against flaviviruses are currently available, partly because of our limited understanding of their life cycle and pathogenesis when compared to other virus groups. Interestingly, it appears that the general features of the life cycle are conserved across flaviviruses. Hence, there have been tremendous efforts by both industry and academia to identify or engineer antiviral drugs with a pan-flaviviral spectrum. This illustrates the importance of deciphering the molecular mechanisms underlying the flavivirus life cycle in order to identify novel antiviral targets.

The flavivirus life cycle is completely dependent on the cytoplasmic fate of only one RNA species, namely the genomic viral RNA (vRNA) whose replication entirely occurs in the cytoplasm and does not generate any DNA intermediates. Most notably, vRNA contains all the genetic information necessary for optimal virus replication. Hence, targeting vRNA or viral processes involved in its metabolism constitutes an attractive strategy for the development of novel antivirals. Moreover, fundamental virology often provides crucial insight into cellular machinery and processes at the molecular level. In this respect, flavivirus vRNA constitutes an exciting and excellent model for investigating the spatiotemporal regulation of RNA metabolism. With that in mind, we focus this review on our current understanding of the multiple fates of vRNA and how it orchestrates the viral life cycle and creates a cellular environment favorable to infection.

Flaviviruses are enveloped positive-strand RNA viruses that presumably contain a single copy of the genome RNA. Following receptor-mediated endocytosis of the virion and fusion with the endosomal membrane (reviewed in Perera-Lecoin et al., 2013), the vRNA is uncoated and released into the cytosol. The flaviviral vRNA genome contains all the genetic information required for efficient viral replication by hijacking the intracellular resources. With a single open reading frame, vRNA encodes an endoplasmic reticulum (ER)-associated transmembrane polyprotein (**Figure 1A**) (Garcia-Blanco et al., 2016; Neufeldt et al., 2018).

Upon translation, the polyprotein is processed by both cellular and viral proteases to generate 10 mature viral proteins. The structural proteins Capsid (C), Envelope (E) and prM assemble new viral particles, while the non-structural (NS) proteins NS1, NS2A, NS2B, NS3, NS4A, NS4B, and NS5 are responsible for vRNA replication (**Figure 1A**). vRNA synthesis relies on NS5, the RNA-dependent RNA polymerase, as well as on critical vRNA secondary and tertiary structures. NS5 is also responsible for the capping of the neosynthesized vRNA. NS3 is a protease which, together with its co-factor NS2B, participates in the processing of the viral polyprotein. It also possesses helicase, NTPase and triphosphatase activities, all required for efficient vRNA synthesis and capping. vRNA is then encapsidated into assembling viral particles which bud into the ER. Assembled viruses egress through the secretory pathway where they undergo furin-mediated maturation in the Golgi apparatus, allowing fully infectious virions to be released via exocytosis (Apte-Sengupta et al., 2014; Neufeldt et al., 2018).

In order to efficiently complete the flaviviral life cycle, vRNA translation, synthesis and encapsidation must be tightly coordinated in both time and space since these processes are mutually exclusive. However, the molecular mechanisms underlying the orchestration of these events remain mostly enigmatic. To achieve such a tight spatiotemporal regulation, flaviviruses, like the vast majority (if not all) of positive-strand RNA viruses, induce massive rearrangements of ER membranes to create a replication-favorable microenvironment that is generically called "replication factories (RF)" (Chatel-Chaix and Bartenschlager, 2014; Paul and Bartenschlager, 2015). These organelle-like ultrastructures host vRNA synthesis among other functions (discussed in more detail below). They are believed to spatially segregate the different steps of the viral life cycle, although this model is primarily based on descriptive ultrastructural studies using electron microscopy (Welsch et al., 2009; Gillespie et al., 2010; Miorin et al., 2013; Junjhon et al., 2014; Bily et al., 2015; Cortese et al., 2017). However, there remains a knowledge gap regarding the fate of the vRNA in the cytoplasm of the infected cell. The majority of imaging studies have relied on antibody-based detection of the viral double-stranded (ds) RNA, the positive-strand/negative-strand hybrid replication intermediate, and hence do not take into consideration vRNA populations engaged in translation or encapsidated into virions. In addition, no detailed fluorescence in situ hybridization (FISH)-based sub-cellular distribution analyses of flaviviral vRNA and negative-strand intermediate RNA have been reported to date, in contrast with those of hepatitis C virus, a non-flavivirus member of the Flaviviridae family (Shulla and Randall, 2015).

The 10–11 kb-long flavivirus vRNA genome is composed of one open reading frame (ORF) flanked by highly structured 5′ and 3′ untranslated regions (UTR) (Selisko et al., 2014; Ng et al., 2017). The viral 5′UTR and 3′UTR have been demonstrated to engage in interactions with both host and viral proteins. The 3′UTR can be sub-divided into three sub-domains: (1) a highly variable region located immediately after the stop codon, which is implicated in viral adaptation to the host (Villordo et al., 2015); (2) a moderately conserved region; and (3) a highly conserved region (**Figure 1A**). Most importantly, the vRNA shows a high structural plasticity, as it must undergo conformational changes implicated in the different steps of the viral life cycle. For instance, for efficient genome replication, the genome adopts a "pan-handle"-like circularized structure, which is achieved through long-range RNA-RNA interactions between the 5′ and 3′ termini. Several circularization motifs (discussed in more detail below) have been identified and are depicted in **Figure 1**. Interestingly, several sequences and structures in the 5′ and 3′UTRs can harbor multiple functions during distinct steps of the life cycle.

### VIRAL TRANSLATION

Translation of the vRNA occurs at the surface of the ER and results in the synthesis of a highly membrane-associated polyprotein product. This polyprotein is further processed by viral and cellular proteases co- and post-translationally through an ordered process that presumably dictates the ER topology of the mature viral proteins. Following the entry of the virion into the target cell, the first round of translation must take place in order to produce all viral proteins (including the viral RNA polymerase), which are absent from the infectious virus particle. Hence, vRNA translation is a critical step for the initiation of vRNA synthesis and subsequent amplification. The flaviviral genome, like cellular messenger RNAs (mRNA), contains a cap structure at the 5′ end which enables translation through canonical cap-dependent translation initiation (Garcia-Blanco et al., 2016). The addition of the cap is mediated by the methyltransferase activity of NS5 in combination with the nucleotide triphosphatase activity of NS3. NS3 removes a phosphate from the 5′ terminus of the vRNA and NS5 catalyzes the addition of guanosine monophosphate (GMP) as well as the methylation of both this guanine on N-7 and the ribose 2′-OH of the first adenosine to form a type 1 5′ cap structure (m7GpppAm2) (Egloff et al., 2002; Ray et al., 2006; Klema et al., 2016; Wang et al., 2018). Remarkably, in contrast to cellular mRNAs, vRNA lacks a 3′ poly-A tail. The poly-A tail is typically important for stability and for stimulating translation initiation of cellular mRNAs due to its strong association with poly-A-binding protein (PABP), which interacts with the cap-binding complex eIF4F and mediates the circularization of the mRNA. Despite the lack of a poly-A tail, the 3′ end of DENV RNA can associate with PABP in vitro (Polacek et al., 2009). This interaction appears to be specific for A-rich sequences flanking the DB structures upstream of the terminal 3′ SL motif of the 3′UTR (**Figure 1A**). Moreover, in in vitro translation assays using cell extracts, treatment with Paip2, an inhibitor of PABP, repressed translation of a reporter mRNA containing the DENV2 5′UTR, the first 72 nt of the capsid coding sequence and the 3′UTR in a dose-dependent manner. This suggests that the PABP/3′UTR interaction mimics the role of the mRNA poly-A tail and presumably stimulates translation initiation. In addition, the 3′ SL structure also modulates DENV translation. However, since this motif is functional in vitro within mRNAs containing either an internal ribosome entry site (IRES) or a non-functional cap, it was proposed that the 3′ SL independently influences translation after cap binding by the small 40S ribosomal subunit (Holden and Harris, 2004).

Another stem loop structure in the 5′ region of the vRNA, named "capsid-coding region hairpin" (cHP), is also implicated in flaviviral translation (Clyde and Harris, 2006). Mutations in the cHP sequence which abrogate its secondary structure decreased initiation from the first AUG codon, resulting in the production of shorter capsid protein products expressed from reporter RNAs in human hepatoma Hep3B and mosquito C6/36 cells. This highlights that cHP is important for initiating translation from the correct start codon and for the generation of a functional capsid protein. Moreover, viral translation also relies on two pseudoknot motifs within the 3′UTR called 5′ψ and 3′ψ. They involve two identical dumbbell structures termed 5′-DB and 3′-DB (or DB1 and DB2, respectively) which are both flanked by A-rich regions. Their formation is promoted by the presence of the "conserved sequences" RCS2 and CS2 in 5′-DB and 3′-DB, as well as by their respective terminal loops, which both contain five-nucleotide sequences named TL1 and TL2 (**Figure 1A**). TL1 and TL2 are complementary to pentanucleotide sequences PK1 and PK2 downstream of each DB. TL1/PK2 and TL2/PK1 tertiary interactions constitute the 5′ψ and 3′ψ pseudoknots, respectively (Olsthoorn and Bol, 2001; Manzano et al., 2011). Similar structures have also been reported for other flaviviruses such as JEV and YFV (Olsthoorn and Bol, 2001). Manzano and colleagues have also reported that TL1 and TL2 are important for flavivirus translation in BHK-21 cells, but their respective contributions to translation appear unequal (Manzano et al., 2011). Indeed, the deletion of TL2 impaired translation only modestly while disruption of TL1 had no effect. However, the deletion of both sequences resulted in a more severe phenotype, strongly suggesting that TL1 and TL2 act synergistically to enhance translation from the DENV vRNA. A similar phenotype was observed when TL1 and TL2 were swapped. Importantly, mutations abrogating TL/PK complementarity impeded translation, which was restored to wild-type levels by mutations that re-established base pairing, highlighting the importance of these tertiary interactions. However, in contrast to TL1 and TL2, PK1 and PK2 are not absolutely necessary for translation, suggesting that alternative TL receptors within the vRNA might exist. For instance, when the PK1 sequence is mutated, TL2 might interact with the top loop of the 3′ SL. Taken together, these observations highlight that this core RNA region is crucial for the regulation of efficient viral translation.
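
The compensatory-mutation logic described above (disrupt TL/PK complementarity, then restore it with a second mutation) can be illustrated with a minimal Python sketch. The pentanucleotide sequences below are hypothetical placeholders chosen only for illustration, not the actual DENV TL or PK sequences, and the function names are our own. Only Watson-Crick pairs are counted, which is a simplifying assumption.

```python
# Watson-Crick partners for RNA bases (G-U wobble pairs are ignored by assumption).
WC_PARTNER = {"A": "U", "U": "A", "G": "C", "C": "G"}


def paired_positions(loop: str, receptor: str) -> int:
    """Count Watson-Crick pairs when `loop` anneals antiparallel to `receptor`.

    Both sequences are written 5'->3'.
    """
    if len(loop) != len(receptor):
        raise ValueError("sequences must have the same length")
    # Antiparallel pairing: base i of the loop faces base (n - 1 - i) of the receptor.
    return sum(WC_PARTNER[a] == b for a, b in zip(loop, reversed(receptor)))


def fully_complementary(loop: str, receptor: str) -> bool:
    """True when every position of the two sequences forms a Watson-Crick pair."""
    return paired_positions(loop, receptor) == len(loop)


# Hypothetical 5-nt sequences (NOT the real DENV TL/PK sequences).
tl_wild_type = "GCUGU"
pk_wild_type = "ACAGC"     # reverse complement of tl_wild_type
tl_mutant = "GCAGU"        # single substitution in the terminal loop
pk_compensatory = "ACUGC"  # matching substitution in the receptor sequence

print(fully_complementary(tl_wild_type, pk_wild_type))  # wild-type pair
print(fully_complementary(tl_mutant, pk_wild_type))     # pairing disrupted
print(fully_complementary(tl_mutant, pk_compensatory))  # pairing restored
```

In this toy model, a single substitution in the loop removes one of the five base pairs, and the matching substitution in the receptor restores full complementarity, which mirrors the design of the rescue experiments without saying anything about the measured translation levels.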

In addition to canonical initiation of translation, cap-independent mechanisms of translation have also been described for DENV. Indeed, DENV can achieve vRNA translation and wild-type production of infectious viral particles when cap-dependent translation is inhibited by treating the cells with drugs that impair the phosphoinositol-3 kinase (PI3K) pathway. Moreover, expression knockdown of eIF4E (a component of the eIF4F cap-binding complex) in hamster BHK-21 or monkey Vero cells led to a 60% decrease in total cellular protein synthesis, whereas DENV NS5 protein levels decreased by just 10% (Edgil et al., 2006). These data suggest that DENV translation initiation can also occur in a cap-independent manner. DENV cap-independent translation appears to be regulated by both the 5′ and 3′UTRs. Nonetheless, no IRES has been identified for flaviviruses, in contrast to viruses from other genera within the Flaviviridae such as hepatitis C virus (HCV) (Perard et al., 2013).

Using polysome profiling, Roth and coworkers have demonstrated that all tested flaviviruses (namely all DENV serotypes, pathogenic WNV, historical and contemporary ZIKV strains) induce a general shut-off of host cell translation early following infection in human hepatocarcinoma Huh7 cells (Roth et al., 2017). This DENV-induced translation inhibition occurs at the initiation step. Interestingly, another group has recently shown that DENV infection in Huh7 cells negatively regulates the translation of host mRNAs that are associated with the ER without impacting the synthesis of cytosolic proteins (Reid et al., 2018). In contrast, translation of vRNA appears to be unaffected by this global inhibition, strongly supporting the idea that flaviviruses specifically divert host protein synthesis for the benefit of viral translation and/or other steps of the life cycle. Importantly, cellular stresses such as infection or oxidative stress induce perturbations in cell translation (Anderson and Kedersha, 2008). More specifically, such stresses can induce translational arrest associated with polysome disassembly and a concomitant appearance of stress granules (SG). SGs are cytoplasmic granules composed of untranslated mRNAs and the translation initiation machinery comprising proteins of the 48S preinitiation complex including eIF3, eIF4A, eIF4G, PABP1 and small ribosomal subunits (Anderson and Kedersha, 2008). In most cases, the formation of SGs requires the phosphorylation of eIF2α by protein kinase R (PKR) or PKR-like endoplasmic reticulum kinase (PERK). In its phosphorylated form, eIF2α inhibits global protein translation by reducing levels of the eIF2α-GTP-tRNAMet ternary complex, which is absolutely required for translation initiation. Surprisingly, it appears that DENV-mediated repression of translation initiation is not functionally linked to PKR, infection-associated eIF2α phosphorylation or SG induction. These observations are consistent with several studies that have reported that DENV, ZIKV and WNV infection inhibits the formation of SGs, especially when cells are under oxidative stress following treatment with the SG inducer sodium arsenite (Emara and Brinton, 2007; Amorim et al., 2017; Roth et al., 2017). In such conditions, reductions in the number of formed SGs and a decrease in phospho-eIF2α levels are observed in infected cells. This inhibition seems to be specific to eIF2α-specific SGs since ZIKV infection did not impact the formation of SGs upon pateamine A or sodium selenite treatments, which do not require prior eIF2α phosphorylation and are devoid of the SG marker TIA-1-related protein (TIAR) (Amorim et al., 2017). Interestingly, the eIF2α-specific SG components T cell internal antigen-1 (TIA-1) and TIAR, which are known to induce translational silencing (Anderson and Kedersha, 2008), are diverted by flaviviruses to regulate replication (Li et al., 2002; Emara and Brinton, 2007). Indeed, in infected BHK-21 cells, these factors colocalize with viral proteins and dsRNA within the replication complex. Such relocalization has also been observed in TBEV-infected cells and it was proposed that these host factors inhibit the translation of the TBEV vRNA (Albornoz et al., 2014). Overall, these studies support the idea that flaviviruses manipulate host cell gene expression at the translational level to favor viral protein production and generate a cellular state which is favorable to replication.

### vRNA REPLICATION


#### Overview of the vRNA Synthesis Process

vRNA replication is the core step leading to virus amplification and consists of de novo RNA synthesis (i.e., without initiation from a preexisting primer). Within RFs (see below), it generates a pool of neosynthesized vRNA molecules that are subsequently used for the formation of new replication complexes, for translation-driven production of viral proteins and for packaging into assembling virus particles. vRNA replication relies on the RNA-dependent RNA polymerase (RdRp) activity of the flaviviral NS5 protein and is an asymmetric process. vRNA synthesis is initiated by the binding of NS5 to a secondary structure located at the 5′ terminus of the genome called "stem loop A" (SLA) (Filomatori et al., 2006). NS5 first synthesizes one molecule of negative-strand intermediate RNA using the positive-strand vRNA as a template. Subsequently, new copies of vRNA are made from this negative-strand RNA, with a higher proportion of positive-strand vRNAs produced (You and Padmanabhan, 1999; Guyatt et al., 2001). NS5 requires both the 5′ and 3′UTRs to initiate negative-strand RNA synthesis (Filomatori et al., 2006; Hodge et al., 2016). For the synthesis of this antigenome, the vRNA must adopt a circularized panhandle-shaped structure formed through long-range interactions between the 5′ and 3′UTRs. This conformation brings the 5′ and 3′UTRs into close proximity and allows the transfer of SLA-bound NS5 from the 5′UTR to the 3′ stem loop (3′ SL) located at the terminus of the 3′UTR. More specifically, NS5 interacts with the top loop of the 3′ SL, which is highly conserved across the Flavivirus genus (Hodge et al., 2016). This configuration of NS5 enables the initiation of negative-strand synthesis using a pppAG dinucleotide as a primer. Following antigenome synthesis, NS5 can polymerize many copies of the positive-strand vRNA from the negative-strand intermediate. NS5 is optimized to specifically use the pppAG dinucleotide as a primer. As a result, the 5′ AG and 3′ CU termini of vRNA (the former corresponding to a 3′ CU in the antigenome) are strictly conserved among flaviviruses (Selisko et al., 2012). The reasons why flavivirus RNA synthesis is asymmetric in favor of the positive-strand RNA (i.e., vRNA) are still unclear. However, it has been proposed that in the double-stranded RNA state (vRNA/antigenome hybrid), vRNA SLA-bound NS5 molecules would be directly transferred from neosynthesized vRNA to the 3′ end of the negative strand (instead of the vRNA 3′ SL) to directly reinitiate positive-strand synthesis (Garcia-Blanco et al., 2016). This is consistent with a JEV study that indicated a greater affinity of NS5 for the 3′ end of the negative-strand RNA than for the vRNA 3′UTR (Kim et al., 2007). This model of asymmetric viral RNA replication supports the idea that the negative strand would not be free in the cell but rather annealed with template and/or neosynthesized vRNA molecules.
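
The link between the conserved termini and the pppAG primer follows from strand complementarity: because the antigenome is the reverse complement of the vRNA, a 5′ AG on the genome implies a 3′ CU on the antigenome, so the same dinucleotide primer can pair with the 3′ end of either template. A minimal Python sketch of this relationship is given below. The toy sequence and the function names are illustrative assumptions, not an actual flavivirus genome or an established tool.

```python
# Translation table mapping each RNA base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return rna.translate(COMPLEMENT)[::-1]


def template_pairs_with_pppAG(template: str) -> bool:
    """A 5'-AG-3' primer can anneal to the 3' end of a template only if it ends in CU."""
    return template.endswith("CU")


# Toy positive-strand sequence (hypothetical): 5' AG ... CU 3'.
vrna = "AGUACGGAUCCGUAACGUUCU"
antigenome = reverse_complement(vrna)

print(vrna[:2], vrna[-2:])                    # termini of the positive strand
print(antigenome[-2:])                        # 3' end of the negative strand
print(template_pairs_with_pppAG(vrna))        # template for antigenome synthesis
print(template_pairs_with_pppAG(antigenome))  # template for vRNA synthesis
```

The sketch only captures the sequence constraint. It does not model why NS5 prefers the 3′ end of the negative strand, which remains an open question as noted above.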

Other RNA secondary structures in the 3′UTR have also been reported to influence replication. For instance, in addition to their role in translation (as discussed above), the 5′ψ and 3′ψ tertiary structures in the DENV vRNA regulate RNA synthesis (Olsthoorn and Bol, 2001; Manzano et al., 2011). Indeed, mutations in PK or TL sequences disrupting the pseudoknots result in a decrease in viral replication. As for translation, the contributions of TL1 and TL2 to viral RNA replication are not equivalent. However, in contrast to what has been observed for translation, restoration of base pairing between the TL and PK sequences does not rescue the replication defects caused by individual mutations. This indicates that the roles of 5′ψ and 3′ψ differ between translation and replication, in line with the idea that changes in the conformation of the vRNA modulate the different steps of the viral life cycle.

It is believed that the NS3 helicase assists vRNA synthesis, presumably through direct interactions with NS5 (Johansson et al., 2001; Takahashi et al., 2012). Although this helicase activity is absolutely required for the flavivirus life cycle, it remains unclear which exact step of vRNA synthesis it regulates in infected cells. Nevertheless, based on in vitro analyses, several models have been proposed. NS3 helicase activity is most likely involved in vRNA synthesis from the negative strand. According to the model described above in which vRNA synthesis is initiated from dsRNA, NS3 would be required to unwind this molecule and displace the original vRNA molecule in favor of the nascent genome. Moreover, it cannot be excluded that NS3 also contributes to the synthesis of the antigenome by unwinding secondary and tertiary structures in vRNA. Finally, NS3 might unwind the RNA duplex so that neosynthesized positive strands can be translated or packaged into assembling viral particles.

#### Genome Circularization

As discussed above, the cyclization of the vRNA is critical for the initiation of genome replication through the recruitment of NS5 to SLA in the 5′UTR. vRNA shows a high structural plasticity, with ample evidence to suggest that several sub-domains act as riboswitches to regulate the different steps of the viral life cycle, including initiation of RNA synthesis (**Figure 1B**). For instance, DENV vRNA can adopt different conformations during infection, and switching from circular to linear conformations modulates negative- and positive-strand RNA synthesis (Villordo et al., 2010). This structural plasticity relies on the highly structured 5′ and 3′UTRs, with the presence of a variety of stem loop structures termed "cyclization sequences" (or elements). Through sequence complementarity, they contribute to long-range RNA-RNA interactions and hence promote the circularization of the flaviviral genome. Notably, evidence of individual molecules of circularized DENV vRNA has been provided in vitro using atomic force microscopy (Alvarez et al., 2005). Among the main cyclization elements involved in this process are the "conserved sequences" (CS). They consist of 8 or more nucleotides located in the 5′ region of the capsid-coding sequence and in the 3′UTR. CSs were first identified in WNV and have been demonstrated to be essential for flaviviral replication in BHK-21 cells (Khromykh et al., 2001a). However, Alvarez and colleagues later demonstrated that the base pairing between the 5′ and 3′ CS alone was necessary, but not sufficient, for vRNA circularization in vitro. They demonstrated that other cyclization motifs contribute to this long-range RNA-RNA interaction (Alvarez et al., 2005). Indeed, the "upstream AUG region" (5′ UAR) located before the polyprotein start codon in "stem loop B" (SLB) interacts with the 3′ UAR, which overlaps with the "small hairpin" (sHP) in the highly conserved 3′ SL at the terminus of the 3′UTR (**Figure 1**). The "downstream AUG region" (DAR) long-range interaction is also an important determinant of genome circularization. Of note, DAR elements are poorly conserved among flaviviruses and, while DENV shows only one DAR interaction, WNV and YFV seem to rather possess a bipartite element (named DAR1 and DAR2) (Friebe et al., 2011; Brinton and Basu, 2015). In the case of DENV, a region of 6 nucleotides identified in the 5′ region of the genome (5′ DAR) is involved in DENV replication and possibly also RNA cyclization. The 3′ DAR sequence mapping to the 5′ stem of sHP is complementary to the 5′ DAR (Friebe and Harris, 2010; Villordo et al., 2010). Consistently, sHP has been demonstrated to be implicated in long-range RNA–RNA interactions, with a major contribution from the UAR-containing stem. Moreover, DENV harboring mutations in the stem of this structure replicates less efficiently than wild-type virus. However, while mutations in the 3′ DAR sequence impact viral replication, its role in genome circularization through interactions with the 5′ DAR is less clear. In fact, 3′ DAR mutations would rather influence the stability and/or the formation of sHP, and this may explain their overall impact on viral genome replication. Friebe and Harris have hypothesized that the 5′-3′ DAR interaction might not be needed to make the 3′ UAR accessible to the 5′ UAR. Instead, they propose that the UAR, CS, DAR and cHP sequences constitute a functional unit essential for the circularization of vRNA (Friebe and Harris, 2010). In addition to its role in start codon selection during translation (see above), the cHP structure has also been shown to be important for DENV and WNV vRNA synthesis in a sequence-independent manner. How cHP influences vRNA synthesis remains to be determined; however, one possibility is that, within this functional unit, it contributes to the formation and/or stabilization of the vRNA panhandle structure (Clyde et al., 2008).
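
At the sequence level, cyclization elements such as the 5′-3′ CS pair are stretches near the 5′ end whose reverse complement occurs near the 3′ end. As a rough illustration, the Python sketch below scans two regions for perfectly complementary stretches of at least 8 nucleotides (the minimal CS length mentioned above). The sequences, the function names, and the restriction to uninterrupted Watson-Crick complementarity are simplifying assumptions; identifying real cyclization elements requires structural and functional evidence such as that described in this section.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return rna.translate(COMPLEMENT)[::-1]


def complementary_stretches(five_prime: str, three_prime: str, min_len: int = 8):
    """Return (start_in_5p, start_in_3p, stretch_in_5p) for each maximal perfect match.

    Positions are 0-based offsets within the supplied regions.
    """
    target = reverse_complement(three_prime)
    n, m = len(five_prime), len(target)
    hits = []
    prev = [0] * (m + 1)  # run lengths for the previous row of the comparison
    for i in range(1, n + 1):
        curr = [0] * (m + 1)
        for j in range(1, m + 1):
            if five_prime[i - 1] == target[j - 1]:
                curr[j] = prev[j - 1] + 1
                # Keep only runs that cannot be extended further to the right.
                at_edge = i == n or j == m
                if curr[j] >= min_len and (at_edge or five_prime[i] != target[j]):
                    length = curr[j]
                    start_5p = i - length
                    start_3p = m - j  # partner stretch start in three_prime
                    hits.append((start_5p, start_3p, five_prime[start_5p:i]))
        prev = curr
    return hits


# Hypothetical regions carrying one 11-nt complementary element each.
five_prime_region = "AAAA" + "GACUUAGCCAU" + "AAAA"
three_prime_region = "CCCC" + "AUGGCUAAGUC" + "CCCC"

for start_5p, start_3p, stretch in complementary_stretches(five_prime_region, three_prime_region):
    partner = three_prime_region[start_3p:start_3p + len(stretch)]
    print(start_5p, stretch, "pairs with", start_3p, partner)
```

A scan of this kind would flag candidate pairs only. As noted above for the 5′-3′ CS interaction, sequence complementarity can be necessary without being sufficient for circularization.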

A sequence present downstream of the 5′ CS element in the capsid-coding sequence, called "downstream CS" (dCS), is important for flavivirus replication and impacts genome circularization by modulating the topology of the 5′ end. In line with this, changes in the dCS sequence composition affect the formation of the 5′-3′ long-range RNA-RNA interactions (Friebe et al., 2012). Moreover, an RNA motif termed "downstream of 5′ CS pseudoknot" (DCS-PK) also enhances replication in BHK-21 cells by regulating circularization (Liu et al., 2013). This tertiary interaction localizes to the capsid-coding region and appears to be constituted by a three-stem pseudoknot structure. Disruption of the DCS-PK structure hinders the ability of the 5′ RNA to bind the 3′ RNA, while the rescue of the DCS-PK structure restores the formation of this 5′-3′ interaction. It was proposed that both dCS and DCS-PK contribute to the function of the cyclization unit containing the 5′ UAR, 5′ CS, 5′ DAR, and cHP. In this model, the DCS-PK sequence might help this unit to adopt specific conformations which favor genome circularization.

A structure present downstream of SLA in the 5′UTR has been recently identified as an important riboswitch, which controls the equilibrium between NS5 recruitment to SLA and the circularization of the vRNA. This motif, named "5′ UAR-flanking stem" (UFS), is a U-rich region located in SLB that promotes the formation of a conserved duplex RNA (Liu et al., 2016). The conformation of UFS is critical for the recruitment of NS5 to SLA and the SLA-dependent initiation of RNA synthesis. Indeed, mutations disrupting UFS result in a decrease in NS5 binding to the 5′UTR in vitro and consequently in a decrease in replication. In contrast, increasing the stability of the UFS impedes vRNA circularization. If UFS is too stable, this could result in a "locked-up" conformation of the UAR sequence, which is known to be implicated in long-range RNA-RNA interactions. Consistently, the circularization of vRNA induced the melting of the UFS structure, resulting in a decrease in the affinity of NS5 for the 5′ end of the vRNA. These data support a model in which the UFS functions as a riboswitch during RNA replication, which dictates vRNA circularization and NS5 recruitment. Following the binding of NS5 to SLA, the circularization of the vRNA would induce a disruption of the UFS structure, leading to a decrease in the affinity between NS5 and the 5′UTR. This would favor NS5 transfer to the 3′UTR, hence properly positioning the polymerase for negative-strand RNA synthesis.

#### Viral Replication Factories

A striking feature of flaviviral infections is the appearance of organelle-like membranous replication factories (RF) resulting from severe alterations of ER membranes. The detailed three-dimensional architecture of RFs from several flaviviruses has been reconstructed using electron tomography (Welsch et al., 2009; Gillespie et al., 2010; Miorin et al., 2013; Junjhon et al., 2014; Bily et al., 2015; Cortese et al., 2017). RFs comprise several sub-structures, namely vesicle packets (VP), convoluted membranes (CM), and virus bags (VB), which are morphologically different and can be found within the same ER network.

Vesicle packets are spherical vesicles which are induced by invaginations of the ER (**Figure 2**). They show similar morphology in both mosquito and mammalian cells, suggesting that their biogenesis relies on evolutionarily conserved host machineries and pathways. In mammalian cells, their diameter is approximately 90 nm and they are connected to the cytoplasm by a 10 nm-wide pore. Interestingly, it was shown that in the case of WNV and TBEV, vesicles within the same ER cisternae are also connected to each other by pore-like openings, suggesting that they exchange material (Gillespie et al., 2010; Offerdahl et al., 2012; Miorin et al., 2013). The determinants of both types of pores are completely unknown. Immunogold labeling combined with electron microscopy has revealed that VPs contain dsRNA (the replication intermediate) as well as several viral non-structural proteins absolutely required for replication such as NS5, NS3, NS1, NS4A and NS4B (Welsch et al., 2009; Miorin et al., 2013; Junjhon et al., 2014). Hence, it is strongly believed that vRNA synthesis takes place in this compartment. Nevertheless, it remains unclear if VPs are absolutely required for replication or if other ER subcompartments can support replication. Furthermore, it is also thought that VPs constitute an environment favorable to vRNA synthesis. Indeed, they may play a role in protecting the vRNA from degradation by nucleases or recognition by cytosolic sensors of RNA, dampening the activation of antiviral signaling pathways. Finally, VPs would allow the concentration of metabolites, as well as cellular and viral factors required for efficient vRNA synthesis. However, these models remain to be experimentally validated.

Convoluted membranes are large reticular structures enriched in NS2B/3, NS4A and NS4B that are induced by membrane curvature and morphologically resemble tight accumulations of smooth ER membranes (Westaway et al., 1997; Miller et al., 2007; Welsch et al., 2009; Chatel-Chaix et al., 2016). The exact role of CMs is not well understood; however, they have been recently proposed to modulate cellular processes such as innate immunity or inter-organellar communication in order to create a proviral cytoplasmic environment rather than to directly regulate vRNA synthesis per se (Chatel-Chaix et al., 2016).

Newly assembled virions accumulate in regular arrays into VBs which are dilated ER cisternae (Welsch et al., 2009; Cortese et al., 2017). VPs and VBs may be found in close proximity within the same ER network, which contains ribosomes on its cytosolic side. This suggests that RFs provide a platform for the transfer of viral genomes between replication complexes, ribosomes and assembling virus particles. Moreover, this confers a spatial segregation of the different vRNA-containing complexes allowing the coordination of vRNA translation, replication and encapsidation in both space and time.

#### Trans Co-factors Involved in vRNA Replication

All flaviviral NS proteins are absolutely required for vRNA synthesis (Apte-Sengupta et al., 2014; Selisko et al., 2014); yet, only NS5 and NS3 possess enzymatic activities. The transmembrane proteins NS1, NS4A and NS4B are believed to be implicated in the formation of RFs. Notably, when transiently expressed alone, DENV and WNV NS4A are able to induce to some extent the formation of CMs in Huh7 and Vero cells, respectively (Roosendaal et al., 2006; Miller et al., 2007). NS1 was shown to alter liposome membranes in vitro (Akey et al., 2014). Considering that all of the NS proteins physically and/or genetically interact, it is tempting to speculate that they act synergistically to coordinate the different steps of vRNA replication. For instance, the interaction between NS5 and NS3 seems to be important to functionally couple vRNA synthesis and dsRNA unwinding (Yu et al., 2013; Tay et al., 2015). Moreover, the DENV NS3 helicase domain associates with the cytosolic loop of the NS4B transmembrane protein (Umareddy et al., 2006; Chatel-Chaix et al., 2015; Zou et al., 2015a). NS4B mutants that lose the capacity to interact with NS3 are defective in replication, suggesting that the integrity of this complex is critical for the DENV life cycle. Interestingly, NS4B was shown to promote the dissociation of NS3 from single-stranded RNA (Umareddy et al., 2006), implying that it would indirectly stimulate the recruitment of the helicase toward newly formed replication intermediates and would promote their unwinding. Finally, NS4B homodimerizes and interacts with both NS1 and NS4A (Youn et al., 2012; Zou et al., 2014, 2015b; Chatel-Chaix et al., 2015; Li et al., 2015). This further supports the idea that protein-protein interactions coordinate the activity of the replication complexes with a precise ER membrane topology within VPs.

Finally, numerous cellular RNA-binding proteins have been reported to interact with flaviviral vRNA and to modulate genome replication. Some examples are listed in **Table 1**. While vRNA binding motifs have been identified in some studies, the precise molecular mechanisms by which these proteins modulate the viral life cycle remain unclear in most cases. Some proteins show specificity for vRNA motifs. For example, TIA-1 and TIAR interact with negative-strand RNA 3′ SL of WNV and the knockout of these proteins decreases viral titers implicating these interactions in efficient viral replication (Li et al., 2002). In contrast, DDX6 and NF90 modulate DENV replication presumably through their interaction with the DB and 3′ SL structures in the 3′ UTR of the vRNA, respectively (Gomila et al., 2011; Ward et al., 2011). Finally, the isoform p45 of host protein AUF1 (also named hnRNP D) was reported to positively regulate the replication of WNV, DENV and ZIKV in Huh7 cells by promoting vRNA circularization. Indeed, AUF1 destabilizes SLB and the 3′ SL thereby exposing the UAR circularization elements (Friedrich et al., 2014, 2018) (**Figure 1B**). This illustrates that host factors are able to impact viral genome plasticity and to regulate important riboswitches in the flaviviral vRNA.

#### vRNA PACKAGING

During virus assembly, the vRNA genome must be encapsidated into newly synthesized viral particles. This step of the flavivirus life cycle, which is a prerequisite for full infectivity, is one of the least characterized and understood at the molecular level. Our understanding of the vRNA packaging process remains limited because the (presumably short-lived) assembly intermediates have not been identified. Moreover, comprehensive studies of the intracellular distribution of the different vRNA populations are lacking.

TABLE 1 | Host RNA-binding proteins involved in flavivirus life cycle.

For each cellular co-factor, the virus(es), the known RNA-binding site(s) and the regulated step(s) of the viral life cycle are indicated. For simplification purposes, an indicated role in translation does not exclude an impact on vRNA stability. (-)RNA, Minus strand viral RNA.

The vRNA encapsidation process must face several "challenges" that, intuitively, have to be tightly regulated. First, this process must be specific. Only vRNA is encapsidated while cellular RNA and viral negative-strand intermediate RNA must be excluded from the capsid. Second, the stoichiometry of the viral genome inside the virion (i.e., vRNA copy number per virion) is important for optimal infectivity (Kuhn et al., 2002; Byk and Gamarnik, 2016). The highly basic viral capsid protein binds with high affinity to the negatively charged vRNA in what is presumed to be a rather non-specific manner through electrostatic interactions (Pong et al., 2011; Byk and Gamarnik, 2016). However, as capsid molecules far outnumber the copies of vRNA in the virion, vRNA packaging must be regulated to achieve optimal vRNA intraviral stoichiometry (presumably one genome copy per virion) and full infectivity. In contrast to several other viruses like HIV-1 (Comas-Garcia et al., 2016), no bona fide RNA packaging signal has been identified for members of the Flavivirus genus. If packaging signals do exist, they are most likely to be located within the highly structured untranslated regions. Indeed, in the case of the related Hepacivirus HCV, the 3′ UTR was shown to be important for RNA trans-encapsidation, although mechanistic details are still being characterized (Shi et al., 2016). Identifying putative flaviviral packaging signals remains challenging because, if located in the UTRs, they are likely to overlap with motifs important for translation and vRNA synthesis. Hence, mutation of these putative motifs may also affect viral replication (and indirectly downstream virus assembly), making it difficult to functionally segregate replication from vRNA packaging. Nevertheless, one study has identified a cis-acting RNA motif that influences virus assembly.
Using a silent mutagenesis approach, Groat-Carmona and colleagues demonstrated that the conserved DENV "capsid-coding region 1" (CCR1) influences the production of infectious particles in both insect C6/36 and mammalian BHK-21 cells without affecting vRNA stability, translation and synthesis (Groat-Carmona et al., 2012). Importantly, DENV replication as well as dissemination from the mid-gut to the salivary glands in the mosquito vector relied on the integrity of the CCR1 structure, highlighting the importance of this RNA motif in vivo. Since CCR1 mutations resulted in a drastic reduction in infectious titers without affecting the levels of extracellular vRNA, a contribution of CCR1 to vRNA packaging was ruled out by the authors and its exact role in virus assembly is still unknown. Nevertheless, a putative reduction of vRNA packaging may have been masked by the presence of non-encapsidated newly synthesized vRNA in the cell supernatants, which could have been released from the cell within exosomes in a viral assembly-independent manner. Hence, a possible role of CCR1 in vRNA packaging should likely be re-evaluated. As discussed above (see vRNA replication), the structural dynamics of the vRNA itself allow it to orchestrate the different steps of vRNA replication including vRNA circularization, NS5 binding and RNA synthesis. Considering that some vRNA domains can function as riboswitches, it is tantalizing to speculate that conformational changes in vRNA secondary and tertiary structures drive vRNA transfer from replication complexes in VPs to assembling virions. Moreover, the methylation status of the vRNA might contribute to determining its fate. Indeed, it was recently shown that DENV, WNV, YFV, ZIKV and HCV vRNAs are N<sup>6</sup>-methylated on adenosines by the host methyltransferases METTL3 and METTL14 in infected cells (Gokhale et al., 2016; Lichinchi et al., 2016).
Very interestingly, N6A-methylated ZIKV vRNA is associated with cellular YTHDF proteins that inhibit infectious particle production (Lichinchi et al., 2016). In the case of HCV, the same inhibition is observed and it correlates with the redistribution of YTHDF proteins to lipid droplets (the virus assembly site) while this did not influence the vRNA replication process. Thus, this strongly suggests that N6A methylation specifically regulates virus assembly (Gokhale et al., 2016). Based on these results and the possible conservation across the Flaviviridae family, one might hypothesize that only vRNA molecules that are not N6A-methylated are packaged into assembling viruses. In addition, it is reasonable to consider that the methylation of vRNA influences its folding and hence, its functions during the different steps of the viral life cycle. Such hypotheses will likely be challenged in future studies.

Although no vRNA packaging signal has been identified, it is well established that trans-encapsidation is possible for flaviviruses. Indeed, when structural proteins are expressed in trans, they form virus-like particles that can encapsidate subgenomic replicons, i.e., replication-competent genomes that express only NS proteins (Khromykh et al., 1998; Ansarah-Sobrinho et al., 2008; Qing et al., 2010; Suzuki et al., 2014; Scaturro et al., 2015). The resulting trans-complemented particles are infectious and are able to undergo a single round of infection. Interestingly, the structural proteins are able to encapsidate genomes from other flaviviruses (Yoshii et al., 2008; Shustov and Frolov, 2010; Suzuki et al., 2014). This strongly suggests that the cis RNA and trans protein determinants of the vRNA packaging process are conserved across the Flavivirus genus. However, it remains elusive how the flaviviral genome is specifically selected for encapsidation. Like HCV core protein, flaviviral C protein accumulation on lipid droplets is important for the generation of infectious virus particles (Miyanari et al., 2007; Samsa et al., 2009; Carvalho et al., 2012; Martins et al., 2012; Iglesias et al., 2015). However, this pool of structural proteins may represent a storage compartment for assembly competent capsid rather than the actual site of genome selection and particle assembly. Interestingly, early studies on YFV and Murray Valley encephalitis virus (MVEV), another flavivirus, have highlighted that polyprotein processing and virus morphogenesis are functionally linked (Lobigs, 1993; Lee et al., 2000; Lobigs and Lee, 2004; Lobigs et al., 2010). Indeed, uncoupling these two processes by introducing mutations altering the processing kinetics of the signal peptide between capsid and prM, critically impaired nucleocapsid envelopment and the production of infectious viral particles. 
Most strikingly, several independent ultrastructural studies on DENV and ZIKV based on 3D reconstruction of replication factories revealed structures budding into the ER and juxtaposed to the pore of the VPs, the presumed site of vRNA replication (see above and **Figure 2**) (Welsch et al., 2009; Junjhon et al., 2014; Cortese et al., 2017). This pore was observed in 90% of DENV VPs and is homogenous in size (diameter of ∼10 nm) (Welsch et al., 2009). In the case of WNV, the RNA inside the VP is aligned with the pore (Gillespie et al., 2010). However, nothing is known about its morphogenesis and dynamics, although the viroporin activity of VP-associated NS2A, NS2B and NS4B transmembrane proteins might participate in this process (Chang et al., 1999; Leon-Juarez et al., 2016; Shrivastava et al., 2017). In addition, the juxtaposed budding structures have the size of assembled virions and contain an electron-dense core which may correspond to a vRNA-containing capsid. Based on this observation, it is tempting to speculate that the vRNA replication and packaging processes are coordinated in time and space. The newly synthesized positive-strand genome molecule would exit the VP through the pore and be directly encapsidated into budding virions, hence conferring the selectivity of genome encapsidation. Such a coordinated model implies that the replication process and/or the presence of VPs would be required for vRNA encapsidation and envelopment by the ER membrane. This is further supported by early studies showing that replication is required for virus production in BHK-21 cells (Khromykh et al., 2001b). Indeed, DNA-launched WNV replication-deficient genomes fail to generate extracellular viral particles despite the presence of vRNA and structural proteins. It should be noted that replication is not required for the formation of sub-viral particles (i.e., devoid of the viral genome and non-infectious) whose budding can occur upon expression of prM/E alone (Schalich et al., 1996; Wang et al., 2009). Thus, it would be interesting to analyze the cellular RNA content of sub-viral particles since it is not known if they contain non-specifically enveloped cellular RNA or if they are free of nucleic acids.
Importantly, it remains unclear how budding structures are physically juxtaposed to VPs and whether this event is absolutely required for the production of fully infectious virus. In addition to their critical function in replication, ZIKV and DENV NS1 were recently demonstrated to be important for both virus assembly and release (Scaturro et al., 2015; Yang et al., 2017). Consistently, ultrastructural studies have demonstrated that a fraction of DENV NS1 is associated with virions. Interestingly, mutants of the NS1 β-ladder domain lost their ability to indirectly associate with capsid while their interaction with glycoproteins E and prM was maintained (Scaturro et al., 2015). This suggests that NS1 might assist the specific envelopment of capsid/vRNA complexes in structures budding into the ER (**Figure 2**). Additionally, the expression level of WNV NS1', an alternative larger form of NS1 resulting from a translational frameshift, was shown to influence the specific infectivity in trans-complementation experiments in BHK-21 cells (Winkelmann et al., 2011). In addition, NS2A also has an influence on both RNA replication and viral particle production (Liu et al., 2003; Leung et al., 2008; Vossmann et al., 2015; Wu et al., 2015; Xie et al., 2015; Yang et al., 2017). Finally, NS3 has specific functions during particle assembly independently from its enzymatic functions. Indeed, the W349A mutation in YFV NS3 impacted infectious particle production while vRNA replication and the release of sub-viral particles remained unaffected (Patkar and Kuhn, 2008). Very interestingly, DENV NS3 helicase domain was shown to possess an RNA annealing activity in vitro (Gebhard et al., 2012), implying that it can influence the conformation of vRNA in infected cells. This suggests that through specific interactions with the vRNA (Swarbrick et al., 2017), NS3 might promote the exposure of a putative packaging motif and directly regulate genome encapsidation and/or capsid envelopment. This regulation might also involve a contribution of assembling virions since WNV capsid protein, similarly to NS3, possesses an RNA chaperoning activity in vitro (Ivanyi-Nagy and Darlix, 2012). Taken together, all these findings support the idea that viral proteins bring together replication and assembly complexes to orchestrate an efficient and selective vRNA encapsidation process.

Host factors may also play a role in the tight regulation of vRNA packaging during virus assembly. Several cellular RNA-binding proteins have been reported to associate with the flaviviral genome, mostly through the UTRs, and to be important for the viral life cycle (**Table 1**). In most studies, the authors did not identify the exact step controlled by their candidate protein or may not have considered vRNA packaging in the analyses. Interestingly, the RNA-binding protein DDX56 appears to be important for the production of infectious WNV particles, but not for vRNA replication, strongly suggesting that it acts during vRNA selection for encapsidation (Xu et al., 2011; Xu and Hobman, 2012; Reid and Hobman, 2017). Nevertheless, while virions released from DDX56-depleted cells contained less encapsidated vRNA, a DDX56-vRNA interaction remains to be demonstrated.

Several of the identified flaviviral replication co-factors such as YBX1, hnRNP K, DDX6 or DDX3 were reported to also associate with the genome of HCV (Ariumi et al., 2007; Paranjape and Harris, 2007; Jangra et al., 2010; Chatel-Chaix et al., 2011; Chahar et al., 2013; Chatel-Chaix et al., 2013; Li et al., 2014; Brunetti et al., 2015; Poenisch et al., 2015; Phillips et al., 2016). Those host factors are components of the same ribonucleoprotein complex (RNP) (Vashist et al., 2012; Chatel-Chaix et al., 2013; Upadhyay et al., 2013) and some of them have been reported to regulate the equilibrium between HCV RNA replication and the production of infectious viral particles, suggesting that they control the transfer of vRNA from replication to assembly complexes (Chatel-Chaix et al., 2011; Chatel-Chaix et al., 2013). Whether the viral co-opting of this host RNP is conserved across the Flaviviridae family will have to be evaluated in the future. Nonetheless, it is likely that flaviviruses, as obligatory intracellular parasites, hijack the function of several host RNA-binding proteins during vRNA encapsidation. One can envisage that such co-opting would influence or be modulated by the various 3D structures and modifications of the vRNA. Interestingly, several Flaviviridae vRNA-binding proteins, such as hnRNP C, hnRNP A2/B1 and RBMX (see **Table 1**), were shown to have enhanced affinity for N6A-methylated RNAs whose local conformation is changed by this modification (Alarcon et al., 2015; Liu et al., 2015, 2017). This suggests a functional link between vRNA modifications, riboswitches and riboproteomic profiles. Thus, integration of all currently known models will likely help to provide a clearer understanding of how flaviviruses control genome selection for encapsidation.

### FLAVIVIRAL RNA AND INNATE IMMUNITY

### vRNA and Pattern Recognition Receptors

During viral entry or RNA amplification, flaviviral RNA can be sensed as foreign RNA by the cell and trigger antiviral innate immunity in mammalian cells. This first line of defense involves RNA sensors that, once activated, trigger a signaling cascade leading to the production of interferons (IFN) and interferon-stimulated genes (ISG). ISGs are antiviral effectors that in some cases, specifically target vRNA, may be secreted as proinflammatory cytokines or generate an overall antiviral state to impede virus replication (Adachi et al., 1998). Pattern recognition receptors (PRR) such as Toll-like receptor 3 (TLR3), retinoic acid-inducible gene I (RIG-I) as well as melanoma differentiation-associated protein 5 (MDA5) are expert sensors of highly structured viral RNAs or dsRNA, and consequently, are implicated in antiflaviviral host responses (Loo et al., 2008; Nasirudeen et al., 2011; Sprokholt et al., 2017).

TLR3 is a member of the Toll-like receptor family and plays a crucial role in activation of the immune response by recognition of dsRNA in endosomes, presumably during viral entry (Leifer and Medvedev, 2016; Gao and Li, 2017). TLR3 recognizes DENV RNA in infected cells and its overexpression or stimulation reduces viral replication (Tsai et al., 2009; Liang et al., 2011). In a pathological context, TLR3 knockout mice are more susceptible to lethal WNV infection (Daffis et al., 2008).

RIG-I belongs to the RIG-I-like receptor (RLR) family and possesses a dsRNA helicase activity. It is a cytosolic PRR that specifically targets dsRNA and the 5′ tri/diphosphate moiety of short structured uncapped RNAs (Yoneyama et al., 2004; Pichlmair et al., 2006; Takahasi et al., 2008; Goubau et al., 2014). It has also been shown to have affinity for the polyuridine tract of the HCV 3′ UTR (Schnell et al., 2012). Such sequences are very unusual in cellular RNAs and hence constitute foreign signatures. Interestingly, treatment of cells or mice with U-rich 5′ppp-based RIG-I agonists protects against infection with a variety of viruses (Chiang et al., 2015). MDA5 is another RLR family member related to the RIG-I protein. MDA5 targets long viral dsRNAs and activates the same innate antiviral response as RIG-I (Schlee, 2013). Once activated by RNA recognition, RIG-I and MDA5 interact with "mitochondrial antiviral-signaling protein" (MAVS) at the surface of mitochondria through their CARD domains. This interaction results in a signaling cascade to the nucleus via transcription factors NF-κB and IRF3. This ultimately leads to the induction of type I IFN, proinflammatory cytokines and ISGs expression (Gack and Diamond, 2016).

More recently, it has been shown that the cyclic GMP-AMP synthase (cGAS)/stimulator of IFN genes (STING) pathway, which normally senses DNA virus infection and mitochondrial DNA damage, is also activated upon RNA virus infection (including DENV and WNV) and induces type I IFN production (Schoggins et al., 2014). The vRNA sensing mechanism is still not well understood but DENV-induced mitochondrial damage may be involved in this process (Aguirre et al., 2017; Sun et al., 2017). Moreover, the relevance of this pathway to flavivirus infection is further highlighted by several lines of evidence that DENV NS2B and NS3 are able to counteract the functions of cGAS and STING, respectively (Aguirre et al., 2012, 2017; Yu et al., 2012).

### Flaviviral RNA-Based Evasion From the Innate Immune System

From a virus-host co-evolution viewpoint, the mammalian innate immune response has evolved to counteract viral infection. Of course, this also implies an adaptation from the pathogens in order to evade innate immunity to the benefit of replication. To this end, several interference mechanisms involving flaviviral proteins have been described over the last decade. Indeed, these viruses can dampen the antiviral signaling pathways by inhibiting for instance, cGAS, STING, RIG-I, MAVS, TBK1 and STAT1/2 functions through interactions with NS5, NS3, NS2B or NS4B (Chatel-Chaix et al., 2016; Gack and Diamond, 2016; Aguirre et al., 2017; Miorin et al., 2017). In addition, the viral genome itself and its degradation by-products also contribute to the efficient evasion of innate immunity. Firstly, as mentioned above, vRNA most likely replicates within VPs, constituting a confined environment providing limited access to cytosolic vRNA sensors. Hence, from an ultrastructural perspective, it is tempting to speculate that VPs "hide" vRNA and the dsRNA replication intermediate from the innate immune detection machinery. Nevertheless, such models remain to be addressed specifically.

#### vRNA Methylation and Innate Immunity

Interestingly, several studies have shown that vRNA modifications may confer on the vRNA the ability to be marked as "self" and evade recognition by host sensors of foreign RNA. Indeed, in addition to vRNA capping, NS5 also possesses a 2′-O-methyltransferase activity (Bradrick, 2017). 2′-O-methylation is an RNA modification on the first and second nucleotides of the mRNA cap structures in which the ribose is methylated at the 2′-OH position by cellular nucleoside 2′-O-methyltransferases (MTase), contributing to form cap 1 (m7GpppNm) or cap 2 (m7GpppNmNm) structures (Furuichi and Shatkin, 2000). Hence, through NS5-mediated 2′-O-methylation of its cap, the flaviviral vRNA mimics cellular mRNAs. Moreover, DENV and WNV NS5 proteins were demonstrated to also perform internal RNA methylation on vRNAs that lack the 5′ cap structure (Dong et al., 2012). In this case, these modifications occur specifically at the 2′-OH position of adenosine residues.

By mimicking cellular mRNA, modified vRNAs appear to evade the host immune response during infection. Indeed, DENV 2′-O-MTase-deficient viruses are severely attenuated and do not properly spread in cell lines that possess a functional IFN response system (like lung carcinoma A549 cells) (Schmid et al., 2015; Chang et al., 2016). In the case of WNV, a virus expressing the NS5 E218A mutant which lacks the 2′-O-MTase activity is attenuated in primary cells and mice with a strongly reduced pathogenicity including the complete loss of virus-induced lethality (Daffis et al., 2010). Importantly, the pathogenicity of this mutant virus in vivo was restored in mice harboring a deficiency in type I interferon signaling. This strongly supports the idea that 2′-O-methylation is crucial to evade the type I IFN-dependent antiviral response. Notably, mutant and wild-type viruses induce comparable levels of IFNs, suggesting that WNV vRNA sensing by RLR or TLR3 is not involved in this evasion strategy. Importantly, this methylation-dependent antiviral effect was attributed to IFN-induced proteins with tetratricopeptide repeats (IFIT). More specifically, replication and pathogenicity of WNV E218A mutant virus was rescued and comparable to wild-type virus in Ifit1 knockout mice, demonstrating the key role of this ISG in antiviral immunity (Daffis et al., 2010). As compared to other IFITs, IFIT1 recognizes with high specificity RNAs lacking a 2′-O-methylated cap. This results in the sequestration of these RNAs from translation initiation factors and consequently, in the inhibition of their translation (Habjan et al., 2013; Kimura et al., 2013; Kumar et al., 2014). Nevertheless, Ifit1 deficiency did not rescue the replication of WNV E218A in brain endothelial cells in contrast to other cell types of the central nervous system (Szretter et al., 2012). Consistently, overexpression of IFIT1 in 293-DC-SIGN cells only partially inhibited the replication of DENV2 2′-O-MTase mutant (Zust et al., 2013). This highlights that the 2′-O-methylation of vRNA allows evasion from innate immunity and relies on both IFIT1-dependent and IFIT1-independent mechanisms according to the cell type.

Interestingly, the role of virus-mediated 2′-O-methylation as a countermeasure against innate immunity has also been recognized in mouse and human coronaviruses. In this case, 2′-O-MTase-deficient viruses induced a stronger type I IFN response resulting in attenuation of viral replication (Zust et al., 2011; Schmid et al., 2015). However, coronaviral replication was restored upon suppression of type I IFN receptor (IFNAR) or cytosolic RNA sensor MDA5 expression suggesting that 2′-O-methylation of the coronavirus RNA directly evades early RNA sensing by MDA5. Whether the same strategy is also employed by flaviviruses (other than WNV) remains to be elucidated. Interestingly, a recent study has demonstrated that a DENV E216A 2′-O-MTase-deficient mutant induced an early innate immune response after just a few hours of infection, consistent with a putative detection of unmethylated vRNA by RLRs such as MDA5 or RIG-I (Chang et al., 2016).

Since 2′-O-methylation is important for optimal viral replication and also has structural similarities among flaviviruses, it was proposed that 2′-O-MTase-deficient viruses could be exploited as attenuated vaccines. Indeed, several groups have engineered attenuated DENV or JEV that lack 2′-O-methylation activity and are thus more sensitive to IFN inhibition than parental viruses (Li S.H. et al., 2013; Zust et al., 2013). Robust humoral and cellular immune responses protecting against both viruses were obtained after inoculation of mice with these attenuated viruses. In the case of DENV, protection was also achieved in rhesus macaques after a single administration of the vaccine candidate (Zust et al., 2013). These results pinpoint the potential success of such attenuated vaccine-based approach, which may be efficacious against a wide range of flaviviruses.

#### The Action of sfRNA Against Innate Immunity

During the infection, the accumulation of viral genome generates several by-products which do not encode any viral proteins. Three classes of non-coding RNAs have been described to date: viral small RNAs (vsRNAs) (Hamilton and Baulcombe, 1999), defective interfering genomes (DIGs) (Li and Brinton, 2001; Pesko et al., 2012; Juarez-Martinez et al., 2013) and most relevant to this review, the subgenomic flavivirus RNA (sfRNA) (Urosevic et al., 1997; Pijlman et al., 2008). DIGs and vsRNAs are not well described yet and remain to be characterized in detail. In contrast, the sfRNA has been comprehensively investigated during the last decade, especially with regards to its role in modulating host biological processes.

Produced by all tested members of the Flavivirus genus, sfRNA is a highly structured 0.3–0.7 kb-long non-coding RNA and apparently is the most abundant viral RNA species in the infected cell (Pijlman et al., 2008; Bidet et al., 2014; Manokaran et al., 2015; Akiyama et al., 2016; Donald et al., 2016; Bidet et al., 2017). It is well established that sfRNA is produced by an incomplete 5′-3′ degradation of the viral genome by the cellular XRN1/Pacman exonuclease. During RNA degradation, XRN1 is stalled at the 3′ UTR extremity, more precisely at stem-loops/pseudoknots of the highly variable region upstream of the DB structures, causing the accumulation of different species of sfRNA (Pijlman et al., 2008; Funk et al., 2010; Silva et al., 2010; Chapman et al., 2014a,b; Akiyama et al., 2016). While XRN1 is required for sfRNA production, sfRNA is also able to sequester this cellular protein and to inhibit its endogenous functions (Silva et al., 2010; Moon et al., 2012; Chapman et al., 2014a). In 293T cells, this results in the accumulation of uncapped cellular mRNAs in the cytosol (Moon et al., 2012). However, the consequences of such inhibition are still unclear and remain to be further deciphered.

Despite its high levels in the cytosol, sfRNA does not appear to play a direct role in replication since mutations impairing sfRNA production do not affect vRNA synthesis in WNV-, YFV-, and DENV-infected cells (Funk et al., 2010; Schuessler et al., 2012; Szretter et al., 2012). Rather, sfRNA contributes to viral cytopathicity both in cellulo and in vivo partly by interfering with innate immune responses. For instance, in the case of WNV and DENV infection, mutations inhibiting sfRNA production lead to a decrease in viral replication in cells that possess functional type I IFN responses, supporting the idea that sfRNA aids in evasion of the IFN response (Schuessler et al., 2012; Bidet et al., 2014, 2017). Moreover, a recent study shows that DENV sfRNA negatively impacts IFN induction through the inhibition of TRIM25, an E3 ubiquitin ligase required for RIG-I activation (Manokaran et al., 2015). Indeed, the interaction of sfRNA with TRIM25 in a sequence-dependent manner prevents its deubiquitylation. As a result, the decrease in IFN induction is consistent with an impairment in TRIM25-mediated polyubiquitylation and subsequent activation of RIG-I. Consistent with a conservation of this evasion strategy among flaviviruses, it was shown that JEV sfRNA overexpression in infected A549 cells inhibits IRF3 phosphorylation and its nuclear translocation that are required for type I IFN transcription (Chang et al., 2013). Finally, the RLR-dependent IFN induction pathway was also inhibited upon ZIKV sfRNA overexpression in stimulated cells (Donald et al., 2016).

Downstream of IFN production and signaling, the sfRNA also plays a role in ISG expression at the post-transcriptional level. Indeed, host RNA-binding proteins G3BP1, G3BP2 and CAPRIN, which are involved in ISG translation, are inhibited by their association with DENV sfRNA in Huh7 cells (Bidet et al., 2014). G3BP proteins are core components of SGs (Anderson and Kedersha, 2006) whose formation and functions are modulated by flavivirus infection, as discussed above. Hence, it is tempting to speculate that, through the hijacking of SG components, flaviviruses remodel the host proteome by positively or negatively regulating the expression of pro- and anti-viral host proteins, respectively.

Interestingly, several studies have shown that sfRNA also plays an important role in flaviviral life cycle and dissemination in infected insects (Schnettler et al., 2012; Moon et al., 2015; Pompon et al., 2017). The determinants of the viral genome governing the abundance of sfRNA appear to be the same in insect and mammalian cells; however, in mosquitos, the sfRNA causes disruption of the innate immune response in salivary glands by inhibiting the Toll receptor pathway (Pompon et al., 2017). Furthermore, sfRNA downregulates the RNA interference (RNAi) machinery, the main mediator of innate immunity in insects (Schnettler et al., 2012; Moon et al., 2015). Mutations in DENV and WNV decreasing the production of sfRNA showed an impairment in RNAi suppression (Moon et al., 2015). This appears to be mediated by the association of sfRNA with Dicer and Ago2, two essential proteins of the RNAi machinery. Taken together, these data suggest that the sfRNA crucially contributes at multiple levels to viral evasion from innate immunity in both arthropod and mammalian hosts.

#### sfRNA-MEDIATED MODULATION OF PATHOGENESIS

In addition to its roles in innate immune evasion, the sfRNA was shown to be important for WNV and DENV pathogenicity. WNV or DENV genomes harboring mutations that disrupt the formation of full-length sfRNA produced much smaller plaques in cell culture (Pijlman et al., 2008; Liu et al., 2014). Consistently, drastic decreases in overall cell death and apoptosis were observed. These phenotypes were rescued by the expression of the sfRNA in trans. However, sfRNA overexpression alone did not induce any cell death, implying that its action requires flavivirus replication. Importantly, viral translation, vRNA synthesis and particle production were not significantly affected by the loss of sfRNA expression. Overall, this strongly suggests that sfRNA is not essential for flavivirus replication but rather modulates cytopathic effects in addition to innate immunity. Interestingly, DENV sfRNA-mutated viruses were unable to inhibit Bcl2 and the AKT/PI3K pro-survival pathways, suggesting that flavivirus-induced cytopathic effects rely on the modulation of these signaling cascades (Liu et al., 2014). Most importantly, mice infected with full-length sfRNA-deficient WNV all survive, in contrast to the usual 100% mortality rate with wild-type WNV (Pijlman et al., 2008). This did not correlate with defects in virus dissemination in the brain and spleen, confirming that sfRNA is crucial for pathogenicity in vivo without directly regulating viral replication. In stark contrast, overexpression of JEV sfRNA decreased virus-induced apoptosis in infected A549 cells (Chang et al., 2013). While it is clear that sfRNA is crucial for WNV pathogenicity and that all tested flaviviruses produce sfRNA (Pijlman et al., 2008; Bidet et al., 2014, 2017; Manokaran et al., 2015; Akiyama et al., 2016; Donald et al., 2016), their respective contributions to pathogenesis remain to be addressed. Finally, how sfRNA modulates the flavivirus-induced cytopathic effects at the molecular level is completely unknown. It will be interesting to determine if sfRNA acts at the gene expression level or rather post-translationally through direct interactions with factors involved in cell survival and/or cell death.

## OPEN QUESTIONS AND CONCLUSION

The tremendous work on flaviviruses during the last decade highlights the complexity of the molecular mechanisms governing the fate and functions of the vRNA. This includes dynamic RNA secondary and tertiary structures, RNA modifications such as 2′-O and N6A methylation, the formation of functional vRNA sub-products (like sfRNA) and the participation of viral proteins as well as host RNA-binding proteins. This intricate network is most likely hosted within viral RFs. Future studies will be needed to explore how all these regulated processes are interconnected to generate a precise integrated model of vRNA metabolism. For instance, does vRNA methylation on specific nucleotides impact vRNA tertiary structure formation, cyclization and/or affinity for host RNA-binding proteins, and vice versa? Do these structural changes impact the efficiency of vRNA packaging into assembling virions? Importantly, when compared with HCV RFs, little is known about how flaviviruses regulate the morphogenesis of VPs that are very homogeneous in size and shape. The same applies to the formation and maintenance of the VP pore that is believed to play a pivotal role in the transfer of vRNA from replication complexes to assembling particles. How is it functionally coordinated with budding viruses? Is this a dynamic structure oscillating between open and closed states? What is its viral and cellular protein composition? Finally, these considerations should ideally always take into account that flaviviruses infect both insects and mammals. Indeed, subtle differences between hosts in the life cycle (especially with regard to host factor dependency) may be observed and would be of great interest.

Overall, this review highlights how flaviviruses have evolved to confer upon a single RNA species and one viral polyprotein product all the information required for optimal infection in both insect and mammalian hosts. More generally, all of these open questions regarding the vRNA perfectly illustrate the importance of flaviviruses as an exquisite model to study spatio-temporal control of RNA metabolism. Finally, a precise understanding of the dynamic control of vRNA in the flavivirus life cycle will hopefully identify potential therapeutic targets for the development of antivirals, ideally with a broad pan-flaviviral spectrum.

#### AUTHOR CONTRIBUTIONS


CM and WF contributed equally to this work. CM and WF wrote the manuscript and made the figures. LC-C edited the final version of the manuscript.

#### FUNDING

LC-C is receiving a research scholar (Junior 2) salary support from Fonds de la Recherche du Québec-Santé (FRQS). LC-C's research is supported by grants from the Natural Sciences and Engineering Research Council of Canada (NSERC; RGPIN-2016-05584), the Canadian Institutes of Health Research (CIHR; PJT153020; ICS154142), Fonds de la Recherche du Québec-Nature et Technologies (FRQNT; 2018-NC-205593), Armand-Frappier Foundation, and Institut National de la Recherche Scientifique (INRS).

#### ACKNOWLEDGMENTS

We are grateful to Dr. Selena M. Sagan (McGill University, Canada) and Dr. Karine Boulay (Université de Montréal, Canada) for critical reading of the manuscript and helpful comments. We apologize to colleagues whose work could not be mentioned or referenced in this review due to space limitations.



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Mazeaud, Freppel and Chatel-Chaix. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Translational Control in Stem Cells

Soroush Tahmasebi<sup>1</sup>\*, Mehdi Amiri<sup>2,3</sup> and Nahum Sonenberg<sup>2,3</sup>\*

Keywords: translational control, stem cell, protein synthesis, development, mRNA

<sup>1</sup> Department of Pharmacology, University of Illinois at Chicago, Chicago, IL, United States, <sup>2</sup> Goodman Cancer Research Center, McGill University, Montreal, QC, Canada, <sup>3</sup> Department of Biochemistry, McGill University, Montreal, QC, Canada

Simultaneous measurements of mRNA and protein abundance and turnover in mammalian cells have revealed that a significant portion of the cellular proteome is controlled by mRNA translation. Recent studies have demonstrated that both embryonic and somatic stem cells depend on low translation rates to maintain an undifferentiated state. Conversely, differentiation requires increased protein synthesis, and failure to increase it prevents differentiation. Notably, the low translation in stem cell populations is independent of the cell cycle, indicating that stem cells use unique strategies to decouple these fundamental cellular processes. In this chapter, we discuss different mechanisms used by stem cells to control translation, as well as the developmental consequences of translational deregulation.

Edited by: Akio Kanai, Keio University, Japan

#### Reviewed by:

Thomas Preiss, Australian National University, Australia Toshinobu Fujiwara, Kindai University, Japan

\*Correspondence:

Soroush Tahmasebi sorousht@uic.edu Nahum Sonenberg nahum.sonenberg@mcgill.ca

#### Specialty section:

This article was submitted to RNA, a section of the journal Frontiers in Genetics

Received: 19 September 2018 Accepted: 17 December 2018 Published: 15 January 2019

#### Citation:

Tahmasebi S, Amiri M and Sonenberg N (2019) Translational Control in Stem Cells. Front. Genet. 9:709. doi: 10.3389/fgene.2018.00709

## INTRODUCTION

#### The Importance of Translation Control in Mammalian Cells and Stem Cells

The abundance of proteins in a mammalian cell varies by several orders of magnitude (10<sup>3</sup>–10<sup>8</sup> molecules per cell) (Li et al., 2014; Li and Biggin, 2015). Transcription rate, messenger RNA (mRNA) turnover, translation rate, and protein degradation are four fundamental cellular processes that regulate protein abundance. The poor correlation between protein and mRNA abundance, which is documented in numerous studies, and higher conservation of protein expression compared to mRNA expression across species suggest that post-transcriptional control explains a large percentage of protein variability (Gygi et al., 1999; Maier et al., 2009; Vogel et al., 2010; Schwanhausser et al., 2011; Aviner et al., 2013; Khan et al., 2013; Sharma et al., 2015). Parallel measurements of mRNA and protein levels along with mRNA and protein turnover demonstrated that the translation rate plays a dominant role in regulating the cellular proteome (Schwanhausser et al., 2011). Others reported a much higher correlation between mRNA and protein levels (R<sup>2</sup> ≈ 0.6–0.9) (Li et al., 2014; Jovanovic et al., 2015). Nevertheless, these studies suggest that cell status dictates the contribution of transcriptional versus translational control in defining the proteome of the cell. During steady state or after long-term differentiation/adaptation, transcriptional control is considered the main determinant of the cellular proteome, whereas during early stages of state transition (differentiation/adaptation), translational control plays a dominant role (Lu et al., 2009; Ingolia et al., 2011; Kristensen et al., 2013). Translational control allows cells to quickly respond to internal and external stimuli before a new transcription program comes into effect (Liu et al., 2016).
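To make the interplay of the four processes listed above concrete, the following minimal sketch uses a textbook two-stage model with first-order kinetics, in which the steady-state mRNA level is the transcription rate divided by the mRNA decay rate, and the steady-state protein level is the mRNA level multiplied by the translation rate and divided by the protein decay rate. This is an illustrative assumption, not the analysis performed in the cited studies, and all rate constants, the `GeneRates` class and the `steady_state` function are hypothetical names and values chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class GeneRates:
    """Hypothetical first-order rate constants for one gene (illustrative units: per hour)."""
    transcription: float   # mRNA molecules synthesized per hour
    mrna_decay: float      # fraction of mRNA degraded per hour
    translation: float     # proteins synthesized per mRNA per hour
    protein_decay: float   # fraction of protein degraded per hour


def steady_state(rates: GeneRates) -> tuple[float, float]:
    """Return (mRNA, protein) copy numbers at steady state of the two-stage model."""
    mrna = rates.transcription / rates.mrna_decay
    protein = mrna * rates.translation / rates.protein_decay
    return mrna, protein


# Two made-up genes with identical transcription and decay rates
# that differ only in their translation rate constant.
low_tl = GeneRates(transcription=2.0, mrna_decay=0.1, translation=10.0, protein_decay=0.02)
high_tl = GeneRates(transcription=2.0, mrna_decay=0.1, translation=1000.0, protein_decay=0.02)

for name, gene in (("low translation", low_tl), ("high translation", high_tl)):
    mrna, protein = steady_state(gene)
    print(f"{name}: {mrna:.0f} mRNAs, {protein:.0f} proteins per cell")
```

In this toy setting, both genes have the same mRNA copy number, yet their protein levels differ by the ratio of their translation rate constants, which is one simple way in which mRNA and protein abundance can become poorly correlated.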

Notably, among the different proteins in the cell, the levels of transcription factors and of proteins performing essential cellular processes (e.g., ribosomal and mitochondrial proteins) are more stringently subjected to translational control (Lee et al., 2013; Jovanovic et al., 2015). This exquisite dependency on translational control, also referred to as "translation on demand," has been well documented during early developmental stages, a period when transcription is known to be silenced. For instance, selective translational upregulation of a few transcription factors (e.g., Nanog, Sox19b, and Pou5f1) is essential for activation of the zygotic genome and the maternal-to-zygotic transition (MZT) in zebrafish (Lee et al., 2013). The regulatory information encoded in the sequences of the 5′ and 3′ untranslated regions (UTRs) of mRNAs plays a critical role in rendering a subset of mRNAs sensitive to translational control (Hinnebusch et al., 2016). Ribosome footprinting analysis underscored the importance of upstream open reading frames (uORFs) in translational control of several key pluripotency factors, such as Myc and Nanog (Ingolia et al., 2011). In addition to the importance of translational control in defining the cellular proteome, translational control also impacts transcription. A recent study uncovered a delicate fine-tuning between translation and transcription in embryonic stem cells (ESCs) and peri-implantation embryos. An acute inhibition of global translation (using cycloheximide or mTOR inhibitors) disrupts the hypertranscription and euchromatic state of ESCs (Bulut-Karslioglu et al., 2018). This finding highlights the importance of coordination between transcription and translation for maintenance of self-renewal and pluripotency.
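As an illustration of what a uORF in a 5′UTR looks like at the sequence level, the sketch below scans a UTR for AUG start codons that are followed by an in-frame stop codon within the UTR itself. It is a deliberately simplified assumption of how uORFs can be annotated: it ignores non-AUG starts, start codon context and uORFs that overlap the main coding sequence, and it is not the ribosome footprinting approach used in the cited study. The function name `find_uorfs` and the example sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_uorfs(utr5: str, min_codons: int = 2) -> list[tuple[int, int]]:
    """Return 0-based (start, end) slices of AUG...stop ORFs fully contained in a 5'UTR.

    `min_codons` counts the codons before the stop codon, including the AUG.
    """
    seq = utr5.upper().replace("U", "T")  # accept RNA or DNA alphabets
    uorfs = []
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        # Walk codon by codon in the frame set by this ATG until a stop is found.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                if (pos - start) // 3 >= min_codons:
                    uorfs.append((start, pos + 3))
                break
    return uorfs


# Made-up 5'UTR containing two short uORFs.
example_utr = "GGCATGGCTTGACCATGAAATAGCC"
for start, end in find_uorfs(example_utr):
    print(start, end, example_utr[start:end])
```

Each reported slice begins with the start codon and ends with the stop codon, so the uORF sequence can be recovered directly from the UTR string.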

### Initiation, the Rate Limiting Step of Translation

mRNA translation is divided into four steps: initiation, elongation, termination, and ribosome recycling. Initiation is the process through which the small subunit of the ribosome (40S), as a component of the 43S preinitiation complex, is recruited to the mRNA and scans the mRNA 5′UTR from 5′ to 3′ to recognize the start codon. Following recognition, the 80S initiation complex is assembled at the start codon and elongation proceeds (Sonenberg and Hinnebusch, 2009; Hinnebusch et al., 2016).

Eukaryotic ribosomes (consisting of 4 ribosomal RNAs and 80 ribosomal proteins) are not fully equipped to bind mRNAs directly and hence to start translation. The activities of multiple eukaryotic translation initiation factors (eIFs) are therefore required for recruitment of ribosomes to mRNAs and translation initiation. The orchestrated activity of eIFs culminates in the assembly of two multisubunit complexes, the 43S preinitiation complex (consisting of the small ribosomal subunit, initiator tRNA, and eIF1, 1A, 2, and 3) and the eIF4F complex (consisting of eIF4E, eIF4A, and eIF4G), at the 5′ end of the mRNA. In eukaryotic cells, the abundance of a key component of the eIF4F complex, the cap-binding protein eukaryotic translation initiation factor 4E (eIF4E), is far lower than that of ribosomes [41 × 10<sup>4</sup> molecules of eIF4E compared to 1064 × 10<sup>4</sup> cytosolic ribosomes per HeLa cell (Merrick and Pavitt, 2018)], which makes eIF4E availability the limiting factor for translation initiation. The activity of eIF2B has also been identified as rate-limiting for translation initiation. eIF2B is a guanine nucleotide-exchange factor (GEF) that converts eIF2·GDP to eIF2·GTP, a critical step required for the formation of the 43S preinitiation complex. Consequently, most mammalian cells, including stem cells, have a surplus of non-translating ribosomes, which could be engaged in translation through the control of the activity of eIFs. Several signaling pathways, such as the mechanistic Target of Rapamycin (mTOR), the mitogen-activated protein kinase (MAPK), and the integrated stress response (ISR), control translation through phosphorylation of activators (e.g., eIF4E and eIF2α) or inhibitors [e.g., 4E-BPs (eIF4E-binding proteins; inhibitors of eIF4E), PDCD4 (Programmed Cell Death 4; an inhibitor of eIF4A)] of translation initiation. This provides a tunable translation regulatory system that adjusts the translation rate according to cellular demands.
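The stoichiometric argument above can be checked with a short calculation using the two copy numbers quoted in the text. The snippet is only a back-of-the-envelope sketch: it assumes a simple one-to-one comparison of molecule counts and ignores factor recycling and the fraction of eIF4E sequestered by 4E-BPs; the variable names are hypothetical.

```python
# Copy numbers per HeLa cell as quoted in the text (Merrick and Pavitt, 2018).
EIF4E_PER_CELL = 41e4
RIBOSOMES_PER_CELL = 1064e4

# How many cytosolic ribosomes exist for each eIF4E molecule, and vice versa.
ribosomes_per_eif4e = RIBOSOMES_PER_CELL / EIF4E_PER_CELL
eif4e_per_ribosome = EIF4E_PER_CELL / RIBOSOMES_PER_CELL

print(f"Ribosomes per eIF4E molecule: {ribosomes_per_eif4e:.1f}")
print(f"eIF4E molecules per ribosome: {eif4e_per_ribosome:.3f}")
```

With these numbers there are roughly 26 ribosomes for every eIF4E molecule, which is consistent with the description of eIF4E as a limiting component of initiation.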

### GLOBAL TRANSLATION IS INHIBITED IN STEM CELL POPULATIONS

Studies in both embryonic and adult stem cells demonstrated that stem cells require low translation rates to maintain an undifferentiated status (**Figure 1**; Sampath et al., 2008; Signer et al., 2014; Blanco et al., 2016; Zismanov et al., 2016). Sampath et al. (2008) first found that global translation is low in undifferentiated ESCs compared to embryoid bodies (EBs). Differentiation [5 days of culture in the absence of leukemia inhibitory factor (LIF)] increases polysome density in the differentiating cells by ∼60% and [<sup>35</sup>S]methionine incorporation by ∼2-fold as compared to undifferentiated ESCs. The increase in translation in differentiated cells coincides with a significant increase in the content of total RNA (∼50%), ribosomal RNA (∼20%), and proteins (∼30%).

Similar to ESCs, global translation is suppressed in somatic stem cells. Studies on various tissue-specific stem cells such as hematopoietic stem cells (HSCs), hair follicle stem cells (HFSCs), and muscle stem cells (satellite cells) demonstrated that protein synthesis is restricted in stem cell populations and is increased upon differentiation (Signer et al., 2014; Blanco et al., 2016; Zismanov et al., 2016).

Tight control of translation is crucial for the maintenance of HSCs, as only a 30% decrease (using Rpl24<sup>Bst/+</sup> mice, where ribosomal protein Rpl24 is partially depleted) or increase (using Mx1-Cre; Pten<sup>fl/fl</sup> mice, where Pten is depleted from adult hematopoietic cells) in protein synthesis is sufficient to impair the proliferation and self-renewal of HSCs (Signer et al., 2014). The rate of protein synthesis also impacts the normal hair cycle through regulation of the self-renewal and differentiation of HFSCs (Blanco et al., 2016). Activation of HFSCs during hair growth (transition from telogen to anagen) coincides with a profound increase in protein synthesis (**Figure 1D**). Committed progenitor cells located at the inner root sheath (IRS) display the highest translation rate compared to other progenitors. The importance of translation control in regulating HFSCs has been highlighted in the NOP2/Sun RNA Methyltransferase Family Member 2 (NSUN2) knockout (KO) mouse (Blanco et al., 2016). NSUN2 is an RNA methyltransferase that converts cytosine to 5-methylcytosine (m<sup>5</sup>C), and is required for the decoding activity and stability of tRNAs. Hypomethylated tRNAs that accumulate in NSUN2 KO cells are cleaved by an endonuclease, and the resulting tRNA fragments inhibit translation initiation (Spriggs et al., 2010; Ivanov et al., 2011; Sobala and Hutvagner, 2013). NSUN2 is highly expressed in committed progenitor cells of the epidermis. The inhibition of translation in NSUN2 KO cells blocks the differentiation of epidermal stem cells toward committed progenitors, which leads to cyclic alopecia in the mouse (Blanco et al., 2016).

Lack of Pseudouridylate Synthase 7 (PUS7) has the opposite effect to that of NSUN2 deficiency. PUS7 is a member of the pseudouridine synthases (PUSs) that catalyzes the pseudouridylation (Ψ) of a subset of tRNAs at U8 (uridine at position 8 of tRNA). Pseudouridylation of a group of tRNA-derived small fragments inhibits translation initiation, and consequently, the absence of PUS7 promotes global translation. Interestingly, a recent study uncovered the importance of PUS7 activity in the maintenance and differentiation of ESCs and HSCs (Guzzi et al., 2018).

Translational control also plays a central role in differentiation of adult stem cells in the testis. This has been well documented in several translationally defective mouse models, including the NSUN2 KO mouse. In addition to the epidermis, NSUN2 is highly expressed in the testis and plays a critical role in germ cell differentiation. Consequently, NSUN2 KO males not only have a defect in hair growth but also display infertility. During the late stages of spermatogenesis, translation activation of germ cell-specific mRNAs is required for successful generation of spermatozoa (**Figure 1G**). Inhibition of global translation due to NSUN2 depletion halts the progression of germ cells through the late stages of spermatogenesis, engendering infertility. Interestingly, a similar phenotype (male infertility and a defect in late spermatogenesis) has also been reported in Paip2a {Pabp [poly(A)-binding protein]-interacting protein 2A} KO mice, where global translation is inhibited (Yanagiya et al., 2010). Three Pabp-interacting proteins (Paips) have been discovered in mammals [Paip1 (Craig et al., 1998), Paip2a (Khaleghpour et al., 2001), and Paip2b (Berlanga et al., 2006)]. This family of proteins regulates mRNA translation and stability through the control of PABP function. Lack of PAIP2s has been linked to translation activation, as their binding to PABP competes with the interaction of PABP with the poly(A) tail and eIF4G (Khaleghpour et al., 2001; Karim et al., 2006). During late spermatogenesis, translational derepression of a subset of mRNAs, such as Prms (protamines) and Tps (transition proteins), is essential for the generation of functional spermatozoa. This translational derepression coincides with shortening of poly(A) tails, from approximately 180 nucleotides in a translationally repressed state to 30 nucleotides in a translationally active state (Kleene, 1989). Conversely, lack of PAIPs during spermatogenesis inhibits translation of Prms and Tps. This effect has been explained by excess expression of Pabpc1 (an isoform of Pabp that is expressed in elongating spermatids) (Yanagiya et al., 2010). Altogether, these findings demonstrate that translational control is a key modulator of stem cell differentiation.

#### HOW DO STEM CELLS MAINTAIN A LOW TRANSLATION RATE?

#### Ribosome Biogenesis

Under physiological conditions, ribosome abundance is not considered a limiting factor for translation initiation in stem cells (**Figure 2**). However, studies in Drosophila and mammals suggest that differentiation of stem cells relies on increased ribosomal biogenesis (Ingolia et al., 2011; Zhang et al., 2014; Sanchez et al., 2016). Sampath et al. (2008) found that ribosomal RNAs are elevated by ∼20% in 5-day EBs as compared to mESCs. Using ribosome footprinting, Ingolia et al. (2011) identified a modest increase in translation of ribosomal protein (RP) mRNAs at early stages of differentiation (36 h after LIF withdrawal), whereas translation of RPs was strongly suppressed at later time points (8-day EBs). They concluded that the increase in expression of RPs at early stages of differentiation is required for the profound increase in global translation observed at later stages and is mediated by mTORC1 activation.

Single-cell sequencing of neural stem cells (NSCs) demonstrated that in response to injury, there is a dramatic increase in transcription of the genes involved in ribosome biogenesis (Llorens-Bobadilla et al., 2015). The increase in ribosome biogenesis triggers a global increase in protein synthesis, which is required for the activation and differentiation of NSCs. The study of ribosomopathies also highlights the importance of the ribosome in differentiation. Ribosomopathies are a group of inherited human diseases that are caused by mutations in the small or large ribosomal subunits or factors involved in ribosome biogenesis (Tahmasebi et al., 2018a). While ribosomes can be found in almost all mammalian cells, it is surprising that defects in ribosomal function preferentially affect only specific cell types, most prominently erythroid progenitors. Several hypotheses have been proposed to explain the cell type and tissue specificity associated with ribosomopathies. One model suggests that ribosomes are heterogeneous and each cell type possesses its own unique set of ribosomes, which are specialized in translating cell type-specific mRNAs (specialized ribosome model) (Xue and Barna, 2012; Shi et al., 2017). An alternative model suggests that ribosomes are homogeneous, but different mRNAs or cell types have different sensitivity to ribosomal defects (ribosome concentration model) (Mills and Green, 2017). For instance, studies in Diamond-Blackfan anemia (DBA) demonstrated that mutations in 60S or 40S ribosomal proteins [such as RPL5, RPL11, RPS7, RPS10 among others (Tahmasebi et al., 2018a)] decrease the ribosome levels but leave the composition of the ribosomes intact. This renders ribosome availability a limiting factor for translation of a subset of mRNAs, such as GATA1, that play a critical role in differentiation of HSCs (Khajuria et al., 2018). In support of this model, mutations in other factors that impair ribosome biogenesis have been linked to depletion of HSCs (Le Bouteiller et al., 2013).

#### mTORC1/4E-BPs

The importance of the mTORC1/4E-BPs pathway in self-renewal and differentiation of stem cells is well documented in ESCs, HSCs, and NSCs (Sampath et al., 2008; Hartman et al., 2013; Signer et al., 2014, 2016; Tahmasebi et al., 2016). ESCs have the remarkable ability to maintain low mTORC1 activity in the presence of LIF (an activator of the PI3K-Akt pathway) and a high content of amino acids and serum (15% FBS) in the medium (Sampath et al., 2008; Tahmasebi et al., 2014, 2016). Combining polysome profiling with microarray analysis, Sampath et al. (2008) discovered a hierarchical translation control network downstream of the mTORC1/4E-BP pathway that regulates expression of pro-differentiation mRNAs. mTOR activity and phosphorylation of 4E-BP1 increase in response to ESC differentiation. The importance of 4E-BPs in the regulation of self-renewal and differentiation of ESCs has also been examined in ESCs lacking 4E-BP1 and 2 (the two 4E-BP isoforms that are highly expressed in ESCs). 4E-BP1/2 DKO ESCs proliferate slower than WT cells and are prone to differentiation, partly through increased translation of YY2 mRNA (Tahmasebi et al., 2016). In addition, the mTORC1/4E-BP pathway plays a critical role in the generation of induced pluripotent stem cells (iPSCs) (Chen et al., 2011; He et al., 2012; Tahmasebi et al., 2014). Interestingly, more recent evidence indicates that ESCs are largely tolerant to mTOR inhibition. Inhibition of the mTOR pathway by mTOR inhibitors (INK128 and RapaLink-1) engenders a reversible paused state in ESCs and blastocysts. Paused ESCs are translationally and transcriptionally silent, but remain pluripotent, mimicking a diapause state of blastocysts in vivo (Bulut-Karslioglu et al., 2016).

There is increasing evidence that the mTOR/4E-BP pathway also contributes to translation inhibition and maintenance of adult stem cells such as HSCs and NSCs. Phosphorylation of 4E-BP1 is reduced in HSC/MPPs (multipotent progenitors) compared to most progenitor cells, and abrogation of 4E-BP1 and 2 specifically increases protein synthesis in HSCs, while having a negative effect on their ability for reconstitution (Signer et al., 2016). The subventricular zone (SVZ) in the fetal and adult brain of mammals harbors a small population of cells with stem cell properties (self-renewal and multipotency), known as NSCs (**Figure 1E**; Gage, 2000). Hartman et al. (2013) demonstrated that mTORC1 is suppressed in quiescent NSCs located at the SVZ. The activity of mTORC1 is increased (judged by phosphorylation of 4E-BP1/2 and ribosomal protein S6) in proliferating NSC progenitors undergoing differentiation. Genetic (shRNA against Rheb or Raptor) or pharmacological (rapamycin) inhibition of mTORC1 blocks differentiation of NSCs to intermediate progenitors, resulting in lower neuron production. Hyperactivation of mTORC1 mediated by a constitutively active Rheb (Rheb<sup>CA</sup>) induces differentiation of NSCs and reduces the population of self-renewing NSCs, specifically through inhibition of 4E-BPs (Hartman et al., 2013).

#### ISR Pathway

Recent studies highlighted the importance of the ISR pathway in translational control of stem cells. The ISR pathway activation is triggered by a family of four kinases that control translation initiation through phosphorylation of eIF2α. The eIF2α kinases encompass HRI (heme-regulated inhibitor; also known as eIF2AK1), PKR (protein kinase RNA-activated; also known as eIF2AK2), PERK [PKR-like endoplasmic reticulum (ER) kinase; also known as eIF2AK3], and GCN2 (general control nonderepressible 2; also known as eIF2AK4). All four kinases share a conserved kinase domain but each has evolved unique regulatory domains that sense and respond only to a distinct set of stressors. While p-eIF2α decreases global translation, it has a stimulatory effect on the translation of select mRNAs containing uORFs within their 5′UTRs, such as Atf4, Chop, and BiP. By suppressing global translation but increasing translation of stress-induced mRNAs, cells can overcome the stress condition. The significance of HRI and PERK in erythropoiesis and differentiation of pancreatic beta cells, respectively, has been uncovered using transgenic animal models (Han et al., 2001; Harding et al., 2001; Zhang et al., 2006). The discovery of PERK mutations in Wolcott-Rallison syndrome (WRS), a multi-systemic disease with early-onset diabetes mellitus, further supports the findings in animal models (Delepine et al., 2000). Additionally, genome-wide translational profiling underscores the importance of eIF2 phosphorylation in erythroid homeostasis (Paolini et al., 2018). Increasing evidence has emerged in recent years that demonstrates the importance of the ISR pathway in stem cells. Zismanov et al. (2016) used the eIF2α<sup>S51A/S51A</sup> mouse model (where phosphorylation of eIF2α has been blocked by mutation of serine 51 to alanine) to highlight the significance of eIF2α phosphorylation in muscle stem cells.
Muscle stem cells, also known as satellite cells, are a small population of cells located between the sarcolemma and the basal lamina of muscle fibers (**Figure 1F**), and play a critical role in growth and regeneration of muscles. In quiescent satellite cells, the level of p-eIF2α is high but it quickly decreases once the cells differentiate and start to activate the myogenic program. The high level of p-eIF2α in the quiescent satellite cells has been linked to relatively high activity of PERK in these cells. Zismanov et al. (2016) further showed that in addition to the well-characterized p-eIF2α targets (e.g., Atf4 and Chop), translation of numerous stem cell-related mRNAs such as the deubiquitinating enzyme Usp9x (Ivanova et al., 2002; Ramalho-Santos et al., 2002) relies on p-eIF2α. Importantly, a chemical-mediated increase in p-eIF2α (using sal003, a compound that inhibits the eIF2α phosphatase Gadd34/PP1) promotes self-renewal and regenerative capacity of cultured satellite cells, indicating that modulation of p-eIF2α can be used as a strategy to improve stem cell transplantation. The mTORC1 pathway also regulates the activity of satellite cells and is required for their transition from the G<sub>0</sub> quiescent state into the G<sub>Alert</sub> phase (an "alerting" state of quiescent stem cells that allows them to immediately enter the cell cycle and respond to injury or stress) (Rodgers et al., 2014). Thus, in addition to p-eIF2α, it is highly likely that the activity of 4E-BPs contributes to translation inhibition in satellite cells.

TABLE 1 | Lethal phenotypes resulting from change in activity or lack of eIFs in mouse.

Studies in other stem cell populations also uncovered the importance of p-eIF2α in self-renewal and differentiation. Undifferentiated ESCs have a high level of p-eIF2α, while differentiation decreases p-eIF2α levels (Friend et al., 2015). p-eIF2α promotes translation of stem cell factors, such as Nanog and Myc, which contain uORFs in their 5′UTRs. A study in human HSCs demonstrated that PERK and PERK-dependent genes (Atf4, Chop, and Gadd34) are enriched in HSPCs (HSCs and progenitor cells; CD34+CD38−) as compared to more differentiated progenitors (CD34+CD38+) (van Galen et al., 2014). Accordingly, HSCs display a higher sensitivity (increased apoptosis and reduced clonogenic capacity) to ER stress compared to progenitors. Overexpression of ERDJ4 (a member of the J protein family that fosters protein folding in the ER) in HSCs decreases ER stress and promotes in vivo transplantation (van Galen et al., 2014).

#### Other Translation Factors

It is very likely that additional translational factors or regulators contribute to translational control in stem cells. For instance, recent data support the importance of m<sup>6</sup>A RNA modification in differentiation of ESCs (Batista et al., 2014). Despite the long list of biochemically characterized eIFs, only a few studies have examined the role of eIFs in stem cells. Lack of eIFs in mice is often embryonic or perinatal lethal and has detrimental effects on stem cells and normal development (**Table 1**).

#### DAP5/p97/NAT1, eIF4G2

Nat1 (also known as DAP5 and eIF4G2) is an eIF4G homolog that interacts with eIF4A, eIF3, and MNK. However, in contrast to eIF4G, p97/DAP5/Nat1 does not bind to eIF4E and therefore has been proposed to be involved in cap-independent translation (Henis-Korenblit et al., 2002; Liberman et al., 2015). Nat1 KO mice are embryonic lethal and display defects in the gastrulation step (Yamanaka et al., 2000). Proliferation and global translation are similar between Nat1 null ESCs and their WT counterparts. However, Nat1 null cells are resistant to differentiation in both mouse and human (Yamanaka et al., 2000; Yoffe et al., 2016). Ribosome footprinting analysis of Nat1 KO ESCs demonstrated that lack of Nat1 decreases translation of differentiation-promoting factors such as Map3k3 and Sos1 (Sugiyama et al., 2017).

#### TRANSLATION INHIBITION IN STEM CELLS IS CELL CYCLE INDEPENDENT

Studies on embryonic and adult stem cells demonstrated that translation inhibition is independent of replication rate in these cells. Mouse ESCs exhibit a fast replication rate (dividing every 8–10 h as compared to >16 h for differentiated cells), and have a unique cell cycle control (Singh and Dalton, 2009), as they progress through a very short G1 phase (15%), while residing mostly in S phase (65%). Human ESCs maintain a cell cycle structure similar to that of mouse ESCs; however, they replicate much more slowly (dividing every 30–38 h) (Singh and Dalton, 2009). Adult stem cells are slow-growing cells that spend most of their time in a dormant state (G0/G1) and only divide in response to physiological or pathological stimuli. The low translation rate of HSCs is not just a consequence of their dormant state: when protein synthesis was compared using cell cycle-matched populations (S/G2/M or G0/G1), HSCs exhibited a lower translation rate than differentiated progenitors (Signer et al., 2014). A study in HFSCs also demonstrates that the rate of protein synthesis is independent of cell cycle and proliferation (Blanco et al., 2016). How stem cells decouple translation rate from cell cycle control has yet to be understood, and remains one of many intriguing questions in the stem cell field.

#### CONCLUDING REMARKS

fgene-09-00709 January 12, 2019 Time: 17:5 # 7

It has been more than five decades since the importance of translation control in early developmental processes was delineated through the study of the fertilization of sea urchin eggs (Hultin, 1961; Monroy et al., 1961; Tahmasebi et al., 2018b). However, the role of translational control in differentiation and maintenance of stem cells has been explored only recently. Technological advances in the studies of translation, combined with novel genetic approaches, are beginning to provide the essential tools required for understanding this critical step of gene expression in stem cell plasticity.

### AUTHOR CONTRIBUTIONS

All authors contributed in conceptualizing and writing the review.

#### REFERENCES

development. Proc. Natl. Acad. Sci. U.S.A. 106, 1832–1837. doi: 10.1073/pnas.0809632106


ribosome synthesis is required for the maintenance of adult hematopoietic stem cells. J. Exp. Med. 210, 2351–2369. doi: 10.1084/jem.20122019


the retinoic acid pathway. Embo J. 19, 5533–5541. doi: 10.1093/emboj/19.20.5533


is required for postnatal glucose homeostasis. Cell Metab. 4, 491–497. doi: 10.1016/j.cmet.2006.11.002

Zismanov, V., Chichkov, V., Colangelo, V., Jamet, S., Wang, S., Syme, A., et al. (2016). Phosphorylation of eIF2alpha is a translational control mechanism regulating muscle stem cell quiescence and self-renewal. Cell Stem Cell 18, 79–90. doi: 10.1016/j.stem.2015.09.020

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Tahmasebi, Amiri and Sonenberg. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# RNA Dysregulation in Amyotrophic Lateral Sclerosis

#### Zoe Butti and Shunmoogum A. Patten\*

INRS-Institut Armand-Frappier, National Institute of Scientific Research, Laval, QC, Canada

Amyotrophic lateral sclerosis (ALS) is the most common adult-onset motor neuron disease and is characterized by the degeneration of upper and lower motor neurons. It has become increasingly clear that RNA dysregulation is a key contributor to ALS pathogenesis. The major ALS genes SOD1, TARDBP, FUS, and C9orf72 are involved in aspects of RNA metabolism processes such as mRNA transcription, alternative splicing, RNA transport, mRNA stabilization, and miRNA biogenesis. In this review, we highlight the current understanding of RNA dysregulation in ALS pathogenesis involving these major ALS genes and discuss the potential of therapeutic strategies targeting disease RNAs for treating ALS.

Keywords: ALS (amyotrophic lateral sclerosis), FUS, C9orf72, TDP-43, RNA processing, RNAi (RNA interference), antisense oligonucleotide-drug conjugates

#### Edited by:

Pascal Chartrand, Université de Montréal, Canada

#### Reviewed by:

Jean-Marc Gallo, King's College London, United Kingdom Rita Sattler, Barrow Neurological Institute (BNI), United States

\*Correspondence:

Shunmoogum A. Patten kessen.patten@iaf.inrs.ca

#### Specialty section:

This article was submitted to Genetic Disorders, a section of the journal Frontiers in Genetics

Received: 13 August 2018 Accepted: 20 December 2018 Published: 22 January 2019

#### Citation:

Butti Z and Patten SA (2019) RNA Dysregulation in Amyotrophic Lateral Sclerosis. Front. Genet. 9:712. doi: 10.3389/fgene.2018.00712

## INTRODUCTION

Amyotrophic lateral sclerosis (ALS) is a progressive and fatal neurodegenerative disorder of motor function. It is characterized by the selective degeneration of the lower and upper motor neurons. Symptoms include progressive muscle weakness and paralysis, swallowing difficulties, and breathing impairment due to respiratory muscle weakness, which ultimately causes death, usually within 2–5 years of clinical diagnosis (Kiernan et al., 2011). Though most cases of ALS are sporadic, some families (10%) demonstrate a clinically indistinguishable form of ALS with clear Mendelian inheritance and high penetrance (Pasinelli and Brown, 2006). To date, the treatments available to slow the progression of ALS remain riluzole (Bensimon et al., 1994) and edaravone (Abe et al., 2014), but they are only modestly effective. In the past couple of years, however, potentially efficacious treatments such as Masitinib and Pimozide have been reported to show clinical benefit (Trias et al., 2016; Patten et al., 2017; Petrov et al., 2017). Furthermore, RNA-targeted therapies are currently being intensively evaluated as potential strategies for treating ALS (Schoch and Miller, 2017; Mathis and Le Masson, 2018). There is thus hope that new and potentially more effective treatment options will become available for ALS in the near future.

Mutations in more than 20 genes contribute to the etiology of ALS (Chia et al., 2018) (**Table 1**). Amongst these genes, the major established causal ALS genes are SOD1 (Cu-Zn superoxide dismutase 1), TARDBP (transactive response DNA binding protein 43 kDa), FUS (fused in sarcoma) and a hexanucleotide repeat expansion in chromosome 9 open reading frame 72 (C9orf72). These genetic discoveries have led to the development of animal models (Julien and Kriz, 2006; Kabashi et al., 2010; Patten et al., 2014; Picher-Martel et al., 2016) that permitted the identification of key pathobiological insights. Currently, RNA dysregulation appears to be a major contributor to ALS pathogenesis. Indeed, TDP-43 and FUS are deeply involved in RNA processing steps such as transcription, alternative splicing and microRNA (miRNA) biogenesis (Buratti et al., 2004, 2010; Polymenidou et al., 2012). Mutations in C9orf72 lead to a toxic mRNA gain of function through RNA foci formation, and the subsequent sequestration in stress granules and altered activity of RNA-binding proteins (Barker et al., 2017). In addition to the major ALS genes, other ALS genes including ataxin-2 (ATXN2) (Ostrowski et al., 2017), TATA-box binding protein associated factor 15 (TAF15) (Ibrahim et al., 2013), heterogeneous nuclear ribonucleoprotein A1 (hnRNPA1) (Dreyfuss et al., 1993), heterogeneous nuclear ribonucleoprotein A2 B1 (hnRNPA2 B1) (Alarcon et al., 2015), matrin 3 (MATR3) (Coelho et al., 2015), Ewing's sarcoma breakpoint region 1 (EWSR1) (Duggimpudi et al., 2015), T-cell-restricted intracellular antigen-1 (TIA1) (Forch et al., 2000), senataxin (SETX) and angiogenin (ANG) (Yamasaki et al., 2009) play critical roles in RNA processing (**Table 1**).

In this review, we focus on the four major ALS-associated genes (SOD1, TARDBP, FUS, and C9orf72) and present how they play critical roles in various RNA pathways. We particularly highlight recent developments on the dysregulation of RNA pathways (**Figure 1**) as a major contributor to ALS pathogenesis and discuss the potential of RNA-targeted therapies for ALS.

TABLE 1 | ALS genes and their involvement in RNA processing.

#### TAR DNA BINDING PROTEIN (TDP-43)

A major advance in our understanding of cellular mechanisms in ALS came from the identification of causative mutations in the TARDBP gene (Kabashi et al., 2008; Sreedharan et al., 2008). This gene encodes the evolutionarily conserved RNA/DNA-binding protein TDP-43. TDP-43 is normally nuclear; however, in cases of TARDBP mutations, it is mislocalized to the cytoplasm and forms aggregates (Van Deerlin et al., 2008; Winton et al., 2008b). It is found in the pathological aggregates in motor neurons in the majority of ALS cases (Neumann et al., 2006). It is believed that TDP-43 aggregation leads to a gain of toxicity and that its nuclear depletion results in a loss of TDP-43 function. Indeed, several studies have demonstrated that either overexpression or knockdown of TDP-43 causes neurodegeneration and ALS phenotypes (Kabashi et al., 2010; Stallings et al., 2010; Iguchi et al., 2013; Yang et al., 2014). For instance, the expression of mutant TDP-43<sup>A315T</sup> in C. elegans GABAergic motor neurons results in age-dependent motility defects and neurodegeneration (Vaccaro et al., 2012). In Drosophila, overexpression of TDP-43 in motor neurons was found to cause cytoplasmic accumulation of TDP-43 aggregates, neuromuscular junction (NMJ) morphological defects and cell death (Li et al., 2010). Similarly, the loss of TDP-43 reduced locomotion and lifespan (Feiguin et al., 2009; Diaper et al., 2013). The implications of TDP-43 loss and toxic gain of function in impaired motility, neurodegeneration and survival were further confirmed in higher model systems such as zebrafish (Kabashi et al., 2010) and mice (Wegorzewska et al., 2009; Iguchi et al., 2013). Altogether, these reports strongly suggest that alterations in the level of TDP-43 are detrimental to neuronal function and survival.

TDP-43 contains two RNA recognition motifs (RRM1-2), a glycine-rich domain in the C-terminus, and nuclear localization and export signals (NLS and NES) (Buratti and Baralle, 2001; Winton et al., 2008a). TDP-43 plays a major role in multiple steps of RNA processing such as splicing, RNA stability and mRNA transport (Buratti and Baralle, 2008). For instance, TDP-43 has been shown to bind to mRNA and regulate the expression of other proteins implicated in ALS and other neurodegenerative diseases such as FUS, Tau, ATXN2 and progranulin (Polymenidou et al., 2011; Sephton et al., 2011; Tollervey et al., 2011). This suggests that TDP-43 may be a central component in the pathogenesis of several neurodegenerative conditions (Polymenidou et al., 2011). By RNA-seq analysis, Polymenidou et al. (2011) reported that TDP-43 is required for regulating the expression of 239 mRNAs, many of which encode synaptic proteins. Several independent studies have corroborated that TDP-43 plays an important role in regulating genes involved in synaptic formation and function and in the regulation of neurotransmitter processes (Godena et al., 2011; Sephton et al., 2011; Colombrita et al., 2012; Narayanan et al., 2013; Chang et al., 2014). Examples of such genes are neurexin (NRXN1-3) (Polymenidou et al., 2011), neuroligin (NLGN1-2) (Polymenidou et al., 2011), scaffolding protein Homer2 (Sephton et al., 2011), microtubule-associated protein 1B (MAP1B) (Coyne et al., 2014), GABA receptor subunits (GABRA2, GABRA3) (Narayanan et al., 2013), AMPA receptor subunits (GRIA3, GRIA4) (Sephton et al., 2011; Narayanan et al., 2013), syntaxin 1B (Narayanan et al., 2013), and calcium channel cacophony (Chang et al., 2014). The development of TDP-43 animal models has offered the opportunity to explore synaptic alterations in ALS (Feiguin et al., 2009; Armstrong and Drapeau, 2013; Handley et al., 2017), and continuous efforts are being made to identify compounds that can facilitate synaptic transmission in ALS (Patten et al., 2017). Armstrong and Drapeau (2013) reported that expression of mutant TARDBP<sup>G348C</sup> mRNA in zebrafish resulted in impaired synaptic transmission, reduced frequency of miniature endplate currents (mEPCs) and reduced quantal transmission. Remarkably, they also demonstrated that all these features of synaptic dysfunction in their zebrafish TARDBP mutant were stabilized by chronic treatment with L-type calcium channel agonists (Armstrong and Drapeau, 2013). In Drosophila neurons, TDP-43 depletion was shown to reduce dendritic branching as well as synaptic formation (Feiguin et al., 2009; Lu Y. et al., 2009). Overexpression or knockdown of TDP-43 in cultured mammalian neurons also led to reduced dendritic branching (Herzog et al., 2017). In TDP-43<sup>A315T</sup> mice, Handley et al. (2017) showed that expression of mutant TDP-43 alters dendritic spine development, spine morphology and neuronal synaptic transmission. Collectively, these independent studies in several model systems suggest that TDP-43 may play an important role in neuronal morphology, synaptic transmission and neuronal plasticity, likely via regulation of RNA processing of various synaptic genes (Godena et al., 2011; Sephton et al., 2011; Colombrita et al., 2012; Narayanan et al., 2013; Chang et al., 2014).

TDP-43 is also known to act as a splicing regulator that reduces its own expression level by binding to the 3′ UTR of its own pre-mRNA (Ayala et al., 2011). Additionally, it functions as a splicing factor whose depletion or overexpression can affect the alternative splicing of specific targets (Polymenidou et al., 2011; Tollervey et al., 2011). Indeed, the alternative splicing of several genes was reported to be altered in human CNS tissues from TDP-43 ALS cases (Shiga et al., 2012; Yang et al., 2014). For instance, the level of the polymerase delta interacting protein 3 (POLDIP3) variant-2 mRNA (lacking exon 3) was significantly increased in the CNS of patients with ALS, while that of variant-1 mRNA remained unchanged (Shiga et al., 2012). This was consistent with findings that TDP-43 directly regulates the inclusion of exon 3 of POLDIP3 and that depletion of TDP-43 in cell culture models increased variant-2 mRNA (Shiga et al., 2012). TDP-43 has also been shown to regulate splicing of the cystic fibrosis transmembrane regulator (CFTR) gene and to control exon skipping within the pre-mRNA (Buratti et al., 2004). Importantly, it controls the alternative splicing of apolipoprotein AII (APOAII) (Mercado et al., 2005) and survival of motor neuron (SMN) transcripts (Bose et al., 2008). Specifically, TDP-43 was shown to enhance the inclusion of exon 7 during the maturation of human SMN2 pre-mRNA, which results in an increase in the full-length SMN2 mRNA level in neurons (Bose et al., 2008). Furthermore, TDP-43 was recently shown to bind to HNRNPA1 pre-mRNA to modulate its alternative splicing (Deshaies et al., 2018). TDP-43 depletion resulted in exon 7B inclusion, culminating in a longer hnRNP A1B isoform that is aggregation-prone and cytotoxic (Deshaies et al., 2018). Collectively, these studies demonstrated that loss of TDP-43 results in alterations in the alternative splicing of many genes, some of which, for example HNRNPA1, can contribute to cellular vulnerability. It would be interesting to further investigate the contribution of altered splicing of these genes (POLDIP3, CFTR, APOAII, SMN2, HNRNPA1) to the pathogenesis of ALS.

TDP-43 is actively transported along axons and co-localizes with other well-known transport RNA-binding proteins close to synaptic terminals (Wang I.F. et al., 2008; Narayanan et al., 2013). It was reported that TDP-43 mutations impair mRNA transport function in vivo and in vitro (Alami et al., 2014). In addition to its role in mRNA transport, TDP-43 also acts as a regulator of mRNA stability (Strong et al., 2007; Fiesel and Kahle, 2011). It was shown to directly interact with the 3′ UTR of neurofilament light chain (NFL) mRNA to stabilize it (Strong et al., 2007) and to associate with futsch/MAP1B mRNA in Drosophila to regulate its localization and translation (Coyne et al., 2014). In particular, TDP-43 was found to interact with 14-3-3 protein subunits to modulate the stability of the NFL mRNA (Volkening et al., 2009). Abnormal regulation of NFL mRNA has been observed in ALS patients (Wong et al., 2000), and disruption of NFL mRNA stoichiometry leads to motor neuron death and symptoms of ALS in animal models (Xu et al., 1993; Julien et al., 1995). It is thus very likely that TDP-43 mutations cause motor neuron degeneration by interfering with the processing of NFL mRNA.

Other important identified targets regulated by TDP-43 at the mRNA level that may play a role in disease are G3BP (McDonald et al., 2011) and TBC1D1 (Stallings et al., 2013). G3BP is an essential component of stress granules, which are cytoplasmic non-membrane organelles that store translationally arrested mRNAs that accumulate during cellular stress (Kedersha and Anderson, 2007). Stress granules consist of polyadenylated mRNAs, translation initiation factors (e.g., eIF3, eIF4E, and eIF4G), small ribosomal subunits and numerous RNA-binding proteins (Protter and Parker, 2016). TDP-43 is recruited to stress granules in cellular models upon exposure to different stressors (Colombrita et al., 2009; Liu-Yesucevitz et al., 2010; Bentmann et al., 2012). Importantly, cytosolic TDP-43 mutants are more efficiently recruited to stress granules upon cellular stress than nuclear wild-type TDP-43 (Liu-Yesucevitz et al., 2010). Prolonged stress is thought to promote sequestration of TDP-43 and its mRNA targets in stress granules, thereby inhibiting translation and potentially contributing to ALS progression (Ramaswami et al., 2013).

### FUSED IN SARCOMA (FUS)

Mutations in FUS are detected in 4–5% of familial ALS patients as well as in sporadic ALS (Kwiatkowski et al., 2009; Vance et al., 2009; Corrado et al., 2010; DeJesus-Hernandez et al., 2010). FUS is an RNA/DNA-binding protein of 526 amino acids, consisting of an RNA-recognition motif, a SYGQ (serine, tyrosine, glycine and glutamine)-rich region, several RGG (arginine, glycine and glycine)-repeat regions, a C2C2 zinc finger motif and a nuclear localization signal (NLS) (Iko et al., 2004). C-terminal ALS FUS mutations disrupt the NLS region and the nuclear import of FUS, resulting in cytoplasmic accumulation (Kwiatkowski et al., 2009; Vance et al., 2009).

Similarly to TDP-43, FUS plays multiple roles in RNA processing by directly binding to RNA. Using CLIP-based methods, several groups have identified thousands of RNA targets bound by FUS in various cell lines (Hoell et al., 2011; Colombrita et al., 2012; Ishigaki et al., 2012) and brain tissues (Lagier-Tourenne et al., 2012; Rogelj et al., 2012). Interestingly, FUS was identified in spliceosomal complexes (Rappsilber et al., 2002; Zhou et al., 2002) and was found to interact with several key splicing factors (such as hnRNP A1 and YB-1) (Rapp et al., 2002; Meissner et al., 2003; Kamelgarn et al., 2016) as well as with the U1 snRNP (Yamazaki et al., 2012; Yu et al., 2015). FUS regulates splicing events for neuronal maintenance and survival (Lagier-Tourenne et al., 2012). Given that FUS plays an essential role in splicing regulation, the consequence of its loss of function in ALS on RNA splicing has been extensively investigated (Lagier-Tourenne et al., 2012; Zhou Y. et al., 2013; Reber et al., 2016). For instance, Reber et al. (2016) showed by mass spectrometric analysis that minor spliceosome components are highly enriched among the FUS-interacting proteins. They further reported that FUS interacts with the minor spliceosome and directly regulates the removal of minor introns (Reber et al., 2016). Moreover, the FUS<sup>P525L</sup> ALS mutation, which destroys the NLS and results in cytoplasmic retention of FUS (Dormann et al., 2010), inhibits splicing of minor introns and causes mislocalization of the minor spliceosome components U11 and U12 snRNA to the cytoplasm (Reber et al., 2016). Loss of function of FUS led to splicing changes in more than 300 genes in mouse brains (Lagier-Tourenne et al., 2012) and, importantly, a vast majority of minor intron-containing mRNAs were altered (Reber et al., 2016). Corroborating the results in mouse brain, many minor intron-containing genes were found to be downregulated in FUS-depleted SH-SY5Y cells (Reber et al., 2016). FUS depletion has been shown to affect minor intron-containing genes that are important for neurogenesis (PPP2R2C), dendritic development (ACTL6B) and action potential transmission in skeletal muscles (SCN8A and SCN4A) (Reber et al., 2016), which may contribute to ALS pathogenesis. FUS has also been shown to regulate alternative splicing of genes related to cytoskeletal organization, axonal growth and guidance, such as the microtubule-associated protein tau (MAPT) (Ishigaki et al., 2012; Orozco et al., 2012; Rogelj et al., 2012), Netrin G1 (NTNG1) (Rogelj et al., 2012), neuronal cell adhesion molecule (NRCAM) (Rogelj et al., 2012; Nakaya et al., 2013) and the actin-binding LIM (ABLIM1) (Nakaya et al., 2013). For example, FUS knockdown has been shown to promote inclusion of exon 10 in the MAPT/tau transcript and to significantly shorten axon length and enlarge growth cones (Orozco et al., 2012). Loss of function of FUS altered MAPT/tau isoform expression and likely disturbed cytoskeletal function, impairing axonal growth and maintenance. Interestingly, axon retraction and denervation are early events in ALS (Boillee et al., 2006; Nijssen et al., 2017). Disruption of cytoskeleton function may thus play an important role in neurodegeneration in ALS.

Besides its functions in splicing, FUS has been proposed to regulate transcription by RNA polymerase II (RNAP2), RNA polymerase III (RNAP3) or cyclin D1 (Wang X. et al., 2008; Tan and Manley, 2010; Brooke et al., 2011; Schwartz et al., 2012; Tan et al., 2012). For instance, transcriptomic analyses showed that knockdown of FUS results in differential expression of several genes (Lagier-Tourenne et al., 2012; Nakaya et al., 2013), including many mRNAs encoding proteins important for neuronal function. Transcriptome changes have also been observed in human motoneurons obtained from FUS mutant induced pluripotent stem cells (IPSCs) (De Santis et al., 2017) and in transgenic FUS knockin mice (Scekic-Zahirovic et al., 2016). Alterations in the expression of several genes involved in pathways related to cell adhesion, apoptosis, synaptogenesis and other neurodegenerative diseases were reported in these FUS models (Fujioka et al., 2013; Scekic-Zahirovic et al., 2016; De Santis et al., 2017). Among these genes, TAF15, which is mutated in some cases of ALS (Couthouis et al., 2011), has been found to be upregulated in several ALS FUS models, including human mutant IPSC-derived motoneurons (De Santis et al., 2017) and FUS knockout and knockin mice (Kino et al., 2015; Scekic-Zahirovic et al., 2016). However, it remains to be determined whether TAF15 upregulation upon FUS loss or toxic gain of function contributes to ALS pathogenesis.

FUS is also incorporated into stress granules under cellular stress conditions (Sama et al., 2013). Sequestration of FUS and its protein partners into these cytoplasmic organelles appears to contribute to ALS pathogenesis (Yasuda et al., 2013). An example of such a protein partner is Pur-alpha, which co-localizes with mutant FUS and becomes trapped in stress granules under stress conditions, as reported in ALS patient cells carrying FUS mutations (Di Salvio et al., 2015; Daigle et al., 2016). It has been shown that FUS physically interacts with Pur-alpha. In vivo expression of Pur-alpha in Drosophila significantly exacerbates the neurodegeneration caused by mutated FUS. Conversely, Di Salvio et al. (2015) showed that the downregulation of Pur-alpha in neurons expressing mutated FUS significantly improves fly climbing activity. It was subsequently demonstrated that overexpression of Pur-alpha inhibits cytoplasmic mislocalization of mutant FUS and promotes neuroprotection (Daigle et al., 2016). However, the function of Pur-alpha in regulating ALS pathogenesis remains elusive.

### SUPEROXIDE DISMUTASE-1 (SOD1)

Unlike TDP-43 and FUS, SOD1 does not contain RNA-binding motifs; however, several reports have demonstrated a potential role of mutant SOD1 in regulating RNA metabolism (Menzies et al., 2002; Lu et al., 2007; Lu L. et al., 2009; Chen et al., 2014). In particular, mutant SOD1 can bind mRNA species such as vascular endothelial growth factor (VEGF) and NFL and negatively affect their expression, stabilization and function (Menzies et al., 2002; Lu et al., 2007; Lu L. et al., 2009; Chen et al., 2014). More precisely, mutant SOD1 can directly bind to specific adenylate- and uridylate-rich stability elements (AREs) located in the 3′ UTR of transcripts of VEGF (Lu et al., 2007) and NFL (Chen et al., 2014). It is believed that such a gain of abnormal protein–RNA interactions can be caused by SOD1 misfolding that results in the exposure of polypeptide portions with the ability to bind nucleic acids (Kenan et al., 1991; Tiwari et al., 2005).

Binding of mutant SOD1 to the 3′ UTR of the VEGF mRNA results in the sequestration of other ribonucleoproteins such as TIAR and HuR into insoluble aggregates. These interactions, which are specific to mutant SOD1, result in decreased levels of VEGF mRNA and impaired HuR function, ultimately hampering their neuroprotective actions during stress responses (Lu et al., 2007; Lu L. et al., 2009).

In motor neuron-like NSC34 cell lines expressing mutant SOD1 (G37R or G93A), the level of NFL mRNA is significantly reduced (Menzies et al., 2002). Reduction in NFL mRNA levels has also been reported in G93A transgenic mice and in human spinal motor neurons from SOD1-ALS cases (Menzies et al., 2002). It is proposed that destabilization of NFL mRNA by mutant SOD1 results in altered stoichiometry of neurofilament (NF) subunits and subsequent NF aggregation in motor neurons (Chen et al., 2014). NF inclusion in the soma and proximal axons of spinal motor neurons is a hallmark of ALS pathology (Hirano et al., 1984). In an IPSC-derived model of ALS, a reduction of the NFL mRNA level has been reported to result in NF aggregation and neurite degeneration (Chen et al., 2014). Altogether, these studies support a pathogenic role for dysregulation of RNA processing in SOD1-related ALS.

Interestingly, SOD1 has been shown to interact with TDP-43 to modulate NFL mRNA stability (Volkening et al., 2009). As mentioned above, TDP-43 was found to directly interact with the 3′ UTR of NFL mRNA to stabilize it (Strong et al., 2007). Altogether, these studies suggest that SOD1 and TDP-43 may act through a common mechanism in regulating the stability of specific RNAs. In the case of NFL mRNA, it would be interesting to investigate whether mutant SOD1 dislodges TDP-43 from the NFL mRNA in a manner that would affect its metabolism and potentially make NF prone to form aggregates.

Furthermore, there have been several transcriptome investigations in SOD1 human samples (D'Erchia et al., 2017), a motor neuron-like NSC34 cell culture model (Kirby et al., 2005) and transgenic animals including mice (Lincecum et al., 2010; Bandyopadhyay et al., 2013; Sun et al., 2015), rats (Hedlund et al., 2010) and Drosophila (Kumimoto et al., 2013). These studies have reported dysregulation of genes involved in pathways related to the neuroinflammatory and immune response, oxidative stress, mitochondria, lipid metabolism, synapse and neurodevelopment (Hedlund et al., 2010; Lincecum et al., 2010; Bandyopadhyay et al., 2013; Kumimoto et al., 2013; Sun et al., 2015; D'Erchia et al., 2017). However, it is not clear from these studies whether SOD1 directly or indirectly impacts the regulation of the differentially expressed genes. In a recent elegant study, Rotem et al. (2017) compared transcriptome changes in SOD1 and TDP-43 models. They found that most genes that were altered in the SOD1<sup>G93A</sup> model were not dysregulated in the TDP-43<sup>A315T</sup> model, and vice versa (Rotem et al., 2017). There were, however, a few genes whose expression was altered in both ALS models (Rotem et al., 2017). These findings are consistent with the ALS pathology, which is distinguishable between the ALS-related SOD1 phenotype and the TDP-43 phenotype. Although different cellular pathways are likely activated by SOD1 versus TDP-43, it is very plausible that they ultimately converge on common targets to result in similar motor neuron toxicity and ALS phenotype.

### C9orf72 INTRONIC EXPANSION

In 2011, a large GGGGCC hexanucleotide repeat expansion in the first intron or promoter region of the C9orf72 gene was discovered as a new cause of ALS (DeJesus-Hernandez et al., 2011; Renton et al., 2011). C9orf72 repeat expansion mutations account for about 50% of familial ALS and 5–10% of sporadic ALS (Majounie et al., 2012). It remains a topic of debate whether the repeat expansion in C9orf72 causes neurodegeneration primarily through a toxic gain of function, loss of function, or both. The C9orf72 repeat expansion is transcribed in both the sense and antisense directions and leads to accumulations of repeat-containing RNA foci in patient tissues (Gendron et al., 2013). The formation of RNA foci facilitates the recruitment of RNA-binding proteins, causes their mislocalization and interferes with their normal functions (Simon-Sanchez et al., 2012; Donnelly et al., 2013; Lee et al., 2013; Gitler and Tsuiji, 2016). Indeed, RNA foci may bind RNA-binding proteins and alter RNA metabolism (Donnelly et al., 2013; Lee et al., 2013; Mori et al., 2013a). For example, Mori et al. (2013a) and Hutvagner et al. (2001) showed that RNA foci can sequester hnRNP-A3 and repress its RNA processing function. Aborted transcripts containing the repeat can also disrupt nucleolar function (Haeusler et al., 2014). Importantly, these foci can sequester nuclear proteins such as TDP-43 and FUS, impacting expression of their RNA targets and culminating in a range of RNA misprocessing events. Other RNA-binding proteins that bind to RNA foci include hnRNP A1, hnRNP-H, ADARB2, Pur-α, ASF/SF2, ALYREF and nucleolin (Donnelly et al., 2013; Lee et al., 2013; Sareen et al., 2013; Xu et al., 2013; Cooper-Knock et al., 2014; Haeusler et al., 2014). Antisense oligonucleotides (ASOs) targeting the C9orf72 repeat expansion suppress RNA foci formation, attenuate sequestration of specific RNA-binding proteins and reverse gene expression alterations in C9orf72 ALS motor neurons derived from IPSCs (Donnelly et al., 2013; Lagier-Tourenne et al., 2013).

Additionally, repeat-associated non-ATG-dependent (RAN) translation of both the sense and antisense strands can generate simple dipeptide repeats (poly-GA, poly-GP, poly-GR, poly-PA, and poly-PR) that have a variety of toxic effects (Ash et al., 2013; Mori et al., 2013b). Poly-PR and poly-GR can alter the splicing patterns of specific RNAs. For example, poly-PR has been shown to cause exon skipping in RAN and PTX3 RNA (Kwon et al., 2014). Dipeptide repeat proteins have also been found to be toxic by creating aggregates that sequester cytoplasmic proteins (Freibaum and Taylor, 2017). Poly-GR dipeptide co-localizes with several ribosomal subunits and with the translation initiation factor eIF3η (Zhang et al., 2018c). This suggests a ribosomal dysfunction, which implies a defect in RNA translation. In line with these findings, a recent report demonstrated that poly-PR co-localizes with the nucleolar protein nucleophosmin and reduces the expression of several ribosomal RNAs (Suzuki et al., 2018). Suzuki et al. (2018) further showed that the reduction in the expression of ribosomal RNA results in neuronal cell death and that this could be rescued by overexpression of an accelerator of ribosome biogenesis, Myc. RNA sequencing reveals that more than 6,000 genes are up- or downregulated in mice that express the dipeptide construct in the brain (Zhang et al., 2018c). Other findings show that poly-PR dipeptide binds nuclear pore channels, blocking the import and export of molecules. Specifically, the dipeptide binds the nucleoporin proteins Nup54 and Nup98 that rim the central channel of the pore (Shi et al., 2017). The accumulation of poly-PR dipeptide at the nuclear pore was found to correlate with defects in nuclear transport of RNA and protein, which is consistent with previous findings (Freibaum et al., 2015; Zhang et al., 2015).
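As a small worked illustration of why five dipeptide repeat species arise, the sketch below translates the GGGGCC repeat in all three reading frames of the sense strand and of its reverse complement (the antisense strand), using the standard genetic code. The function and variable names are hypothetical and chosen only for this example; the sketch ignores initiation context and simply reads codons from a tandem repeat, which is an assumption made for illustration.

```python
# Standard genetic code, restricted to the codons that occur in
# GGGGCC (sense) and GGCCCC (antisense) tandem repeats.
CODONS = {"GGG": "G", "GGC": "G", "GCC": "A",
          "CCG": "P", "CCC": "P", "CGG": "R"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def dipeptide_by_frame(unit):
    """Translate a tandem hexanucleotide repeat in each of its three frames.

    Returns a dict mapping frame offset (0, 1, 2) to the two-residue unit
    that is repeated in the resulting polypeptide.
    """
    tandem = unit * 3  # long enough to read two codons from any offset
    result = {}
    for frame in range(3):
        first = tandem[frame:frame + 3]
        second = tandem[frame + 3:frame + 6]
        result[frame] = CODONS[first] + CODONS[second]
    return result


sense = dipeptide_by_frame("GGGGCC")
antisense = dipeptide_by_frame(reverse_complement("GGGGCC"))
print("sense:", sense)
print("antisense:", antisense)
```

Under these assumptions the sense strand yields GA, GP and GR units and the antisense strand yields GP, AP (equivalent to poly-PA read in a different phase) and PR units, so poly-GP is the one species that can be produced from both strands.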

The last proposed mechanism involved in ALS pathogenesis is haploinsufficiency, in which the repeat expansion leads to decreased transcription of the gene and consequently to a decrease in its translation (Ciura et al., 2013). Studies have demonstrated that the C9orf72 repeat expansion can interfere with transcription or splicing of C9orf72 transcripts (Mori et al., 2013b; Haeusler et al., 2014; Highley et al., 2014). It has also been proposed that the C9orf72 repeat expansion could disrupt C9orf72 promoter activity, thereby reducing its expression (Gijselinck et al., 2016). Several studies have demonstrated alterations in the C9orf72 ALS transcriptome (Donnelly et al., 2013; Prudencio et al., 2015; Selvaraj et al., 2018). Interestingly, a recent article reported an increased expression of the calcium-permeable GluA1 AMPA receptor subunit in motoneurons derived from IPSCs of patients with C9orf72 mutations (Selvaraj et al., 2018). This alteration in AMPA receptor composition led to an enhanced motoneuron vulnerability to AMPA-induced excitotoxicity (Selvaraj et al., 2018). It remains to be determined whether the increased expression of the GluA1 AMPA subunit is related to reduced levels of C9orf72, RNA foci and/or dipeptide repeats.

C9orf72 has also been shown to be involved in the generation of stress granules (Maharjan et al., 2017) and in sequestering other RNA-binding proteins that are involved in nucleo-cytoplasmic transport (Zhang et al., 2015, 2018b). Stress granules observed in C9orf72 mutants have been found to co-localize with Ran GAP (Zhang et al., 2015, 2018b), which is known to activate Ran GTPase. This GTPase is involved in nucleo-cytoplasmic transport. It has also been reported that expressing Ran GAP rescues the age-related motor defects in flies expressing the GGGGCC repeats (Zhang et al., 2018a). Very recently, it was also reported that one of the dipeptides generated by the expansion has a role in the formation of these stress granules (Zhang et al., 2018c). Moreover, importins and exportins are sequestered in stress granules, which also implies that protein transport is altered (Zhang et al., 2018b).

These toxic gain- and loss-of-function mechanisms are all thought to act in synergy in ALS pathogenesis. In summary, altered RNA processing plays a key role in C9orf72-mediated toxicity in two ways. The first is altered processing of the expanded C9orf72 transcript itself, in terms of altered transcription, splicing defects, nuclear aggregation and non-conventional translation (Barker et al., 2017). The second involves downstream and indirect changes in RNA processing of other transcripts. A thorough understanding of RNA metabolism dysregulation could shed considerable light on how C9orf72 mutation leads to ALS and provide insights into therapeutic targets.

#### DYSREGULATION OF MICRORNA (miRNA) IN ALS

Multiple mechanisms control the proper levels of RNA and subsequent protein expression; among these are microRNAs (miRNAs) (Catalanotto et al., 2016). They are endogenous small non-coding RNAs (approximately 22 nucleotides in length) that are initially transcribed by RNA polymerase II as primary miRNA (pri-miRNA) transcripts. These pri-miRNAs are processed into precursor miRNAs (pre-miRNAs) by the nuclear ribonuclease III (RNase III), DROSHA, and the double-stranded RNA-binding protein, DGCR8, which anchors DROSHA to the pri-miRNA transcript (Lee et al., 2003; Denli et al., 2004). The pre-miRNA is then exported into the cytoplasm by exportin-5 (Yi et al., 2003), where it is processed into a mature miRNA by the DICER enzyme (Hutvagner et al., 2001; Ketting et al., 2001). The mature miRNA is then incorporated into a ribonucleoprotein (RNP) complex with argonaute (AGO) proteins to form the RNA-induced silencing complex (RISC) (Hammond et al., 2001; Schwarz et al., 2003; Kawamata and Tomari, 2010), which mediates inhibition of translation and/or mRNA degradation of targeted transcripts that are complementary to the miRNA (Hutvagner and Zamore, 2002; Yekta et al., 2004). The recognition of mRNAs by miRNAs occurs through base-pairing interactions within the 3′-untranslated region (UTR) of the targeted mRNAs. Besides their well-known gene silencing functions, miRNAs can also induce up-regulation of their targets (Vasudevan et al., 2007; Lin et al., 2011; Truesdell et al., 2012; Vasudevan, 2012).
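The base-pairing recognition described above can be illustrated with a short Python sketch that scans a 3′UTR for sites complementary to a miRNA "seed". The assumptions here are not taken from the text: the seed is defined as nucleotides 2 to 8 of the miRNA (a commonly used convention), only perfect Watson-Crick matches are reported, and both sequences are toy examples. The function name `seed_match_sites` is hypothetical.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed_match_sites(mirna, utr):
    """Return 0-based start positions in utr that pair perfectly with the seed.

    Both sequences are RNA strings written 5'->3'. The seed is assumed to be
    nucleotides 2-8 of the miRNA (an illustrative convention).
    """
    seed = mirna[1:8]
    # Sequence in the mRNA that is antiparallel and complementary to the seed.
    site = seed.translate(COMPLEMENT)[::-1]
    return [i for i in range(len(utr) - len(site) + 1) if utr.startswith(site, i)]


if __name__ == "__main__":
    toy_mirna = "UGAGGUAGUAGGUUGUAUAGUU"       # toy 22-nt miRNA, for illustration
    toy_utr = "AAAACUACCUCGGGAAACUACCUCUU"     # toy 3'UTR containing two sites
    print(seed_match_sites(toy_mirna, toy_utr))
```

Real target prediction tools weigh additional features such as site context and conservation; this sketch only captures the complementarity step.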

MiRNAs play important roles in several biological processes such as cell proliferation (Chen et al., 2006), cell differentiation (Naguibneva et al., 2006), apoptosis (Matsushima et al., 2011), and patterning of the nervous system (Johnston and Hobert, 2003). Interestingly, several miRNAs have been shown to be essential for motor neuron development and survival (see review, Haramati et al., 2010). For example, in the developing chick, it was demonstrated that activation of the miRNA miR-9 is necessary to suppress the expression of the transcription factor onecut1, which in turn helps to drive differentiation of neural progenitor cells into spinal motor neurons (Luxenhofer et al., 2014). It is believed that several miRNAs work in concert to establish motor neuron identity. Indeed, in addition to miR-9, other miRNAs such as miR-128 (Thiebes et al., 2015), miR-196 (Asli and Kessel, 2010), and miR-375 (Bhinge et al., 2016) have been shown to play a role in motor neuron differentiation and localization. Loss of DICER function within progenitor cells results in aberrant motor neuron development, while its loss in motor neurons leads to progressive motor neuron degeneration (Haramati et al., 2010; Chen and Wichterle, 2012). Furthermore, miRNAs are important players in NMJ function, synaptic plasticity and the maintenance of cytoskeletal integrity (see review, Hawley et al., 2017).

The ALS genes, TDP-43 and FUS, were identified in a protein complex with the RNase III DROSHA and shown to play a role in miRNA biogenesis (Freibaum et al., 2010; Da Cruz and Cleveland, 2011). TDP-43, in particular, was shown to associate with proteins involved in the cytoplasmic cleavage of pre-miRNA mediated by the DICER enzyme (Freibaum et al., 2010). It is thus no surprise that dysregulation of miRNAs has been observed in ALS (Li et al., 2013; Zhang et al., 2013; Dini Modigliani et al., 2014; Eitan and Hornstein, 2016). Indeed, mutations in TARDBP result in differential expression of several miRNAs: miR-9, miR-132, miR-143, and miR-558 (Kawahara and Mieda-Sato, 2012; Zhang et al., 2013). Interestingly, the expression of several of these miRNAs (miR-9, miR-132, miR-143), as well as others (such as miR-125 and miR-192), is altered upon FUS depletion (Morlando et al., 2012). MiR-9 expression is also upregulated in mutant SOD1 mice (Zhou F. et al., 2013). These dysregulated miRNAs are essential for motor neuron development and maintenance (Otaegi et al., 2011; Luxenhofer et al., 2014), axonal growth (Dajas-Bailador et al., 2012; Kawahara and Mieda-Sato, 2012) and synaptic transmission (Edbauer et al., 2010; Sun et al., 2012). Thus, these miRNA alterations likely contribute to the pathological phenotype observed in ALS.

Additionally, depletion of TDP-43 in cell culture systems has been shown to change the total miRNA expression profile (Buratti et al., 2010). A similar observation was recently made in motor neuron progenitors derived from human ALS IPSCs (Rizzuti et al., 2018). In particular, 15 miRNAs were reported to be dysregulated, including the disease-relevant miR-34a and miR-504, which are known to be implicated in synaptic vesicle regulation and cell survival (Rizzuti et al., 2018). Another important miRNA, microRNA-1825, was found to be downregulated in the CNS of both sporadic and familial ALS patients (Helferich et al., 2018). Interestingly, reduced levels of microRNA-1825 were demonstrated to cause a translational upregulation of tubulin-folding cofactor b (TBCB), which consequently leads to depolymerization and degradation of tubulin alpha-4A (TUBA4A), which is encoded by a known ALS gene (Helferich et al., 2018).

In several repeat expansion diseases, such as myotonic dystrophy and fragile X-associated tremor/ataxia syndrome, toxic RNA from the expanded repeats causes widespread RNA splicing abnormalities and degeneration of affected tissues (Miller et al., 2000) and alters miRNA processing (Sellier et al., 2013). Since its discovery, the C9orf72 GGGGCC repeat expansion has also been investigated as a potential disruptor of miRNA processing. Recently, the DROSHA protein was found to be mislocalized in dipeptide repeat protein aggregates in the frontal cortex and cerebellum of C9orf72 ALS/FTLD patients (Porta et al., 2015). An involvement of the miRNA pathway in motor neuron impairment in ALS is evident, and further investigation of miRNA dysregulation in ALS pathogenesis could eventually lead to the identification of therapeutic targets.

#### RNA-TARGETED THERAPEUTICS FOR ALS

Our understanding of RNA biology has expanded tremendously over the past decades, resulting in new approaches to engage RNA as a therapeutic target. More precisely, RNA-targeted therapeutics have been developed to reduce or enhance the expression of a given target RNA by employing mechanisms such as RNA cleavage, modulation of RNA splicing, inhibition of mRNA translation into protein, inhibition of miRNA binding sites, increasing translation by targeting upstream open reading frames, and disruption of RNA structures regulating RNA stability (Robertson et al., 2010; Fellmann and Lowe, 2014; Vickers and Crooke, 2014; Havens and Hastings, 2016; Liang et al., 2016). Therapeutics that directly target RNAs are promising for a broad spectrum of disorders, including neurodegenerative diseases (Scoles and Pulst, 2018), and are currently under evaluation as potential strategies for treating ALS. RNA therapeutic approaches include RNA interference (RNAi) and ASOs (**Figure 2**), both of which bind to their target nucleic acid via Watson-Crick base pairing and degrade or inactivate the targeted mRNA (Burnett and Rossi, 2012). Recently, the application of innovative drug discovery approaches has shown that targeting RNA with bioactive small molecules is achievable (Disney, 2013; Bernat and Disney, 2015). A few researchers, including us, are currently exploiting this new type of RNA-targeted therapeutics to search for RNA-targeted small molecules as C9orf72 ALS therapeutics.

#### RNA Interference (RNAi)


RNAi is an endogenous cellular mechanism that regulates mRNA. It operates sequence-specifically and post-transcriptionally via the RISC (Carthew and Sontheimer, 2009). RNAi effects can be mediated via small interfering RNA (siRNA), short hairpin RNA (shRNA), and artificial miRNA (Fire et al., 1998; Moore et al., 2010; Chakraborty et al., 2017). These approaches can help to reduce the expression of a mutant (toxic) gene and can provide significant therapeutic benefit in treating ALS and other neurodegenerative diseases involving aberrant accumulation of misfolded proteins.

The challenge of using siRNA for treating ALS is that it has to be designed with the specificity to reduce the aberrant mutant protein while leaving the wild-type protein intact. Attempts were made to design siRNAs that could recognize a single-nucleotide alteration in order to selectively suppress mutant SOD1 (particularly G93A) expression while leaving wild-type SOD1 intact (Yokota et al., 2004; Wang H. et al., 2008). The siRNAs G93A.1 and G93A.2 designed by Yokota et al. (2004) were found to suppress approximately 90% of mutant SOD1 G93A expression. Importantly, both siRNAs had little or no effect on wild-type SOD1 expression (Yokota et al., 2004). To achieve long-term expression of siRNA in cells, viral delivery systems have proved powerful in providing continuous delivery and expression of shRNA in sufficient quantities (Bowers et al., 2011). Indeed, diverse viral vectors have been studied, such as adeno-associated virus (AAV), lentivirus (LV), and rabies-glycoprotein-pseudotyped lentivirus (RGP-LV) (Raoul et al., 2005; Wu et al., 2009). Recombinant AAVs are currently the RNAi delivery vehicle of choice for neurological diseases because they are non-pathogenic and safe (Maguire et al., 2014; Smith and Agbandje-McKenna, 2018). Several studies have aimed at engineering AAV serotypes with better cell-type and tissue specificities and improved immune-evasion potential (Gao et al., 2005; Weinmann and Grimm, 2017). The AAV9 and AAVrh10 serotypes have been shown to cross the blood–brain barrier and efficiently transduce cells in the CNS, with widespread and sustained transgene expression in the spinal cord and brain even after a single injection (Thomsen et al., 2014; Dirren et al., 2015; Borel et al., 2016). Importantly, they can efficiently target neurons and astrocytes, making them the most applicable delivery systems for treating ALS.
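The principle behind allele-selective siRNAs can be sketched in a few lines of Python: a guide strand that is perfectly complementary to the mutant site carries exactly one mismatch against the wild-type site. The sequences below are hypothetical toy sequences, not the SOD1 sequence or the siRNAs of Yokota et al. (2004), and the helper names are illustrative. Whether a single mismatch is enough to spare the wild-type allele depends on its position and identity and has to be determined experimentally.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def guide_for(site):
    """Guide strand (5'->3') perfectly complementary to an mRNA site (5'->3')."""
    return site.translate(COMPLEMENT)[::-1]


def mismatches(guide, site):
    """Count guide positions that cannot Watson-Crick pair with the site."""
    perfect_guide = guide_for(site)
    return sum(a != b for a, b in zip(guide, perfect_guide))


if __name__ == "__main__":
    # Hypothetical 19-nt wild-type site and a single-nucleotide mutant of it.
    wild_type = "GACUGAAGGCCUGCAUGGA"
    mutant = wild_type[:9] + "G" + wild_type[10:]

    guide = guide_for(mutant)            # designed against the mutant allele
    print(mismatches(guide, mutant))     # pairing with the mutant site
    print(mismatches(guide, wild_type))  # pairing with the wild-type site
```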

Several researchers have independently used siRNA or shRNA to silence mutant SOD1 expression in vitro and in vivo (Miller et al., 2005; Raoul et al., 2005; Ralph et al., 2005; Foust et al., 2013). Intramuscular delivery of siRNA targeting mutant SOD1 in SOD1G93A mice delays the onset of motor neuron symptoms and extends survival (Miller et al., 2005). Similarly, SOD1G93A mice injected with AAV encoding shRNA against human SOD1 mRNA (hSOD1) exhibited delayed disease onset and a significant 23% increase in survival (Foust et al., 2013). The same group later demonstrated the efficacy of this approach in SOD1G93A rats, showing that silencing of hSOD1 expression selectively in the motor cortex also delayed disease onset and prolonged survival (Thomsen et al., 2014). Silencing of SOD1 using an artificial miRNA (miR-SOD1) delivered systemically with the viral vector AAVrh10 in SOD1G93A mice was also found to significantly delay disease onset, preserve muscle motor functions and extend survival (Borel et al., 2016). Interestingly, similar findings were observed in nonhuman primates treated with AAVrh10-miR-SOD1 (Wang et al., 2014; Borel et al., 2016). These findings suggest that the miRNA silencing strategy warrants further investigation and may offer promise for the treatment of SOD1-related ALS.

#### Antisense Oligonucleotides (ASOs)

The concept of ASOs was first introduced in 1978, when Stephenson and Zamecnik used a chemically modified oligonucleotide, designed to bind to its complementary sequence in a Rous sarcoma virus transcript, to inhibit gene expression and viral replication (Stephenson and Zamecnik, 1978). ASOs are synthetic single-stranded oligonucleotides that activate RNase H, an endonuclease in the nucleus, to degrade the complementary mRNA. They can be designed to specifically target mutant RNAs or mRNA splicing (Bennett and Swayze, 2010). An ASO-based therapy (nusinersen) designed to modulate splicing by promoting exon inclusion has proven to be very effective in treating spinal muscular atrophy (SMA) in clinical trials (Chiriboga et al., 2016; Finkel et al., 2016; Mendell et al., 2017; Scoto et al., 2017). In late 2016, this antisense drug (marketed as Spinraza) received FDA approval for the treatment of SMA. This was the first exciting success of ASO therapeutics in neurodegeneration and a significant milestone for ASO therapy in general. With increased understanding of the gain- and loss-of-function mechanisms of genetic forms of ALS, ASO therapies have also been tested, principally in SOD1 and C9orf72 models, to target the mutant forms of RNA but not the wild-type.

The first study using an ASO to target SOD1 showed effective silencing of SOD1 and reduced mutant SOD1 protein throughout the brain and spinal cord of SOD1G93A rats (Smith et al., 2006). Infusion of ASOs complementary to hSOD1 mRNA extended survival in SOD1G93A rats (Smith et al., 2006). Given these promising preclinical results, the ASO IONIS-SOD1Rx (ISIS 333611 and BIIB067) has been proposed as a therapeutic strategy for SOD1-linked ALS and has been clinically tested. In phase I testing, intrathecal administration of the ASO IONIS-SOD1Rx was shown to be both practical and safe in SOD1 ALS patients (Miller et al., 2013). A phase Ib/IIa trial (NCT02623699) is currently underway to further evaluate the safety, tolerability, and pharmacokinetics of IONIS-SOD1Rx. Altogether, the preclinical and clinical tests suggest that ASOs delivered to the CNS are safe and represent a feasible treatment for SOD1-related ALS. However, these ASOs are not specific for mutant over wild-type SOD1, and the long-term effects of SOD1 reduction need further investigation.

In addition, silencing of SOD1 can be induced by exon skipping of hSOD1 using ASOs complementary to splicing regulatory elements on the primary transcript (Biferi et al., 2017). For instance, administering an exon-2-targeted ASO embedded in a modified U7 small-nuclear RNA and delivered by AAV10, in either newborn or adult (P50) SOD1G93A mice, was shown to increase survival and restore neuromuscular function (Biferi et al., 2017). These recent findings provide new hope for the treatment of ALS and open perspectives for clinical development.

Strong evidence supports the view that the GGGGCC repeat expansion in C9orf72 causes disease through the toxicity of the RNAs it generates. Thus, early development of ASO-based therapeutics for C9orf72 ALS focused on reducing the gain-of-function toxicity associated with the repeat expansion. The efficacy of ASO-based therapeutics for C9orf72 was initially tested on clinically relevant human IPSC-derived neurons and fibroblasts (Donnelly et al., 2013; Lagier-Tourenne et al., 2013; Sareen et al., 2013). More recently, ASOs were also evaluated in mouse models expressing the expanded C9orf72 (O'Rourke et al., 2015; Jiang et al., 2016).

Antisense oligonucleotides were designed to bind within the GGGGCC repeat expansion or within surrounding N-terminal regions of the C9orf72 mRNA transcript to either degrade the transcript or block the interaction between the repeat expansion and RNA-binding proteins (Donnelly et al., 2013). ASOs effectively reduced RNA foci formation and dipeptide repeat proteins, increased survival from glutamate excitotoxicity and restored normal gene expression markers (Donnelly et al., 2013; Lagier-Tourenne et al., 2013; Sareen et al., 2013; O'Rourke et al., 2015; Jiang et al., 2016). These promising findings suggest that ASO-based therapy can be a powerful approach for treating C9orf72 ALS. They also provided the basis for the first C9orf72 ASO clinical trial, which is anticipated to start by the end of 2018.

These planned ASO trials in ALS, as well as ongoing trials of ASOs in SMA, Huntington's disease and Alzheimer's disease, will enhance our understanding of this therapeutic approach. Importantly, positive outcomes from these clinical trials could revolutionize the treatment of genetically mediated neurodegenerative diseases.

#### Small Molecules Targeting RNA

RNAs adopt discrete secondary and tertiary structures and have pivotal roles in biology and disease (Bernat and Disney, 2015). The ALS-associated C9orf72 GGGGCC repeat RNA can stably fold into a four-stranded structure formed by the stacking of planar tetrads of four guanosine residues, termed a G-quadruplex (Huppert, 2008; Fratta et al., 2012). This G-quadruplex structure can affect various RNA processing steps, including splicing and translation (Simone et al., 2015). In particular, the C9orf72 repeat RNA G-quadruplexes have been shown to specifically sequester RNA-binding proteins and have toxic functions (Haeusler et al., 2016). The GGGGCC repeat RNA sequence can also adopt a hairpin structure in addition to G-quadruplexes (Haeusler et al., 2014; Su et al., 2014). A hairpin is composed of a base-paired stem and a loop, and it can affect transcription and alternative splicing (Kuznetsov et al., 2008). Targeting these RNA structures of the C9orf72 repeat is a potential therapeutic strategy.
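As a rough illustration of why GGGGCC repeats are candidates for G-quadruplex formation, the Python sketch below searches a sequence for four runs of at least three guanines separated by short loops. This sequence pattern is a widely used heuristic for putative quadruplex-forming sequences and is an assumption introduced here, not a method described in the text. A match indicates potential only; actual folding must be established experimentally.

```python
import re

# Heuristic motif: four G-runs (>= 3 G each) separated by loops of 1-7 nt.
G4_PATTERN = re.compile(r"G{3,}(?:[ACGU]{1,7}G{3,}){3}")


def has_g4_motif(sequence):
    """Return True if the sequence contains a putative G-quadruplex motif.

    Accepts RNA or DNA letters; thymine is converted to uracil first.
    """
    rna = sequence.upper().replace("T", "U")
    return G4_PATTERN.search(rna) is not None


if __name__ == "__main__":
    for n_repeats in (3, 4, 8):
        print(n_repeats, has_g4_motif("GGGGCC" * n_repeats))
```

Under this heuristic, at least four consecutive GGGGCC units are needed to supply the four G-runs, since a single GGGG tract is too short to contribute two runs separated by a loop.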

Recent developments in technologies and approaches have made the long sought-after goal of developing small-molecule drugs that target RNA possible (Disney, 2013; Bernat and Disney, 2015; Connelly et al., 2016). Small molecules that bind RNA hairpin or G-quadruplex structures have been identified (Di Antonio et al., 2012; Su et al., 2014). This has provided the springboard for the search for small molecules that can specifically target C9orf72 repeat RNA and hinder pathogenic interactions with RNA-binding proteins and/or interfere with RAN translation (Su et al., 2014; Simone et al., 2018) (**Figure 2C**).

Su et al. (2014) showed that (GGGGCC)<sup>8</sup> RNA can adopt a hairpin structure in equilibrium with a quadruplex structure. They designed three compounds targeting mainly the hairpin structure of the (GGGGCC)<sup>n</sup> RNA and showed that the bioactive small molecule 1a significantly inhibited RAN translation and foci formation in cultured cells expressing a (GGGGCC)<sup>66</sup> repeat expansion and in patient-derived neurons (Su et al., 2014). However, these small molecules were only tested in vitro on cellular models. Recently, a drug screen for compounds that specifically target the C9orf72 RNA G-quadruplex structure led to the identification of three lead compounds (Simone et al., 2018). These compounds were then functionally validated as ALS therapeutics in C9orf72 IPSC-derived neurons and C9orf72 repeat-expressing fruit flies. Interestingly, two of the lead compounds reduced RNA foci formation and the levels of toxic dipeptide repeat proteins in IPSC-derived spinal motor neurons and cortical neurons (Simone et al., 2018). The most effective small molecule (DB1273) was then tested in vivo in C9orf72 repeat-expressing fruit flies and was found to significantly reduce dipeptide repeat protein levels. Furthermore, DB1273 improved the survival of the fruit flies (Simone et al., 2018). These studies support the further development of small molecules that selectively bind GGGGCC RNA as a therapeutic strategy for C9orf72 ALS and FTLD.

#### LIMITATIONS OF RNA-TARGETED THERAPEUTIC STRATEGIES

RNA-targeted therapeutic approaches offer a treatment strategy with greater specificity, improved potency, and decreased toxicity compared to small molecules directed against traditional drug targets (signaling proteins). They represent an important avenue for treating ALS and other neurodegenerative diseases that should be considered in the near future. However, there are still some concerns and challenges to overcome for ALS therapeutic applications.

Off-target effects of RNAi and ASOs remain an important consideration, though thorough toxicological and safety research prior to clinical application can diminish some of this concern. The negative charge of siRNAs and ASOs, as well as their size, makes it difficult for them to cross the cell membrane. Viral packaging is currently widely used to deliver ASOs and siRNAs into cells. Although viral vectors are highly efficient transfer vehicles, their immunogenicity is a major concern. Various other delivery strategies such as nanoparticles, liposomes and aptamers could be more effective and safer. Efforts are also underway to chemically stabilize siRNA, which would avoid the need for viral vectors (Castanotto and Rossi, 2009).

RNA foci and dipeptide products are generated from both the sense and antisense directions of the C9orf72 transcript. However, ASOs for C9orf72 ALS preferentially target sense strand transcripts. ASO strategies may need to target toxic RNA transcribed from both directions in order to adequately treat C9orf72 ALS (Schoch and Miller, 2017). Furthermore, ASO-based therapeutic strategies for C9orf72 ALS target only gain-of-function mechanisms, but loss-of-function mechanisms may also act in synergy to cause pathogenesis in C9orf72 ALS. It is very plausible that an integrated therapeutic approach that inhibits toxic RNA foci/dipeptide repeat protein formation and restores normal levels of C9orf72 may be necessary to fully address the cellular deficits in C9orf72 ALS.

#### CONCLUSION

TDP-43, SOD1, FUS, and C9orf72 mutations affect various aspects of RNA processing, many of which are shared. It is becoming clear that impaired RNA regulation and processing is a central feature of ALS pathogenesis. Given that defects at multiple steps of RNA processing impair cellular function and survival, RNA metabolism can be considered an essential target for therapeutic intervention in ALS and other neurodegenerative diseases such as FTLD. The application of RNA-based therapies to modulate gene and subsequent protein expression is an attractive therapeutic strategy. The preclinical testing of RNA-based therapies targeting SOD1 and C9orf72 mutations is indeed very promising. Similar studies are yet to be undertaken for FUS and TDP-43 mutations. RNA-based therapies could be considered in the future for the treatment of ALS.

#### AUTHOR CONTRIBUTIONS

SP contributed to the idea conception and overall review design. ZB and SP wrote the manuscript.

#### FUNDING

This work was supported by Canadian Institutes of Health Research (CIHR). SP was supported by an ALS Canada-Brain Canada Career Transition Award and Fonds de Recherche du Québec-Santé (FRQS) Junior 1 research award.

#### ACKNOWLEDGMENTS

The authors would like to thank Dr. Marie-Claude Belanger for her help with some of the illustrations of this manuscript.

#### REFERENCES

Rev. Pharmacol. Toxicol. 50, 259–293. doi: 10.1146/annurev.pharmtox.010909.105654


cell-cycle progression and prostate cancer growth. Cancer Res. 71, 914–924. doi: 10.1158/0008-5472.CAN-10-0874



non-ATG translation in c9FTD/ALS. Acta Neuropathol. 126, 829–844. doi: 10.1007/s00401-013-1192-8


TAF15 RNA binding sites reveals the impact of TAF15 on the neuronal transcriptome. Cell Rep. 3, 301–308. doi: 10.1016/j.celrep.2013.01.021


hypersensitizing cells to stress. Mol. Neurobiol. 54, 3062–3077. doi: 10.1007/s12035-016-9850-1



as a downstream effector of Isl1-Lhx3. Nat. Commun. 6:7718. doi: 10.1038/ncomms8718



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Butti and Patten. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

### Ciphers and Executioners: How 3′-Untranslated Regions Determine the Fate of Messenger RNAs

Vinay K. Mayya and Thomas F. Duchaine\*

Goodman Cancer Research Centre and Department of Biochemistry, McGill University, Montreal, QC, Canada

The sequences and structures of 3′-untranslated regions (3′UTRs) of messenger RNAs govern their stability, localization, and expression. 3′UTR regulatory elements are recognized by a wide variety of trans-acting factors that include microRNAs (miRNAs), their associated machinery, and RNA-binding proteins (RBPs). In turn, these factors instigate common mechanistic strategies to execute the regulatory programs encoded by 3′UTRs. Here, we review classes of factors that recognize 3′UTR regulatory elements and the effector machineries they guide toward mRNAs to dictate their expression and fate. We outline illustrative examples of competitive, cooperative, and coordinated interplay such as mRNA localization and localized translation. We further review the recent advances in the study of mRNP granules and phase transition, and their possible significance for the functions of 3′UTRs. Finally, we highlight some of the most recent strategies aimed at deciphering the complexity of the regulatory codes of 3′UTRs, and identify some of the important remaining challenges.

Edited by: Chiara Gamberi, Concordia University, Canada

#### Reviewed by:

Michael Sheets, University of Wisconsin–Madison, United States Piergiorgio Percipalle, New York University Abu Dhabi, United Arab Emirates

#### \*Correspondence:

Thomas F. Duchaine thomas.duchaine@mcgill.ca

#### Specialty section:

This article was submitted to RNA, a section of the journal Frontiers in Genetics

Received: 07 September 2018 Accepted: 07 January 2019 Published: 24 January 2019

#### Citation:

Mayya VK and Duchaine TF (2019) Ciphers and Executioners: How 3′-Untranslated Regions Determine the Fate of Messenger RNAs. Front. Genet. 10:6. doi: 10.3389/fgene.2019.00006

Keywords: miRNAs, CCR4-NOT complex, RNA binding proteins (RBPs), phase transition, mRNP granules, translational repression, deadenylation, 3′-untranslated region (UTR)

#### INTRODUCTION

Precise spatial and temporal regulation of gene expression is necessary for the proper development and homeostasis of organisms. Systems approaches indicate that post-transcriptional mechanisms, in particular translational repression, are the most significant contributors to establishing a gene's expression in mammalian cells (Schwanhausser et al., 2011). Post-transcriptional regulation is instated by mechanisms that control translation, stability, and localization of mRNAs. Such mechanisms converge on one or several distinctive features of mRNAs (**Figure 1**).

The coding sequence (CDS) of an mRNA is flanked by 5′- and 3′-untranslated regions (UTR). These sequences encode regulatory structures and sequences often referred to as cis-regulatory, or cis-acting elements. When unrepressed, interactions between the 5′-terminal cap, the eIF4F cap-binding complex (an assembly of eIF4E, eIF4A, and eIF4G), the 3′-terminal poly(A) tail and the associated poly(A) binding proteins (PABPs) lead to circularization of an mRNA (Gallie, 1991; Wells et al., 1998). mRNA circularization is thought to allow for synergy of the 5′-cap and poly(A) tail in potentiating translation initiation, and possibly also in stabilizing the mRNA (Sachs et al., 1997; Schwartz and Parker, 1999). Circularization brings 3′UTR cis-acting elements closer to the translation initiation machinery. Perhaps not surprisingly, 3′UTR-driven mechanisms determine the expression and fate of mRNAs by targeting the 5′-cap and 3′-poly(A) tail moieties and/or their associated cofactors.

The functional information encoded in the sequence and structure of 3′UTRs is decrypted and acted upon by an array of cellular regulatory factors (often referred to as trans-acting factors). Regulatory factors can be broken down into two distinct categories based on their direct molecular implication in (i) specific recognition of the 3′UTR sequence and structure, and (ii) execution of consequent activities. Factors involved in specific recognition include a variety of non-coding RNAs, such as microRNAs (miRNAs), and RNA-binding proteins (RBPs), which match the sequences and structural determinants encoded in 3′UTRs. A more limited diversity of effector machineries can be grouped into three effector activities: (i) translational control (**Figure 1B**), most often acting on translation initiation (Nelson et al., 2004; Humphreys et al., 2005; Chendrimada et al., 2007; Mathonnet et al., 2007; Zdanowicz et al., 2009), but also in some cases on translation elongation (Petersen et al., 2006; Gu et al., 2009), (ii) deadenylation and decay (**Figures 1C,D**), whereby deadenylation of an mRNA can be coupled to some degree to its decapping and decay, and (iii) localization (**Figure 1E**), which can be established through active RNA transport along the cytoskeleton and/or asymmetric anchoring of an mRNA in a cellular domain.

In many cases, including the examples presented below, more than one effector activity can be mobilized by a 3′UTR. Recognition and effector activities can involve synergistic, cooperative, or coordinated interactions dictated by the 3′UTR regulatory sequences themselves, but also by the cellular, sub-cellular, and biochemical context wherein the mRNA is found. mRNAs and the regulatory machineries are deeply affected by concentration, stoichiometry, affinities, RNA editing, protein post-translational modifications, and physical seclusion, all of which can change with cell identity or adaptation to environmental cues. Directly speaking to both cellular and biochemical contexts and re-emerging with the refining of different classes of RNA-protein condensates (referred to as mRNP granules) is the concept of phase transition. It remains less than clear how phase transition functionally intersects with 3′UTR regulatory mechanisms. Several hypotheses have recently been substantiated and will be discussed later in this review.

#### RNA-BINDING PROTEINS (RBPs)

The human genome encodes more than 1,500 RBPs (reviewed in Hentze et al., 2018). Each one of these proteins is constituted of one or more RNA-binding domains (RBDs), which can be grouped in RBP families, and auxiliary domains that enable other interactions or carry out enzymatic activities (Gerstberger et al., 2014). Canonical RBDs that are often involved in 3′UTR recognition include RNA recognition motifs (RRM), K-Homology (KH) domain, several types of zinc finger domains, double-stranded RNA binding domain (dsRBD), Piwi/Argonaute/Zwille (PAZ) domain, Pumilio/FBF (PUF) domain, and Trim-NHL domain proteins (Lunde et al., 2007). Using intra-molecular or extra-molecular combinations of RBDs, RBPs can improve RNA recognition specificity, affinity, and avidity. Distinct surfaces of RBDs, specific motifs and auxiliary domains mediate the protein-protein interactions required to recruit and activate effector activities to mRNAs.

We will next review some well-characterized examples of how RBPs achieve these functions. Note that RBPs can also disrupt the activities guided by other regulatory elements in 3′UTRs. Those cases will be discussed later in this review.

### PUF Proteins

Eukaryotic Pumilio and FEM-3 binding factor (PUF) proteins are part of a family of RBPs that can instigate translational repression, deadenylation and decay of targeted mRNAs. PUF proteins regulate a large number of mRNA targets involved in diverse biological functions. For example, Drosophila and Caenorhabditis elegans PUF proteins are important for the maintenance of stem cells (Wickens et al., 2002) and target mRNAs of central components of the Ras/MAPK, PI3K/Akt, NF-κB, and Notch signaling pathways (Kershner and Kimble, 2010). In mammalian cells, the precise dosage of PUF proteins is essential to fine-tune the expression of mRNAs encoding mitosis, DNA damage and DNA replication factors. Recently, PUF proteins were shown to be involved in a network of interactions with the NORAD lncRNA at its center, which prevents chromosomal instability (CIN) (Lee et al., 2016).

The PUF family of proteins binds RNAs bearing the 5′-UGUR (where R = purine) sequence (Quenault et al., 2011). The determinants of those interactions are understood to such an extent that a PUF protein's specificity can actually be predicted (Hall, 2016). For example, the classical Drosophila Pumilio protein uses its eight α-helical Pumilio repeats to bind the eight-nucleotide sequence 5′-UGUANAUA. Furthermore, Pumilio proteins can be co-expressed. In Saccharomyces cerevisiae, co-expression of PUF proteins at different concentrations and with distinct binding affinities can result in competition for individual binding sites (Lapointe et al., 2015, 2017). Binding of PUF proteins to an mRNA typically leads to translational repression, deadenylation, and mRNA decapping. The yeast PUF-domain Mpt5p protein directly interacts with the ortholog of CAF1, one of the two catalytic subunits of the Carbon Catabolite Repressor-Negative on TATA (CCR4-NOT) deadenylase complex, through its RNA-binding domain (Goldstrohm et al., 2006). This interaction is conserved in metazoa, and C. elegans and human PUF homologs can also bind to the yeast CAF1 ortholog (Suh et al., 2009; Van Etten et al., 2012; Weidmann et al., 2014). PUF proteins can also repress mRNA expression by inducing their destabilization. Indeed, Mpt5p can recruit a eukaryotic translation initiation factor 4E (eIF4E)-binding protein to target mRNAs (Blewett and Goldstrohm, 2012). eIF4E-binding proteins block the interaction between eIF4E and eIF4G, and this typically prevents the recruitment of the 43S pre-initiation complex (PIC) to mRNAs (Haghighat et al., 1995). However, in some instances, including this case, the interaction leads to the recruitment and activation of decapping and decay co-factors (Ferraiuolo et al., 2005; Nishimura et al., 2015).
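The two recognition motifs quoted above lend themselves to a simple sequence scan. The following is a minimal sketch, assuming that an exact string match is an acceptable first approximation of a candidate site; the function name `find_sites` and the example sequence are hypothetical, and a match here says nothing about actual occupancy or function in a cell.

```python
import re

# Motifs as quoted in the text; lookaheads allow overlapping matches.
PUF_CORE = re.compile(r"(?=(UGU[AG]))")            # 5'-UGUR, R = purine (A or G)
PUMILIO_SITE = re.compile(r"(?=(UGUA[ACGU]AUA))")  # 5'-UGUANAUA, N = any base


def find_sites(utr, pattern):
    """Return 0-based start positions of all matches of `pattern` in `utr`.

    The input may be written as RNA or DNA; T is converted to U.
    """
    rna = utr.upper().replace("T", "U")
    return [m.start() for m in pattern.finditer(rna)]


# Hypothetical 3'UTR fragment, written 5'->3', for illustration only.
example_utr = "AAUGUAAAUAGGCUGUGCC"
core_hits = find_sites(example_utr, PUF_CORE)         # UGUA at 2, UGUG at 13
pumilio_hits = find_sites(example_utr, PUMILIO_SITE)  # UGUAAAUA at 2
print(core_hits, pumilio_hits)
```

Scanning for the short core motif returns many more candidates than the full eight-nucleotide site, which is one reason sequence matches alone are usually combined with binding or functional data.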

#### Nanos and TRIM-NHL Proteins

The outcome of PUF protein binding to mRNA targets can be altered through interactions with other RBPs. This is the case for the prototypical Pumilio protein in the regulation of hunchback mRNA in Drosophila (Sonoda and Wharton, 2001), wherein its functions are highly dependent on Nanos and Brain Tumor (Brat) proteins. The RNA-binding specificity of Nanos is defined by its interactions with Pumilio, and Nanos directly interacts with the CCR4-NOT deadenylase complex to promote deadenylation of mRNAs (Curtis et al., 1997; Kraemer et al., 1999; Sonoda and Wharton, 1999; Kadyrova et al., 2007). Brat, a member of the broadly conserved TRIM-NHL family of proteins, forms a ternary complex with Pumilio and Nanos. This complex recruits the effector protein 4EHP to repress the translation of mRNAs (Cho et al., 2006). 4EHP is an eIF4E-like cap-binding protein that does not interact with eIF4G and impairs ribosome recruitment to the mRNA (Rom et al., 1998). Unlike Nanos, Brat can stably bind RNA on its own through its NHL domain, and can also function independently of PUF proteins (Laver et al., 2015). Proteomic analysis of the CCR4-NOT complex also suggests an interaction with Brat (Temme et al., 2010). It remains unknown whether this is a direct interaction and whether it contributes to and/or is necessary for mRNA repression. TRIM-NHL proteins exert a broader set of biological functions beyond their interplay with Pumilio in the Drosophila embryo. They play critical roles in brain development, cell polarity, and sex determination (Tocchini and Ciosk, 2015). It is quite possible that this family drives different mechanisms in different cellular or physiological contexts, and that functional interactions with other RBP families may depend on the mRNA target and/or its genetic niche.

#### HuR and TTP Proteins

The presence of adenylate/uridylate (AU)-rich sequences in 3′UTRs has long been associated with regulation of mRNA stability (Barreau et al., 2005). Early computational analysis of human mRNA datasets estimated that 8% of mRNAs harbor AU-rich elements (Bakheet et al., 2006). While AU-rich sequences may be expected to contribute to the destabilization of 3′UTR folding structures, they are also directly recognized by a diversity of RBPs. Tristetraprolin (TTP) and its paralogs, butyrate response factors 1 and 2 (BRF-1/2), bind to AU-rich elements through their two zinc-finger domains and promote the decay of mRNAs (Lai et al., 2000). Here again, TTP or BRF direct mRNA destabilization by recruiting effectors of deadenylation, decapping, and 5′- and 3′-exonuclease activities (Lykke-Andersen and Wagner, 2005; Sandler et al., 2011). Interactions with effectors have been mapped to an auxiliary N-terminal domain, which is sufficient to trigger the decay of target mRNAs (Lykke-Andersen and Wagner, 2005). The XRN1 5′→3′ exonuclease is thought to be the enzyme effecting mRNA degradation instigated by TTP. It is recruited through the Enhancer of Decapping-4 (EDC4) scaffolding protein (Chang et al., 2014).

Not all mRNAs bearing AU-rich sequences are subjected to degradation. In fact, closely similar sequences can instead lead to enhanced mRNA stability. Such a response often occurs when the HuR protein associates with AU-rich sequences (Brennan and Steitz, 2001). HuR is ubiquitously expressed and belongs to the Embryonic lethal abnormal vision (ELAV) family of proteins (Ma et al., 1996). The exact molecular mechanism used by HuR to confer mRNA stability is still being resolved (von Roretz et al., 2011). An early study showed that overexpression of HuR could slow the decay of mRNAs without impacting their deadenylation rates (Peng et al., 1998). The prevailing model proposes that HuR can stabilize mRNAs bearing AU-rich sequences through competition for binding with factors such as TTP or a subset of miRNAs. Some of the keys to predicting whether an AU-rich sequence dictates degradation, stabilization or has no impact on an mRNA will likely lie in quantitative parameters such as stoichiometry of AU-rich elements and RBPs, and their binding affinities. Future studies may thus benefit from quantitative approaches in specific cell types.
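To illustrate why stoichiometry and affinities matter in the competition model, the sketch below computes the equilibrium occupancy of a single site contested by two mutually exclusive binders. It assumes simple one-site equilibrium binding with free concentrations and dissociation constants in the same units. The function name, the labels "stabilizer" and "destabilizer", and all numerical values are hypothetical and are not measurements for HuR, TTP, or any miRNA.

```python
def competitive_occupancy(conc_a, kd_a, conc_b, kd_b):
    """Equilibrium occupancy of one site by two mutually exclusive binders.

    Returns (fraction bound by A, fraction bound by B, fraction free).
    Concentrations are free concentrations, in the same units as the Kd values.
    """
    weight_a = conc_a / kd_a           # statistical weight of the A-bound state
    weight_b = conc_b / kd_b           # statistical weight of the B-bound state
    total = 1.0 + weight_a + weight_b  # the free state has weight 1
    return weight_a / total, weight_b / total, 1.0 / total


# Hypothetical numbers: a stabilizer at 100 nM (Kd 10 nM) competing with a
# destabilizer at 50 nM (Kd 50 nM) for the same AU-rich site.
stabilizer, destabilizer, free = competitive_occupancy(100.0, 10.0, 50.0, 50.0)
print(round(stabilizer, 3), round(destabilizer, 3), round(free, 3))
```

Under these assumed values the stabilizer occupies most of the sites, but changing either concentration or affinity shifts the balance, which is the qualitative point made in the paragraph above.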

#### microRNAs (miRNAs)

miRNAs are genome-encoded, ∼22-nucleotide (nt)-long RNA molecules which guide the associated proteins toward binding sites located in the 3′UTRs of mRNAs to repress their expression. miRNAs were first discovered in C. elegans where they regulate the heterochronic cascade of genes that pre-determines cell fate and developmental transitions (the lin-4 and let-7 miRNAs) (Lee et al., 1993; Wightman et al., 1993; Reinhart et al., 2000). A turning point for the fields of miRNAs and 3′UTRs was the identification of several let-7 homologs in other species including humans (Pasquinelli et al., 2000). This discovery coincided with important advances in sequencing technologies and sparked a concerted effort of miRNA sequencing and prediction, leading to the identification of thousands of new miRNAs (Lee and Ambros, 2001; Lau et al., 2001; Lagos-Quintana et al., 2001; Friedman et al., 2009). Currently, more than two thousand miRNAs have been identified in the human genome, and the miRBase database contains 48,885 mature miRNAs from a total of 271 species (Kozomara and Griffiths-Jones, 2014). Since their conservation across species has been shown, miRNAs have been implicated in a myriad of functional cascades across metazoans, including development, signaling, immune system, and metabolism (Ameres and Zamore, 2013). Conversely, their mis-expression or misregulation contributes to or plays instrumental roles in a variety of diseases ranging from heart disease to diabetes to cancer (Hesse and Arenz, 2014).

The base-pairing of miRNAs with 3′UTR sequences is quite distinct from what is to be expected from a 'free' single-stranded RNA of the same length. A miRNA's target recognition kinetics and specificity are largely dictated by its interactions with the Argonaute protein within which it is bound in the cell (for a review, see Duchaine and Fabian, 2018). The miRNA strand is stretched across Argonaute's croissant-shaped structure by interactions with its four domains. On its 5′ end, the miRNA interacts with the Mid and PIWI domains. Across a central cleft, the 3′ end of the miRNA is bound to the PAZ domain which closely interacts with the N-domain. Extensive interactions preorient the 5′-most bases of the miRNA (nts 2–8), a region called the seed, into a favorable conformation for pairing with target sequences. Target recognition through the seed is a two-step process wherein the rate-limiting step is the pairing of nts 2–5 and the dissociation rate is largely determined by the pairing of nts 6–8 (Wee et al., 2012; Schirle et al., 2014; Chandradoss et al., 2015; Salomon et al., 2016). Multiple genomic studies and individual miRNA-binding sites have indicated that alternative, non-canonical routes of target recognition may be prevalent. For example, some miRNAs further use the 3′ end of the miRNA in target recognition (Broughton et al., 2016; Brancati and Großhans, 2018). Such alternative modes of target recognition likely involve dynamic interactions with the N-PAZ pair of Argonaute domains.
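Canonical seed pairing as described above can be sketched as a search for the reverse complement of miRNA nts 2–8 in a 3′UTR. This is a minimal illustration that assumes strict Watson-Crick pairing over the full seed (no G:U wobble, no mismatches, no 3′ supplementary pairing). The function names and the target fragment are hypothetical, and the guide is the commonly cited let-7a sequence used here purely as an example.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def seed_site(mirna):
    """Return the target sequence (5'->3') complementary to miRNA nts 2-8."""
    seed = mirna.upper().replace("T", "U")[1:8]  # nts 2-8, 1-based numbering
    return "".join(COMPLEMENT[base] for base in reversed(seed))


def find_seed_matches(utr, mirna):
    """Return 0-based start positions of perfect seed matches in `utr`."""
    site = seed_site(mirna)
    rna = utr.upper().replace("T", "U")
    return [i for i in range(len(rna) - len(site) + 1)
            if rna[i:i + len(site)] == site]


let7a = "UGAGGUAGUAGGUUGUAUAGUU"  # guide strand, written 5'->3'
target = "AAACUACCUCAGGG"         # hypothetical 3'UTR fragment
print(seed_site(let7a), find_seed_matches(target, let7a))
```

A perfect seed match is neither necessary nor sufficient for regulation, as the non-canonical modes mentioned above make clear, so such a scan only nominates candidate sites.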

The importance of the interactions and molecular mechanics of the Argonaute scaffold in dictating miRNA targeting kinetics recently led the Zamore group to suggest that the miRNA/Argonaute (a minimal assembly referred to as RISC) behaves as a 'programmable RNA-binding protein' (Salomon et al., 2016). Incidentally, this analogy further extends to the effector activities that are mobilized by miRNAs, which largely overlap with effectors and mechanisms mobilized by RBPs. Metazoan Argonautes that are programmed by miRNAs also stably interact with the TNRC6 or GW182 family of proteins. This constitutes the core of a complex often referred to as miRNA-Induced Silencing Complex or miRISC (Jonas and Izaurralde, 2015). In essence, GW182 proteins bridge interactions between Argonaute proteins and effector complexes including mRNA deadenylation, decapping and decay machineries. Here again, the CCR4-NOT complex plays a pivotal role (Fabian et al., 2011; Braun et al., 2013; Jonas and Izaurralde, 2015). We will thus next examine in more detail the architecture, interactions and important functions of the CCR4-NOT complex in determining the fate of mRNAs.

#### THE CCR4-NOT COMPLEX: A HUB FOR 3′UTR EFFECTOR ACTIVITIES

The CCR4-NOT complex plays a central role in the fate of an important diversity of mRNAs. Other deadenylases such as the PAN2/3 complex exert a regulatory function, but on a more limited subset of mRNAs and on a population of longer poly(A) tails (Chen and Shyu, 2011). However, the CCR4-NOT complex seems to be responsible for most poly(A) tail controls in metazoan transcriptomes where it has been examined (Tucker et al., 2001; Temme et al., 2004; Yamashita et al., 2005; Schwede et al., 2008; Nousch et al., 2013). The CCR4-NOT complex integrates the effector functions in mechanisms initiated by a diversity of RNA-binding proteins and miRNAs (**Figure 2**). CCR4-NOT consists of two highly conserved modules: the CNOT1/2/3 proteins constitute a scaffolding module for all the subunits of the complex, while the catalytic module of the complex is formed by two deadenylases, EEP-type CCR4 and DEDD-type CAF1. Their functions partially overlap or compensate for each other in vivo, but CAF1 is believed to assume the bulk of the function in miRNA-directed deadenylation (Fabian et al., 2009). Beyond scaffolding the CCR4-NOT complex, the central CNOT1 subunit acts as a tether and directly interacts with GW182, TTP, Nanos, PUF, Smaug, and several other RNA-binding proteins in different cells and organisms (Wahle and Winkler, 2013).

Recruitment of the CCR4-NOT complex to mRNAs is associated with its deadenylation activities, but a different perspective on the function of this complex has recently emerged. The CCR4-NOT complex also recruits distinct activities such as decapping and exonucleases (**Figure 2B**) that are often coupled with deadenylation, but also with cap-binding and translation repression without mRNA deadenylation or decay (**Figures 2C,D**). Its interactions with intrinsically-disordered region (IDR)-encoding proteins that are components of the mRNP in the C. elegans embryo recently suggested a role in nucleating phase transition (Wu et al., 2017) (**Figure 2E**).

### mRNA Deadenylation and Decay

In addition to its role in translation initiation, PABP is a cofactor of deadenylases, including the CCR4-NOT complex (Fabian et al., 2009; Huntzinger et al., 2013). In vitro, PABP accelerates the deadenylation of long 3′UTRs for which the poly(A) tail is distant from the regulatory sequences (Flamand et al., 2016). The first step in deadenylation of an mRNA is thought to be the displacement of PABP proteins from the poly(A) tail by cofactors recruited through the GW182 protein and CCR4-NOT complex (Moretti et al., 2012; Zekri et al., 2013). Removal of the poly(A) tail is then catalyzed by the CAF1 and CCR4 deadenylase subunits (**Figure 2A**).

In metazoans, deadenylation is often tightly coupled with mRNA decapping and decay (**Figure 2B**). Earlier studies showed that following the shortening of a poly(A) tail below a certain threshold, an mRNA is subjected to first-order decay (Chen et al., 2008). mRNA deadenylation and decay are clearly coupled in the early zebrafish embryo, where mRNA deadenylation instigated by the miR-430 family of miRNAs marks the initial step in the decay of an important fraction of maternal mRNAs in the Maternal-to-Zygotic Transition (MZT) (Giraldez et al., 2005, 2006). This coupling is also evident in Drosophila S2 cultured cells, where fully deadenylated mRNAs do not accumulate, and impairing the decapping enzymes Dcp1/2 is necessary to detect the deadenylated species (Eulalio et al., 2009). The LSM1-7 proteins are thought to form a ring-like complex around the remnants of the shortened poly(A) tail and to promote mRNA decapping and decay (Tharun, 2009).
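The threshold behavior mentioned above (tail shortening followed by first-order decay) can be pictured with a deliberately simplified two-phase model. The sketch assumes a constant deadenylation rate and a sharp threshold, and all parameter values are hypothetical placeholders rather than measured rates. Real deadenylation is heterogeneous across molecules, so this is an illustration of the kinetic logic only.

```python
import math


def fraction_remaining(t, tail0=100.0, threshold=20.0,
                       deadenylation_rate=4.0, k_decay=0.1):
    """Fraction of an mRNA population remaining at time t (arbitrary units).

    Phase 1: the poly(A) tail shortens linearly from `tail0` to `threshold`
             (nt) at `deadenylation_rate` (nt per time unit); no mRNA is lost.
    Phase 2: once the threshold is reached, the body of the mRNA decays with
             first-order kinetics, exp(-k_decay * elapsed time).
    """
    lag = max(0.0, (tail0 - threshold) / deadenylation_rate)
    if t <= lag:
        return 1.0
    return math.exp(-k_decay * (t - lag))


for t in (0, 10, 20, 30, 40):
    print(t, round(fraction_remaining(t), 3))
```

With these assumed parameters the lag lasts 20 time units, after which the population halves every ln(2)/0.1 time units.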

A key protein, which physically couples the CCR4-NOT complex with decapping and decay, is the DEAD-box protein DDX6. DDX6 directly interacts with the CNOT1 subunit and multiple decapping/decay factors, either simultaneously or through mutually exclusive interactions (Tritschler et al., 2009; Sharif et al., 2013; Chen et al., 2014; Mathys et al., 2014; Rouya et al., 2014; Nishimura et al., 2015; Ozgur et al., 2015). Interestingly, DDX6 also interacts with eIF4E-transporter (4E-T). This interaction is thought to increase the local concentration of decapping factors such as DCP2 around the 5′-cap, thus enabling competition with eIF4E (Nishimura et al., 2015). The removal of the 5′-cap structure by DCP2 seals the fate of the mRNA toward degradation via the 5′→3′ decay pathway mediated by XRN1 (Arribas-Layton et al., 2013). The activity of DCP2 is greatly enhanced by DCP1 and additional factors such as enhancers of decapping (EDC-3, EDC-4), PAT1, and the LSM1-7 complex (Jonas and Izaurralde, 2013) (**Figure 2B**). Alternative routes of mRNA decay have also been proposed, which would proceed from the 3′ end and through the cytoplasmic exosome complex (Chen and Shyu, 2011).

#### Translational Repression

mRNA deadenylation abolishes the physical and functional synergy between the 5′-cap and poly(A) tail, resulting in translational repression (Mishima et al., 2006; Wakiyama et al., 2007). However, strong evidence indicates that the CCR4-NOT complex can also participate in direct translational repression, through mechanisms that do not involve its deadenylase activities (**Figures 2C,D**). Using luciferase reporters engineered to block deadenylation, an early study showed that tethering of Xenopus or human CAF1 is sufficient to repress mRNAs (Cooke et al., 2010). Several other reports, using different experimental designs and systems, have since confirmed the role of CCR4-NOT as a direct translational repressor (Braun et al., 2011; Chekulaeva et al., 2011; Flamand et al., 2016; Chapat et al., 2017). Models proposed to explain this activity have accumulated in recent years and were substantiated to different extents. Disruption of mRNA circularization by displacement of PABP through CCR4-NOT and its cofactors has been suggested as one mechanism (Zekri et al., 2013). Other mechanisms instead revolve around displacement of interactions with the 5′-cap of targeted mRNAs, and DDX6 is also central for these functions of CCR4-NOT.

DDX6 can recruit 4E-T, whose interaction with eIF4E can displace eIF4G and thus mediate translational repression (Kamenska et al., 2016). Repression can also occur through the strong interaction between 4E-T and 4EHP (Joshi et al., 2004; Cho et al., 2005). Recruitment of this dimer to CCR4-NOT through DDX6 was recently implicated in translation repression by miRNAs (Chapat et al., 2017). A subset of mRNAs is translationally regulated through this 4EHP-4E-T mechanism in mammalian cells, among which DUSP6 plays an important role in fine-tuning the ERK signaling cascade (Jafarnejad et al., 2018). This last study is unique in identifying a physiological purpose for one of the many CCR4-NOT 'pure' translational repression mechanisms. Indeed, the physiological importance has yet to be determined for most of those mechanisms, which were identified in cell culture and/or in vitro. It remains possible that distinct mechanisms will be predominant in different cellular contexts or on particular mRNA targets.

## COOPERATIVE AND COMPETITIVE INTERPLAY AMONG RBPs AND miRISC

RBPs and miRISC can interact among themselves and with each other to alter the fate of mRNAs through either cooperation or competition. Considering the importance of 3′UTR sequences and the diversity and density of potential binding sites for RBPs and miRNAs, it is hard to expect otherwise. The median length of human 3′UTRs is 1,200 nt (Jan et al., 2011). On average, each mRNA 3′UTR is bound by 14 RBPs (Plass et al., 2017), and ∼70% of vertebrate 3′UTRs encode multiple sites for different miRNA families (Friedman et al., 2009). Neither miRNA- nor RBP-binding sites are distributed randomly in 3′UTR sequences. Early on, genomic studies showed that miRNA-binding sites are more likely to be functional when they are located close to each other, or when located close to the ORF or the poly(A) tail (Grimson et al., 2007; Saetrom et al., 2007). Similarly, genomic analyses indicate that AU-rich sequences are associated with a greater functional output of nearby miRNA-binding sites, and computational analyses of the mammalian genomes indicate that recognition sites for PUF proteins and AU-rich sequences are enriched within 50 nt of binding sites for a subset of miRNAs (Jiang et al., 2013).

## miRNA–miRNA Cooperativity

Signs that miRNA-mediated silencing acts through a cooperative mechanism were already visible in the seminal discovery papers in C. elegans. The 3′UTR of lin-14 encodes 7 potential base-pairing sites (Lee et al., 1993), while the lin-41 3′UTR harbors two let-7 miRNA-binding sites, separated by intervening sequences of 27 nt in length (Reinhart et al., 2000). If each of these individual sites were independently functional, some degree of redundancy could be expected, with their individual impairment having limited to no consequence. Instead, both let-7 sites in the lin-41 3′UTR are important in vivo (Vella et al., 2004). Likewise, binding sites for lin-4 and let-7, and multiple sites for lsy-6 functionally interact on the lin-28 and cog-1 mRNAs, respectively (Moss et al., 1997; Reinhart et al., 2000; Didiano and Hobert, 2008). In vitro and in vivo studies later demonstrated that miR-35 and miR-58 miRNAs cooperate in the deadenylation and the silencing of the C. elegans egl-1/BIM mRNA (Wu et al., 2010; Sherrard et al., 2017). In addition to the aforementioned early genomic studies, which support miRNA cooperativity, mammalian reporter assays clearly confirmed that a combination of sites exerts a much more potent silencing output (Broderick et al., 2011). While some studies examined miRNA-binding site cooperativity on natural 3′UTR sequences or fragments thereof (Koscianska et al., 2015; Schouten et al., 2015), there are few detailed studies of miRNA-binding site interplay.

The mechanisms underlying miRNA cooperativity are still poorly resolved, but three models have been proposed and two have been substantiated experimentally. First, miRISC binding to nearby miRNA-binding sites can enhance their affinity for the 3′UTR (Broderick et al., 2011; Flamand et al., 2017). This type of cooperativity in target binding is in fact required for some non-seed miRNA-binding sites to be stably bound by miRISC and to be functional (Flamand et al., 2017). A second model involves the cooperative recruitment of effector machineries. In an embryonic cell-free system, a reporter mRNA bearing a single miRNA-binding site was not deadenylated, and could not recruit the CCR4-NOT complex, whereas a reporter encoding three adjacent miRNA-binding sites did so efficiently (Flamand et al., 2017). Whether this mode of cooperativity is especially important in the embryo and/or in C. elegans is not known at present. A third, not mutually exclusive, possibility could involve the cooperative activation of effector activities. CCR4-NOT recruitment by miRISC on 3′UTRs may not be sufficient on its own to trigger mRNA deadenylation and decay. A stoichiometric threshold, a specific configuration of target sites, post-translational modifications and/or conformation changes of miRISC may be required to trigger effector activation. These variations would be consistent with other protein/nucleic acid interaction paradigms, such as transcription factors.
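The first model (cooperativity in target binding) can be expressed with a generic two-site equilibrium scheme borrowed from the protein/nucleic acid paradigms mentioned above. The sketch assumes two identical sites and a single cooperativity factor `omega` that multiplies the weight of the doubly bound state; it is not a model fitted to miRISC data, and the function name and numbers are hypothetical.

```python
def two_site_states(conc, kd, omega=1.0):
    """Probabilities that 0, 1, or 2 of two identical sites are occupied.

    conc  : free concentration of the binder (same units as kd)
    kd    : dissociation constant of an isolated site
    omega : cooperativity factor (1 = independent sites, >1 = positive)
    """
    x = conc / kd
    weight_none = 1.0
    weight_one = 2.0 * x        # either of the two sites bound
    weight_both = omega * x * x
    total = weight_none + weight_one + weight_both
    return weight_none / total, weight_one / total, weight_both / total


# Same binder concentration (equal to Kd), with and without cooperativity.
print(two_site_states(1.0, 1.0, omega=1.0))
print(two_site_states(1.0, 1.0, omega=10.0))
```

With independent sites a quarter of the molecules are doubly bound at this concentration, whereas a ten-fold cooperativity factor raises that fraction to 10/13. If effector recruitment required double occupancy, such a shift would translate into a sharper silencing response.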

### RBP-miRISC 3′UTR Interactions

RBPs, miRNAs and the associated machineries can regulate their activities through cooperative or competitive interplay. It is likely that the mechanisms at work in cooperating miRNA-binding sites may also explain some of the RBP-miRNA cooperativity. Putative examples of direct interplay may include the cooperation of TTP with miR-16 in regulating TNF-alpha mRNA (Jing et al., 2005), and AU-rich sequences near the miR-16 binding site in the 3′UTR of COX-2 mRNA (Young et al., 2012). Positive interplay can also be indirect, through the modulation of global or local 3′UTR structures. Because they are non-coding, 3′UTRs can adopt complex folding structures, which can have positive or negative impacts on overlapping or nearby regulatory sequences. Structures can constitute determinants for the recognition of other RBPs, or limit binding to miRNA-binding sites. In turn, binding of miRISC or RBP to high-affinity sites can destabilize folding structures and facilitate access to nearby binding sites. This model explains the effect of Pumilio on the 3′UTR of the p27 tumor suppressor. Pumilio binding promotes a change in the local structure of the RNA that allows the binding of miR-221 and miR-222, leading to silencing of the p27 mRNA (Kedde et al., 2010). Similarly, a study showed that HuR could enhance the activity of let-7 on c-Myc mRNA. This is also likely through a change in the local structure of the RNA resulting in the unmasking of the let-7 binding site (Kim et al., 2009).

In the simplest form of antagonistic interaction, overlapping or nearby binding sites can lead to direct competition between RBPs and miRNAs/miRISC through steric hindrance. A survey by Keene and colleagues suggested that HuR prevents the function of abundant miRNAs on nearby and overlapping sites in a subset of mRNAs in HEK293 cells (Mukherjee et al., 2011). Similarly, Filipowicz and colleagues showed that HuR could displace miRISC bound to a target mRNA, thereby alleviating miRNA-mediated repression. This displacement occurs when HuR binds to AU-rich sequences 20–50 nt away from the miRNA-binding site (Kundu et al., 2012), again suggesting steric interference. The HuR example illustrates the fact that an RBP can have either positive or negative impacts on miRNA-binding site function, depending on 3′UTR structure and binding site positioning. It also highlights that interactions between 3′UTR structures, regulatory sequences and their trans-acting factors are precisely tuned through co-evolution.

#### COORDINATED AND SEQUENTIAL 3′UTR ACTIVITIES

Beyond simple positive or negative interplay, 3′UTR sequences can lead to the coordination of post-transcriptional mechanisms in both time and space. The mechanism underlying miRNA-mediated silencing is in itself a coordinated series of events wherein mRNA translation repression precedes deadenylation, which in turn precedes decapping and decay. Translation repression can be resolved from deadenylation and decay in vitro in a mammalian cell-free system (Mathonnet et al., 2007) and in vivo in cell culture (Djuranovic et al., 2012), and these steps can even occur at distinct, successive developmental stages during early zebrafish embryo development (Bazzini et al., 2012). The biological purpose of this series of events, however, remains to be fully elucidated. Some of these steps in the silencing mechanism may be expected to be at least partially redundant with regard to the impact on gene expression. However, one possibility is that translation inhibition enables faster repression, e.g., when a binary decision is promptly required. Another possibility is that this allows for reversible repression in the early steps, whereas decapping and decay may offer a more permanent decision.

### RNA Localization

The coordination of 3′UTR-driven activities is clearly illustrated through examples of active mRNA transport and localization. A majority of mRNAs are localized to subcellular regions and most examples where the underlying mechanisms have been detailed involve 3′UTR regulatory elements (Jansen, 2001; Lécuyer et al., 2007). mRNA localization can be achieved through several mechanisms (reviewed in Martin and Ephrussi, 2009). In active mRNA transport, the mRNA is assembled in a ribonucleoprotein (mRNP) complex through the specific binding of a combination of RBPs to the 3′UTR of an mRNA (**Figure 1E**). Bound RBPs recruit effector proteins that repress translation and mediate interactions with motor proteins. The repressed mRNP is then transported via the cytoskeleton until it reaches its destination where it is anchored. The mRNA is then derepressed at the appropriate time and place through a series of events involving displacement/competition by other RBPs, and/or post-translational modifications (reviewed in Besse and Ephrussi, 2008).

#### Oskar mRNA Localization

Localization of oskar mRNA in the Drosophila oocyte is the archetype, and remains one of the best-characterized examples of active mRNA transport (Ephrussi et al., 1991; Kim-Ha et al., 1991). oskar mRNA localization to the posterior pole of the oocyte occurs via microtubules through interactions with Staufen (Stau), tropomyosin and EJC components (Micklem et al., 2000; Zimyanin et al., 2008). Localized expression of oskar ensures proper patterning of the posterior body axis and germline fate (Kim-Ha et al., 1991). Mislocalization to the anterior pole leads to ectopic formation of abdomen and germ cells (Ephrussi and Lehmann, 1992), and absence of Oskar protein leads to loss of germ cells and aberrant abdominal segments (Lehmann and Nusslein-Volhard, 1986). Moreover, premature translation of localizing oskar mRNAs also results in patterning defects (Smith et al., 1992). Translational repression is achieved by Bruno RBP binding to multiple elements in the 3′UTR of oskar mRNA (Kim-Ha et al., 1995), which also recruits an eIF4E-binding protein, Cup (Wilhelm et al., 2003), to the mRNA. Similar to the 4E-T protein, Cup disrupts the interaction between eIF4E and eIF4G and prevents 43S pre-initiation complex binding to oskar mRNA (Nakamura et al., 2004). Bruno further represses oskar mRNA by promoting its oligomerization, a process which likely also contributes to rendering it inaccessible to the translation machinery (Chekulaeva et al., 2006). The Polypyrimidine Tract-Binding protein (PTB), which binds to multiple sites in the oskar mRNA 3′UTR, is also essential for mRNA oligomerization and the formation of densely packed mRNP particles (Besse et al., 2009).

The fates of oskar and nanos mRNAs are closely linked in the Drosophila oocyte. nanos mRNA is also localized to the posterior pole, and is rapidly deadenylated and degraded elsewhere in the early embryo through the recruitment of CCR4-NOT complex by Smaug (Smibert et al., 1996, 1999; Bashirullah et al., 1999; Dahanukar et al., 1999; Zaessinger et al., 2006). Translation of nanos mRNA at the posterior pole is thought to be activated by the Oskar and Vasa proteins, but the exact underlying mechanism remains unclear (Ephrussi and Lehmann, 1992; Smith et al., 1992). Oskar could inhibit the function of Smaug, either by affecting the binding of Smaug to nanos mRNA or by interfering with the recruitment of the CCR4-NOT complex. It is also clear that some of the keys to solving the underlying mechanism will stem from the properties of phase transition in the posterior pole germ plasm (see below).

### mRNA Routes in Mammalian Cells

An important variety of RNA localization events have been described in mammalian cells. Among them, the cascades dictated by the Zipcode and A2RE/RTS cis-acting elements provide well-delineated examples of how mammalian mRNAs can be sorted and locally translated in distinct cell types through information encoded in 3′UTRs. They also illustrate how localized cellular signaling can determine the precise site of translation of localized mRNAs.

#### Zipcode and the Zipcode-Binding Protein 1

β-actin mRNA localizes to the leading edge of fibroblasts (Lawrence and Singer, 1986), and analogous mechanisms are thought to be at work in developing neurites and hippocampal dendrites (Bassell et al., 1998; Zhang et al., 1999, 2001; Eom et al., 2003; Shav-Tal and Singer, 2005). Localization of β-actin mRNA is instigated by the zipcode binding protein 1 (ZBP1) (Ross et al., 1997), which specifically binds a 54-nt long 3′UTR segment termed the 'Zipcode' (Kislauskis et al., 1994). The motor for β-actin mRNA localization in fibroblasts was only recently identified (Song et al., 2015). KIF11, a tubulin-associated motor, associates with the β-actin mRNPs wherein it directly interacts with ZBP1. Disruption of this interaction in vivo leads to β-actin mRNA mis-localization and perturbs cell motility.

The exact nature of the mechanism responsible for the silencing of transported β-actin mRNAs remains unclear. Single-cell live imaging revealed that the association of ZBP1 with β-actin mRNA is anti-correlated with that of ribosomes (Wu et al., 2015). The authors thus proposed that the packaging of β-actin mRNA into mRNP granules may seclude mRNAs from ribosomes, and thus pre-empt translation. On one hand, the pervasive nature of mRNP granule formation in mRNA localization suggests that translation repression may be at least partly achieved through packaging of such mRNPs. On the other hand, the events leading to translational de-repression of localized mRNAs are rarely defined. For β-actin mRNA, de-repression appears to result from signaling cascades locally converging on trans-acting factors. Upon reaching the endpoint of mRNA transport, phosphorylation of ZBP1 on a tyrosine residue by the protein kinase Src, which is closely associated with the cell membrane, disrupts RNA binding and relieves β-actin mRNA from translational repression (Hüttelmaier et al., 2005).

#### The A2RE/RTS Pathway

The A2 response element (A2RE) or RNA trafficking signal (RTS) is an 11-nt cis-acting element recognized by the heterogeneous nuclear ribonucleoprotein A2 (hnRNP A2) and CArG-box binding factor A (CBF-A) proteins. The importance of this element was originally described in the transport of Myelin Basic Protein (MBP) mRNA in oligodendrocyte processes (Ainger et al., 1997; Carson et al., 1997; Hoek et al., 1998; Munro et al., 1999). A2RE/RTS-like sequences have since been identified in a growing number of localized transcripts including BC1, αCaMKII, NG, ARC, BDNF, Prm2 mRNAs, and HIV RNAs (Mouland et al., 2001; Muslimov et al., 2006; Gao et al., 2008; Raju et al., 2011; Fukuda et al., 2013). Though the mechanism of translation inhibition remains unclear for most of these mRNAs, assembly of MBP mRNA molecules into granules somehow maintains the transcripts in a repressed state. Just like for β-actin mRNA, phosphorylation of a trans-acting factor is key to enable the translation of MBP mRNA, which is released at sites of glia-neuronal contacts through phosphorylation of hnRNP A2 and hnRNP F by the Fyn kinase (White et al., 2008, 2012).

### Xenopus Oocyte mRNA Localization Pathways

The developing Xenopus laevis oocyte features mRNA localization examples that illustrate how elements in 3′UTRs direct transcripts toward distinct localization paths in successive stages of development. During the six stages of oogenesis, RNAs localize along the animal/vegetal (A/V) axis of the oocyte (Kloc et al., 2001) through the early and late pathways. In the early pathway, germ plasm RNAs such as DEADSouth, Xpat, Xcat2, and Xdazl are transported by associating with a membrane-less structure termed the mitochondrial cloud (MC) or Balbiani body. This body contains germinal granules, endoplasmic reticulum, and mitochondria, and is surrounded by bundles of intermediate filaments, which were suggested to play a role in maintaining its structure (Heasman et al., 1984; Forristall et al., 1995; Kloc and Etkin, 1995; Gard et al., 1997; King et al., 2005; Carotenuto and Tussellino, 2018). During stage II of oogenesis, the mitochondrial cloud expands between the nucleus and vegetal cortex. This expansion is thought to 'push' the germinal granules and RNAs toward the vegetal cortex where they are anchored on the cytoskeleton (Alarcon and Elinson, 2001; Wilk et al., 2005). Two distinct localization elements (LE) are encoded in the Xcat2 mRNA 3′UTR (Mosquera et al., 1993). A proximal 240 nt-long element is required for mitochondrial cloud localization (MCLE), whereas a distal ∼160 nt-long germinal granule localization element (GGLE) enables incorporation into germinal granules present inside the MC. Both localization signals are necessary for the proper localization of Xcat2 mRNA, which highlights the coordinated contributions of both 3′UTR elements (Kloc et al., 2000). Xcat2 mRNA is translationally repressed in the MC (MacArthur et al., 1999), and a few studies have implicated the RNA-binding protein Hermes in the repression of Xcat2 mRNP (King et al., 2005; Song et al., 2007; Nijjar and Woodland, 2013).

In the late localization pathway, mRNAs involved in somatic cell fates such as Vg1 and VegT are transported to the vegetal cortex through a microtubule-dependent mechanism (Kloc and Etkin, 1998). The 3′UTR of Vg1 mRNA encodes a 340-nt long LE, wherein clusters of short motifs are bound by the Vera and hnRNP I proteins (Deshler et al., 1997). The Vg1 LE is thought to be initially recognized by hnRNP I. This interaction remodels the Vg1 mRNP, which in turn allows Vera to bind Vg1 mRNA directly. Other factors are then recruited to the Vg1 mRNP including Staufen, Prrp and a kinesin motor to enact localization (Zhao et al., 2001; Yoon and Mowry, 2004; Lewis and Mowry, 2007; Lewis et al., 2008). Only after localizing to the vegetal cortex at the late stage IV of oogenesis is Vg1 mRNA translated (Dale et al., 1989; Tannahill and Melton, 1989). The spatiotemporal control of Vg1 mRNA translation is dictated by the 250-nt long translation-control element (TCE) encoded downstream of the Vg1 LE (Wilhelm et al., 2000; Otero et al., 2001). ElrB, a member of the ELAV family of RBPs, interacts with the TCE of Vg1 mRNA (Colegrove-Otero et al., 2005). This interaction correlates with the repression of the Vg1 mRNA, but how ElrB effects translational repression is not known.

### mRNPs: GOING THROUGH PHASES IN THE LIVES OF mRNAs

Mechanisms involving 3′UTR regulatory elements have long been associated with large mRNP granules. These granules can reach massive sizes by molecular standards (Brangwynne, 2013), often rivaling organelles. The list of large mRNP granules is rapidly expanding and includes P-bodies (originally named GW bodies), germ granules (also called polar granules and P granules, depending on species), stress granules, and the mRNA transport particles (Voronina et al., 2011), among others. Similarities and differences in the composition of large mRNPs have been documented (Eulalio et al., 2007a), mainly through comparison of associated markers by immunofluorescence. For example, stress granules are often distinguished from coexpressed P-bodies through exclusive colocalization of G3BP and DCP2, respectively (Ingelfinger et al., 2002; Tourriere et al., 2003; Kedersha and Anderson, 2007). In the early embryo, germ granules are distinguished from P-bodies through their association with germline markers such as PIE-1 in C. elegans (Strome, 2005). The absence of membranes in these organelle-sized particles and their scale led to their non-specific description as 'large aggregates' of RNA and proteins. A function in local mRNA concentration or storage for germ granules was naturally inferred from their scale and their concentration of maternal mRNAs in the oocyte (Noble et al., 2008; Voronina et al., 2011). Their importance in storage and protection of subsets of mRNAs from degradation was substantiated by well-defined examples, including the above-described nanos mRNA in Drosophila. The mRNA storage/protection model for mRNPs is also often associated with seclusion from the translational machinery. For example, in the developing oocytes of C. elegans, P granules help store translationally silent transcripts to prevent premature differentiation (Boag et al., 2008). Later in the embryo, P granules selectively repress somatic mRNAs in the P-lineage blastomeres, but not germline mRNAs, to maintain germline fate and totipotency (Gallo et al., 2010; Updike et al., 2014).

While a role in mRNA storage makes sense and appears to be well supported, the biochemical nature of large mRNPs has remained elusive since the identification of the electron-dense 'nuage' structures in the early days of germline and developmental biology (Wilsch-Bräuninger et al., 1997). A breakthrough was recently made in the mechanisms of assembly and disassembly of mRNP granules. Hyman and colleagues showed that P granules in fact form by phase separation. Granules have liquid-like properties that permit dynamic fusing and exchange of components, but segregate from their surroundings like oil from water (Brangwynne et al., 2009). Similar properties were also described for P-bodies and stress granules in vitro (Lin et al., 2015; Molliex et al., 2015). Intrinsically disordered proteins (IDPs) or proteins with at least a portion of disordered regions (IDRs) are a critical component of phase transition and mRNPs (Brangwynne et al., 2015). It is suspected that most, if not all, mRNP granules contain different IDPs/IDRs (Uversky, 2017), and the interactions and properties of these proteins can control mRNP contents. Another typical property of IDPs/IDRs is their propensity to scaffold multiple proteins through multivalent interaction networks (van der Lee et al., 2014). Alongside IDPs/IDRs, mRNAs and their interactions contribute to mRNP dynamics, either in promoting (Lin et al., 2015) or modulating granule assembly (Hubstenberger et al., 2015; Seydoux, 2018). Thus, the nature of the protein-protein and protein-RNA interactions that contribute to the assembly and stability of mRNP granules is distinct from what is observed in stable complexes in aqueous phases. Phase separation instead is governed by weak multivalent interactions that segregate interacting macromolecules away from water at a critical concentration (Li et al., 2012; Hyman et al., 2014; Banani et al., 2017). Traditional protein-protein interaction studies based on co-immunoprecipitation and in vitro interaction assays may not be suitable to detect many, if not most, of the interactions that occur in mRNPs. This, in turn, may be one of the reasons why proximity-based interaction mapping methods such as BioID were fruitful in mapping interactions in P-bodies and stress granules (Youn et al., 2018).

In light of the newly discovered properties of mRNPs, new and important questions have emerged. What are the folding and enzymatic differences that prevail in such phase-separated liquid droplets? How is the specific composition (if any) of an mRNP defined, and how are biochemical boundaries maintained or crossed between different types of mRNPs? Earlier work by the Seydoux group revealed that P granules and P-bodies closely interact, but do not merge in the C. elegans early embryo (Gallo et al., 2008). More recently, their work identified an important role for the IDP MEG-3 in modulating the structural stability of P granules. Different enrichments in PGL-1 and MEG-3 proteins significantly altered mRNP properties and could limit access to RNA (Smith et al., 2016). In the Drosophila oocyte, nanos mRNPs progress along the cytoskeleton from smaller localization particles to the larger germ granules at the posterior pole. The Gavis group used quantitative single-molecule imaging to analyze the localization dynamics and assembly of mRNP germ granules in the Drosophila oocyte. Interestingly, single mRNP complexes that contain individual nanos transcripts merge into multi-mRNA granules at the posterior pole. This localized 'growth' appears to be exponential, rather than additive, which could be interpreted as mRNPs merging through phase transition into the germ plasm. In contrast, the oskar mRNA localizes as multi-copy mRNPs which are segregated from other mRNP granules once they reach the posterior pole, and this exclusivity contributes to proper germline specification (Little et al., 2015). This suggests that single- or multi-mRNPs can be differentially transported and locally stored. It further strengthens and refines the links between mRNPs and the transport and localization of mRNA granules.

The possible implications of this mechanism reach far beyond C. elegans and Drosophila oocytes and embryos. For example, analogous mRNP granules are likely common in mammalian neurons. A study took advantage of the preferential precipitation of IDPs by the chemical biotinylated isoxazole (b-isox) to fractionate mRNPs from mouse brain tissue (Han et al., 2012; Kato et al., 2012). mRNAs that precipitated with b-isox had on average 5-fold longer 3′UTRs compared to mRNAs recovered in the soluble fraction. Moreover, precipitated mRNAs encoded roughly 10-fold more binding sites for Pumilio proteins. This further suggests that 3′UTRs and their ability to bind multiple RBPs play an important role in mRNP assembly.

Originally named GW bodies because they contained an important fraction of the miRISC component GW182, P-bodies (for processing bodies) were later renamed because they also colocalized with decapping and decay proteins (Eystathioy et al., 2002, 2003). Because of this association, P-bodies have long been suspected to be sites of mRNA degradation (Sheth and Parker, 2003). They were also proposed as the site for RNAi and several other mRNA decay activities (Unterholzner and Izaurralde, 2004; Jakymiw et al., 2005; Liu et al., 2005; Sheth and Parker, 2006). These functions, however, had been inferred and not directly demonstrated, and several studies challenged this role for P-bodies over the years (Chu and Rana, 2006; Eulalio et al., 2007b). Early on, a study by Izaurralde's group revealed that while miRNA-mediated silencing promoted P-body formation, detectable P-bodies were not required for miRNA function (Eulalio et al., 2007b). More recently, the Weil group developed a FACS-based method to purify endogenous P-bodies and sequenced their RNA contents. With this method, they could not detect any mRNA decay intermediates (Hubstenberger et al., 2017). Interestingly, they also found that mRNAs in P-bodies were translationally repressed. They thus proposed that mRNP formation may increase the local concentration of translational repressors and thus maintain mRNA targets in a translationally repressed state. Similarly, another group monitored the dynamics of XRN1 (which mediates the 5′→3′ activity in many mRNA decay pathways) using an elegant dual fluorescent reporter design. Surprisingly, they noted that mRNA decay occurred throughout the cytoplasm, but not in P-bodies. This led them to also suggest that P-bodies are sites for mRNA storage, and not decay (Horvathova et al., 2017). This model nonetheless remains at striking odds with the localized concentration of decapping and decay enzymes in P-bodies.

Part of the solution to this conundrum may come from examining the composition and properties of P-bodies in different cellular lineages. The Seydoux group showed that the biochemical composition of P-bodies matured during early embryonic development, as it gained important decapping cofactors (Gallo et al., 2008). This stands to reason considering the dependence of mRNPs on the composition and concentration of proteins and mRNAs that are present in a particular context. P-bodies may have very different properties and functions in lineages as distinct as a neuron, an oocyte, an early blastomere, or an epithelial cell.

The properties of the proteins that are recruited to a 3′UTR target of miRISC or an RBP may also influence mRNP structure and activities. A recent study in C. elegans embryos suggested that recruitment of the CCR4-NOT complex and the associated IDR proteins by miRISC could nucleate mRNP assembly on target mRNAs. Recruitment of cell-lineage specified IDR proteins (such as PGL-1 or MEG-1/2) or co-factors of decapping and decay may enable progression into larger mRNPs and toward context-dependent functions (Wu et al., 2017). In keeping with the importance of cellular context, a recent study by the Simard lab showed that miRISC has a distinct composition in the C. elegans germline. While germline miRNA target reporters were silenced, single-molecule FISH methods revealed that targeting led to juxtaposition to P granules (germ granules) and also stabilized the targeted mRNA (Dallaire et al., 2018).

Lastly, a recent intriguing study showed that interactions between GW182 and Argonaute could result in the formation of miRISC droplets. This phase-separated condensate could in turn lead to sequestration of miRNA targets, and acceleration of their deadenylation in vitro (Sheu-Gruttadauria and MacRae, 2018). It thus seems likely that resolving the functions of P-bodies will be inseparable from the cellular expression and the subcellular concentration of mRNAs, IDRs, regulatory factors and effector machineries. Advances in quantitative methods to locally trace translation, mRNA deadenylation and decay in situ and in individual cell lineages may be important to resolve the apparent conflict that exists on the function of P-bodies.

### CURRENT FRONTIERS IN 3′UTR RESEARCH

Great strides have been made in understanding the mechanisms underlying 3′UTR regulatory sequences and the factors that recognize and effect them. However, several long-standing problems remain unsolved and important new ones have recently emerged. The above-mentioned resolution of the functions of phase transition mRNPs provides an example of an old problem that was recently revisited with a new perspective. Other important problems came into focus with the emergence of next-generation sequencing, including alternative cleavage and polyadenylation (APA) (Edwalds-Gilbert et al., 1997), which generates significant diversity in 3′UTR isoforms. High-throughput sequencing identified multiple APA sites in at least 70% of known mammalian genes (Derti et al., 2012; Hoque et al., 2012). Most tissue-specific genes express single UTRs, but more than half of ubiquitously expressed genes are produced as multiple 3′UTR isoforms (Lianoglou et al., 2013). A different choice of polyadenylation sites in a 3′UTR has the potential to profoundly re-shape its structure and response elements, thus impacting mRNA stability, translation and localization. An interesting recent study even showed that APA of an mRNA can alter the localization and expression of the membrane protein it encodes (Berkovits and Mayr, 2015). Not only is there an important diversity of 3′UTR isoforms, but they are also dynamic across cellular states. On average, proliferating cells (including several tumor-derived cell lines) express shorter 3′UTRs in mRNAs that are more stable and translated into more protein compared to the longer 3′UTR mRNAs expressed in differentiated cells (Sandberg et al., 2008; Ji et al., 2009; Mayr and Bartel, 2009). This led to the idea that shorter 3′UTR isoforms allowed mRNAs to avoid regulation by miRNAs and RBPs. This is likely an oversimplification and is not always the case, however, as shorter 3′UTRs can also mean more potent deadenylation (Flamand et al., 2016), and longer 3′UTRs can also mean regulatory sequences being buried in a more complex structure (Thivierge et al., 2018). Furthermore, some tissues like the brain (Ji et al., 2009; Hilgers et al., 2011; Ulitsky et al., 2012; Miura et al., 2013) have on average much longer 3′UTRs, potentially multiplying the folding structures and/or regulatory input, and thus the complexity of functional interplay.
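As a toy illustration of how the choice of polyadenylation site can reshape the regulatory content of a 3′UTR, the following Python sketch counts how many copies of a motif are retained in each isoform. The sequence, the cleavage positions, and the function name `motif_sites_retained` are hypothetical and chosen only for illustration; UGUAAAUA is used here as one instance of a Pumilio-type element. A real analysis would rely on annotated APA sites and experimentally supported binding sites.

```python
def motif_sites_retained(utr, motif, cleavage_positions):
    """Count motif copies fully contained in each 3'UTR isoform.

    utr: full-length 3'UTR sequence (hypothetical toy sequence here).
    cleavage_positions: 0-based positions at which cleavage and
        polyadenylation are assumed to occur; the isoform is utr[:pos].
    Returns a dict mapping each cleavage position to a motif count.
    """
    retained = {}
    for pos in cleavage_positions:
        isoform = utr[:pos]
        count = 0
        start = isoform.find(motif)
        while start != -1:
            count += 1
            # Advance by one so overlapping copies would also be counted.
            start = isoform.find(motif, start + 1)
        retained[pos] = count
    return retained


# Hypothetical 3'UTR carrying two copies of the motif.
MOTIF = "UGUAAAUA"
UTR = "C" * 10 + MOTIF + "C" * 10 + MOTIF + "C" * 5

# A proximal site (position 20) keeps one copy; the distal site
# (full length) keeps both; a site upstream of the first copy keeps none.
print(motif_sites_retained(UTR, MOTIF, [15, 20, len(UTR)]))
```

In this simplified picture, usage of the proximal site removes one of the two elements. As noted above, fewer elements do not necessarily translate into weaker regulation, so the count should be read as a description of sequence content only.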

The folding structures of 3′UTRs remain largely underappreciated. This in itself is an important frontier, as structures can profoundly impact gene regulation (for reviews, see Jacobs et al., 2012; Kwok et al., 2015). Significant advances in chemical probes and next-generation sequencing now enable us to obtain genome-wide in vivo structures at single nucleotide resolution (Bevilacqua et al., 2016). Structures can be derived from in vivo transcripts, thus providing a perspective on the impact of developmental and cellular contexts, and the prevailing 3′UTR interactions (Spitale et al., 2015). Along those lines, a recent study analyzed changes in structures in zebrafish transcripts during MZT (Beaudoin et al., 2018), and revealed the interplay between ribosomes and the unwinding of mRNA secondary structures.

Improvements in throughput, library generation methods, and cost-effectiveness of next-generation sequencing now enable an integrated genomic perspective on multiple regulatory mechanisms. Massively parallel reporter assays (MPRA) have been used in the past to identify functional cis-regulatory elements in transcription and splicing (Melnikov et al., 2012; Rosenberg et al., 2015). Thousands of random sequences with unique tags are fused to reporters and introduced into cells, and their regulatory output is then quantified using high-throughput sequencing. A recent study used a similar technique to identify cis-regulatory elements in the 3′UTRs of maternal mRNAs in zebrafish that regulate mRNA decay (Rabani et al., 2017). The authors identified two stabilizing elements (polyU and UUAG sequences) and four destabilizing elements (GC-rich, AU-rich, Pumilio-binding sites, and miR-430-binding sequences).
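The quantification step of such an assay can be sketched as follows. This minimal Python example assumes that each tagged reporter yields normalized read counts at several time points and that decay is first-order; neither assumption is taken from the study cited above, and the reporter names, counts, and the helper `half_life` are hypothetical.

```python
import math


def half_life(times, counts):
    """Estimate a reporter half-life from normalized tag counts.

    Fits ln(count) against time by ordinary least squares, assuming
    first-order decay, and converts the slope into a half-life in the
    same units as `times`. Returns math.inf if no decay is detected.
    """
    logs = [math.log(c) for c in counts]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    num = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    den = sum((t - mean_t) ** 2 for t in times)
    slope = num / den
    if slope >= 0:
        return math.inf
    return math.log(2) / -slope


# Hypothetical normalized counts at 0, 2 and 4 h for two tagged reporters.
TIMES = [0.0, 2.0, 4.0]
REPORTERS = {
    "reporter_A": [1000.0, 500.0, 250.0],   # halves every 2 h
    "reporter_B": [1000.0, 900.0, 810.0],   # decays more slowly
}

for name, counts in REPORTERS.items():
    print(name, round(half_life(TIMES, counts), 2))
```

Comparing the fitted half-lives of reporters that do or do not carry a given sequence is one simple way to flag candidate stabilizing or destabilizing elements; published analyses use more elaborate models and normalization.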

Because so many mechanisms mobilize the deadenylase complex and its activities, sequencing libraries that allow the capture of poly(A) tail size, the end of the 3′UTR isoform, and the abundance of transcripts will provide insight on the impact of these key features on gene expression. Recent studies already identified distinct populations of poly(A) tail sizes in the transcriptome (Subtelny et al., 2014; Eichhorn et al., 2016; Lima et al., 2017).
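To make the idea of jointly capturing the 3′ end and the tail size concrete, the sketch below splits a read into a transcript body and a trailing run of adenosines. It is a deliberately naive illustration: the function names and the example read are hypothetical, and dedicated protocols handle sequencing errors and homopolymer artifacts that this toy version ignores.

```python
def poly_a_tail_length(read):
    """Return the number of consecutive A residues at the 3' end of a read."""
    read = read.upper()
    return len(read) - len(read.rstrip("A"))


def split_body_and_tail(read):
    """Split a read into (transcript body, poly(A) tail length).

    The end of the body marks the inferred 3'UTR end for this read.
    """
    tail = poly_a_tail_length(read)
    body = read[:len(read) - tail]
    return body, tail


# Hypothetical read ending in a short poly(A) tail.
body, tail = split_body_and_tail("GCUUAGAAAAA")
print(body, tail)
```

Tallying the bodies gives isoform abundance, while the distribution of tail lengths per body gives the tail-size profile for each 3′UTR isoform.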

## CONCLUSION

We have reviewed how regulatory elements in 3′UTRs are recognized by miRNAs and RBPs, and some of the better-known mechanisms leading to the decisions on the fate of mRNAs. While genomic approaches are successful in unveiling the complexity and breadth of some of these mechanisms, each 3′UTR is also unique and has co-evolved closely in its genetic and cellular niche with its regulatory factors. Deciphering the 3′UTR code will also require detailing this uniqueness for each 3′UTR. Embracing genetics once more, this time through genome editing in model organisms, offers powerful new possibilities in linking the structures and sequences of 3′UTRs with mRNA fates in their physiological context.

### AUTHOR CONTRIBUTIONS

VM and TD contributed equally in writing the manuscript.

#### FUNDING

This work was supported by Canadian Institutes of Health Research (CIHR) grants to TD (PJT-152900), a Fonds de Recherche du Québec-Santé (FRQS) Chercheur-Boursier Senior salary award to TD, and a Charlotte and Leo Karassik Foundation Ph.D. fellowship award to VM.

#### ACKNOWLEDGMENTS

We apologize for directly related work that we have not cited in this review. We would like to thank the members of the lab for their comments on the manuscript.

#### REFERENCES

decay. Wiley Interdiscip. Rev. RNA 2, 167–183. doi: 10.1002/wrna.40

oligomerization and formation of silencing particles. Cell 124, 521–533. doi: 10.1016/j.cell.2006.01.031

Chekulaeva, M., Mathys, H., Zipprich, J. T., Attig, J., Colic, M., Parker, R., et al. (2011). miRNA repression involves GW182-mediated recruitment of CCR4-NOT through conserved W-containing motifs. Nat. Struct. Mol. Biol. 18, 1218–1226. doi: 10.1038/nsmb.2166

Chen, C. Y., Ezzeddine, N., and Shyu, A. B. (2008). Messenger RNA half-life measurements in mammalian cells. Methods Enzymol. 448, 335–357. doi: 10.1016/S0076-6879(08)02617-7

Chen, Y., Boland, A., Kuzuoglu-Öztürk, D., Bawankar, P., Loh, B., Chang, C. T., et al. (2014). A DDX6-CNOT1 complex and W-binding pockets in CNOT9

protein RNAs by the A2 pathway. Mol. Biol. Cell 19, 2311–2327. doi: 10.1091/mbc.E07-09-0914

deadenylases are required for both translational repression and degradation of miRNA targets'. Nucleic Acids Res. 41, 978–994. doi: 10.1093/nar/gks1078

Pat1, Edc3 and RNA in mutually exclusive interactions. Nucleic Acids Res. 41, 8377–8390. doi: 10.1093/nar/gkt600

organization of mRNA-associated granules and bodies. Mol Cell 69, 517–532.e11. doi: 10.1016/j.molcel.2017.12.020



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Mayya and Duchaine. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The Secret Life of Translation Initiation in Prostate Cancer

Greco Hernández<sup>1</sup> \*, Jorge L. Ramírez<sup>1</sup> , Abraham Pedroza-Torres<sup>2</sup> , Luis A. Herrera<sup>3</sup> and Miguel A. Jiménez-Ríos<sup>4</sup>

<sup>1</sup> Translation and Cancer Laboratory, Unit of Biomedical Research on Cancer, National Institute of Cancer, Mexico City, Mexico, <sup>2</sup> Cátedra-CONACyT Program, Hereditary Cancer Clinic, National Institute of Cancer, Mexico City, Mexico, <sup>3</sup> Unidad de Investigación Biomédica en Cáncer, Instituto Nacional de Cancerología-Instituto de Investigaciones Biomédicas, The National Autonomous University of Mexico, Mexico City, Mexico, <sup>4</sup> Department of Oncologic Urology, National Institute of Cancer, Mexico City, Mexico

Prostate cancer (PCa) is the second most prevalent cancer in men worldwide. Despite advances in understanding the molecular processes driving the onset and progression of this disease, as well as the continued implementation of screening programs, PCa remains a significant cause of morbidity and mortality, in particular in low-income countries. It is only recently that defects of the translation process, i.e., the synthesis of proteins by the ribosome using a messenger (m)RNA as a template, have begun to gain attention as an important cause of cancer development in different human tissues, including prostate. In particular, the initiation step of translation has been established to play a key role in tumorigenesis. In this review, we discuss the state-of-the-art of three key aspects of protein synthesis in PCa, namely, misexpression of translation initiation factors, dysregulation of the major signaling cascades regulating translation, and the therapeutic strategies based on pharmacological compounds targeting translation as a novel alternative to those based on hormones controlling the androgen receptor pathway.

Keywords: prostate cancer, translation initiation, translational control, androgen receptor, eIF4E, eIF4G, mTOR, MAPK

### INTRODUCTION

Among different types of cancers, prostate cancer (PCa) is the third most commonly diagnosed tumor around the world, ranking second in incidence among men and as the fifth leading cause of cancer death in men. The most recent data (2018) have reported about 360,000 deaths and almost 1.3 million new cases due to this neoplasia worldwide (Dy et al., 2017; Bray et al., 2018; Ferlay et al., 2018; Pilleron et al., 2018). In low-income countries, the importance of this malady is even more dramatic. For instance, in the Americas, PCa is the most commonly diagnosed malignant neoplasia, with over 400,000 new cases, and the second cause of cancer death, with about 80,000 deaths among men in 2018 (Global Burden of Disease Cancer Collaboration et al., 2015; Dy et al., 2017; Bray et al., 2018; Pilleron et al., 2018).

The prostate is a gland lying underneath the bladder that secretes factors for sperm maintenance and viability throughout life. PCa is defined as the uncontrolled growth of cells from the gland epithelium that acquire the ability to scatter. Indeed, PCa is a highly heterogeneous disease, comprising mostly adenocarcinomas that display a wide spectrum of both clinical evolution patterns and phenotypic defects (Humphrey, 2014; Seitzer et al., 2014; Network, 2015; Packer and Maitland, 2016; Arora and Barbieri, 2018). Nowadays, the parameters most used for surveillance, diagnosis, and design of treatments are the blood level of prostate-specific antigen (PSA), the biopsy clinical stage, and the Gleason score of tumors (Humphrey, 2014; Seitzer et al., 2014; Jones et al., 2018; Martin et al., 2018; McCrea et al., 2018).

#### Edited by:

Chiara Gamberi, Concordia University, Canada

#### Reviewed by:

Hari K. Koul, Louisiana State University Health Sciences Center Shreveport, United States

Woan-Yuh Tarn, Academia Sinica, Taiwan

#### \*Correspondence:

Greco Hernández ghernandezr@incan.edu.mx; greco.hernandez@gmail.com

#### Specialty section:

This article was submitted to RNA, a section of the journal Frontiers in Genetics

Received: 12 August 2018 Accepted: 11 January 2019 Published: 30 January 2019

#### Citation:

Hernández G, Ramírez JL, Pedroza-Torres A, Herrera LA and Jiménez-Ríos MA (2019) The Secret Life of Translation Initiation in Prostate Cancer. Front. Genet. 10:14. doi: 10.3389/fgene.2019.00014

Prostate cancer pathology has started to be understood at the molecular level. Normal development and function of the prostate strongly depend on the action of both androgens and the androgen receptor (AR, a transcription factor). Most tumors exhibit AR gene amplification and/or somatic prostate mutations (Gottlieb et al., 2012). Thus, AR malfunctioning may be a main trigger for the onset and progression of PCa. Other genes are also found dysregulated in PCa. For example, loss of one copy of the tumor suppressor PTEN has been reported in nearly 60% of PCa patients (Phin et al., 2013), which appears to be a critical component in the evolution of PCa with metastatic potential (Baca et al., 2013). In metastatic prostate tumors, amplification of the oncogene c-myc (DeVita et al., 2011) and mutations in the cell cycle regulation genes Cyclin-dependent kinase inhibitor 1B (CDKN1B) and TP53 (Baca et al., 2013) have also been reported. Moreover, promoter hypermethylation of different genes such as PTEN, the retinoblastoma gene (RB), and the cadherin 1 gene (CDH1) has been linked to advanced stages of PCa (Friedlander et al., 2012).

Genomic rearrangements involving the 5′ untranslated region (UTR) of E26 transformation-specific (ETS) gene family members also occur in approximately 50% of PCa tumors (Rubin et al., 2011). A DNA rearrangement found in 40–50% of primary PCa tumors produces TMPRSS2-ERG, the fusion of the androgen-induced transmembrane protease serine 2 gene (TMPRSS2) with members of the erythroblast transformation-specific related gene (ERG) family of transcription factors (Tomlins et al., 2005), which results in androgen-dependent oncogenic expression of ERG (Nam et al., 2007; Perner et al., 2007; Tu et al., 2007; Albadine et al., 2009; Fine et al., 2010).

Advanced PCa tumors are regularly treated by hormone deprivation via different types of castration to block AR function. However, this eventually leads to treatment resistance and the tumor recurs as a castration-resistant prostate cancer (CRPC). Unfortunately, studies on the CRPC condition are scarce. Some AR splicing variants lacking regulatory regions, such as the ligand-binding domain, contribute to the development of CRPC (Guo et al., 2009; Hu et al., 2009). Comparisons between primary PCa and CRPC revealed significant differences in ERG expression, with primary tumors displaying higher expression levels (Roudier et al., 2016). This may indicate that ERG expression is important in primary PCa and may no longer be required in CRPC tumors, which might use a different mechanism to promote proliferation and cell survival (Roudier et al., 2016). Moreover, genome sequencing of CRPC tumors has shown that the most recurrent alterations are mutations in the TP53 and AR genes, the TMPRSS2:ERG fusion, loss of the RB and breast cancer (BRCA) genes, and gains in AR and MYC copy numbers (Grasso et al., 2012). In contrast, these genomic alterations were less frequent among clinically localized primary tumors, supporting the idea that hormonal deprivation may induce changes that alter AR function (Taplin et al., 1999).

Translation has recently begun to gain attention as a possible key molecular process in cancer development, because cancer cells display rapid growth and proliferation with significantly increased protein synthesis. Translation is largely controlled at the initiation step, and translation initiation has frequently been found to be involved in the development of different types of cancer, including PCa (Parsyan, 2014; Bhat et al., 2015; Sharma et al., 2016; Truitt and Ruggero, 2016; Ali et al., 2017; Robichaud et al., 2018). Thus, targeting translation initiation is being probed as part of the global schemas of some emerging cancer therapies. Here, we review the rapidly growing field that studies the contribution of translation initiation to PCa, as well as the signaling pathways regulating it. We also summarize the most relevant research on pharmacological compounds targeting translation initiation as a potential new means to alleviate this malady.

### AN OVERVIEW OF TRANSLATION INITIATION AND ITS REGULATORY SIGNALING CASCADES

Translation is a sophisticated and tightly controlled process that plays a central role in gene expression. It consists of three main stages, namely initiation, elongation, and termination, followed by a final stage in which the ribosome is recycled. Overall, the initiation step consists of the recruitment of the 40S ribosomal subunit to the 5′-UTR of an mRNA through the action of around a dozen initiation factors (eIFs) (Jackson et al., 2010; Hinnebusch, 2014; Hershey et al., 2018). This process is mostly regulated by two signaling cascades, the mTOR and the mitogen-activated protein kinase (MAPK) pathways.

#### Translation Initiation

Translation initiation begins when a free 40S ribosomal subunit interacts with eIF1, eIF1A, eIF3, eIF5, and the so-called ternary complex (consisting of eIF2 bound to GTP and an initiator Met-tRNA<sub>i</sub><sup>Met</sup>) to form a 43S pre-initiation complex (PIC). This step positions the initiator Met-tRNA<sub>i</sub><sup>Met</sup> in the peptidyl (P) decoding site of the ribosome. In a parallel set of reactions, the cap structure (m<sup>7</sup>GpppN, where N is any nucleotide) located at the 5′ end of the mRNA is recognized by eIF4E. Then, the scaffold protein eIF4G interacts simultaneously with the cap-bound eIF4E, the RNA helicase eIF4A, poly(A)-binding protein (PABP), and the ribosome-bound eIF3 to coordinate recruitment of the 43S PIC to the mRNA 5′-UTR. Because PABP binds the poly(A) tail at the mRNA 3′ end, this set of interactions circularizes the translating mRNA. Then, the 43S PIC scans the mRNA 5′-UTR base by base to reach the AUG start codon, a process in which eIF4A, assisted by eIF4B, unwinds 5′-UTR secondary structures. Selection of the correct AUG start codon is driven by eIF1 and eIF1A, which leads to the establishment of a perfect Watson–Crick match between the anticodon of the Met-tRNA<sub>i</sub><sup>Met</sup> and the mRNA start codon. Selection of the authentic start codon establishes the open reading frame for mRNA decoding, arrests mRNA scanning, and results in formation of a 48S PIC containing the Met-tRNA<sub>i</sub><sup>Met</sup> and eIF1A tightly positioned within the A-site. Afterward, GTP hydrolysis by GTP-bound eIF5B promotes the release of eIF5B from the 80S monoribosome, which facilitates 60S ribosomal subunit joining and the assembly of an 80S initiation complex that is ready to start elongation (Jackson et al., 2010; Hinnebusch, 2014; Hershey et al., 2018).
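The scanning step described above can be illustrated with a deliberately simplified sketch. It assumes a "first AUG wins" rule and ignores sequence context, secondary structure, leaky scanning, and upstream open reading frames. The function names and the example sequence are hypothetical and serve only to illustrate how start codon selection fixes the reading frame.

```python
from typing import List, Optional

STOP_CODONS = {"UAA", "UAG", "UGA"}


def scan_for_start_codon(mrna: str) -> Optional[int]:
    """Return the 0-based position of the first AUG, scanning from the 5' end.

    Toy model only: real start codon selection also depends on sequence
    context and on initiation factors such as eIF1 and eIF1A.
    """
    seq = mrna.upper().replace("T", "U")
    position = seq.find("AUG")
    return position if position != -1 else None


def open_reading_frame(mrna: str) -> Optional[List[str]]:
    """Return the codons from the first AUG up to (not including) the first
    in-frame stop codon, or None if no AUG is present."""
    seq = mrna.upper().replace("T", "U")
    start = scan_for_start_codon(seq)
    if start is None:
        return None
    codons = []
    # The start codon sets the frame; read triplets until a stop codon.
    for i in range(start, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon in STOP_CODONS:
            break
        codons.append(codon)
    return codons


# Hypothetical toy transcript: 6-nt 5'-UTR, AUG, one sense codon, stop codon.
toy_mrna = "GGCACCAUGGCUUAAGG"
print(scan_for_start_codon(toy_mrna))  # length of the scanned 5'-UTR
print(open_reading_frame(toy_mrna))
```

In this toy transcript the position returned by `scan_for_start_codon` equals the length of the 5′-UTR that the 43S PIC would have to traverse before reaching the start codon.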

#### mTOR Pathway

Two major signaling cascades control protein synthesis, namely, the phosphatidylinositol 3-kinase (PI3K)/protein kinase B (Akt)/mammalian target of rapamycin complex 1 (mTORC1) pathway, and the mitogen-activated protein kinase (MAPK) pathway (**Figure 1**). The serine/threonine kinase mTOR is the core of two structurally and functionally distinct multisubunit complexes, namely, mTORC1 and mTORC2. In addition to mTOR, mTORC1 is composed of the proteins mammalian lethal with SEC13 protein 8 (mLST8), DEP-domain-containing mTOR-interacting protein (DEPTOR), regulatory associated protein of mTOR (RAPTOR), and proline-rich Akt substrate 40 kDa (PRAS40). The mTORC1 signaling pathway senses, integrates, and responds to nutrient availability, stress, cellular energy status, hormones, and mitogens to control cellular growth, survival, and proliferation, as well as translation, transcription of ribosomal RNAs and transfer RNAs, ribosome biogenesis, lysosome biogenesis, lipid synthesis, and protein breakdown. mTORC2 regulates co-translational protein degradation and cytoskeletal organization (Fonseca et al., 2016; Proud, 2018; Roux and Topisirovic, 2018). Thus, only mTORC1 is of interest here, as the mTORC1 pathway integrates cellular signals to control translation through the phosphorylation of proteins with functions in the initiation and elongation steps.

mTORC1 phosphorylates factors that directly regulate the translational machinery, as well as protein kinases that phosphorylate translation factors, including the eIF4E-binding proteins (4E-BPs) and the S6 kinases (S6Ks). mTORC1 also promotes the indirect phosphorylation of initiation factors eIF4B, eIF4G, and elongation factor 2 kinase (eEF2K) (Fonseca et al., 2016; Proud, 2018; Roux and Topisirovic, 2018). Binding of 4E-BPs to eIF4E precludes its association with eIF4G and represses cap-dependent translation. Binding to eIF4E is controlled by the phosphorylation status of 4E-BPs: whereas hypophosphorylated 4E-BPs bind eIF4E with high affinity, the hyperphosphorylated species dissociate from eIF4E to relieve translational repression. The reverse reaction is favored by the protein phosphatase 1G (PP1G) that removes 4E-BP1 phosphate groups. S6Ks control translation by modulating the activity of targets such as ribosomal protein S6, eIF4B, and programmed cell death 4 protein (PDCD4), a negative eIF4A regulator (Fonseca et al., 2016; Proud, 2018; Roux and Topisirovic, 2018).

#### MAPK Pathway

The MAPK pathway also regulates translation (**Figure 1**). MAPKs are serine/threonine kinases that mediate intracellular signaling associated with a variety of cellular activities, including cell proliferation, differentiation, survival, death, and transformation. MAPK cascade components are activated by mitogens and stress stimuli, and are coupled to the translation machinery via the phosphorylation of downstream MAPK-activated protein kinases (collectively known as MKs). In response to diverse stimuli (Scheper et al., 2001; Wang et al., 2001), ERK or p38 MAPK phosphorylate the Mnk1/2 kinases, which in turn interact with the carboxy-terminus of eIF4G to directly phosphorylate eIF4E on Ser-209, resulting in stimulation of translation (Proud, 2018; Roux and Topisirovic, 2018). The RAS-ERK pathway crosstalks with the PI3K/AKT/mTORC1 pathway. When bound to GTP, RAS can directly bind and allosterically activate PI3K (Mendoza et al., 2011). AKT negatively regulates ERK activation by phosphorylating RAF in its amino-terminus (Mendoza et al., 2011). ERK in turn phosphorylates RAPTOR, which activates mTORC1 in an AKT-independent way (Herbert et al., 2002; Foster et al., 2010; Carriere et al., 2011).

TABLE 1 | Defects in eIFs and the signaling pathways regulating translation in prostate cancer.


In the following sections, we focus on how malfunction of eIFs and of the mTOR and MAPK pathways impacts PCa, and review the numerous molecular defects related to translation that have been reported in PCa (**Table 1**). We also discuss the prospects of targeting translation in PCa treatments using drugs that inhibit translation.

### TRANSLATION INITIATION FACTORS INVOLVED IN PCa

#### eIF2

eIF2 is composed of three subunits (α, β, and γ) that form the core of the ternary complex GTP/eIF2/Met-tRNA<sub>i</sub><sup>Met</sup>, which delivers the initiator methionyl-tRNA<sub>i</sub> to the ribosomal P-site during translation initiation. eIF2α regulates protein synthesis depending on its phosphorylation status. Phosphorylation of eIF2α increases its affinity for its guanine nucleotide exchange factor eIF2B, leading to the formation of inactive eIF2B–eIF2–GDP complexes that suppress cap-dependent translation. eIF2α can be phosphorylated by four stress-responsive kinases, namely, double-stranded RNA-activated protein kinase (PKR), general control nonderepressible 2 (GCN2) kinase, heme-regulated inhibitor (HRI), and PKR-like endoplasmic reticulum kinase (PERK), which become activated in response to viral infection, decreased nutrients, oxidizing agents, high salt levels, hypoxia, and heat shock, among others (Wek, 2018).

Nguyen et al. (2018) used murine and humanized models to demonstrate that PCa can respond adaptively via eIF2α phosphorylation to reset global protein synthesis and promote aggressive tumor development. Additionally, high expression of phosphorylated eIF2α along with loss of PTEN in 424 PCa patients was found to be associated with increased risk of metastasis (Nguyen et al., 2018). The critical role of eIF2α phosphorylation in regulating the global rate of translation renders eIF2α a promising target for PCa treatments.

#### eIF3

The multisubunit eIF3 is the largest of the initiation factors, with an approximate size of 804 kDa. This factor is a complex of 13 subunits, namely, eIF3a-m, that bridges the 43S PIC and the mRNA/eIF4F complex during translation initiation. The functions of the different eIF3 subunits are varied. While some fulfill essential tasks for the synthesis of proteins, others have regulatory activities (Hinnebusch, 2006). Of particular interest for PCa, eIF3h was frequently found overexpressed in tumors, and high levels of eIF3h positively correlated with increased Gleason scores (Nupponen et al., 1999; Saramaki et al., 2001). However, overexpression of eIF3 subunits is not a rule in PCa; in fact, the eIF3e subunit was found to be downregulated in this neoplasia (Marchetti et al., 2001).

#### eIF4F

The eIF4E cap-binding protein, together with the eIF4A RNA helicase and the eIF4G scaffold protein, forms the eIF4F complex that drives mRNA recruitment to the 40S ribosomal subunit to initiate mRNA translation. Although eIF4E is required for cap-dependent translation of all nuclear-transcribed mRNAs, some mRNAs with long and highly structured 5′-UTRs have shown a high requirement for eIF4E and the unwinding activity of the eIF4A helicase (Rajasekhar et al., 2003; Mamane et al., 2007; Feoktistova et al., 2013). Tightly related to cancer development, these so-called "eIF4E-sensitive" transcripts encode proteins that stimulate cell survival and proliferation, such as vascular endothelial growth factor-A (VEGF-A), hypoxia inducible factor 1 alpha (HIF1-α), BCL-2 family members, ornithine decarboxylase 1 (ODC1), cyclin D3, and c-MYC (Feoktistova et al., 2013; Truitt et al., 2015; Hinnebusch et al., 2016; Truitt and Ruggero, 2016; Vaklavas et al., 2017). Their regulation is critical for normal cell proliferation. Accordingly, using a haploinsufficient eIF4E mouse model (eIF4E+/−), Truitt et al. (2015) observed that a 50% reduction of eIF4E levels protected the animals from cellular transformation and tumorigenicity.

eIF4E, eIF4G, and eIF4B have been implicated in PCa development. Overexpression of eIF4E has been reported in advanced tumor stages and is associated with decreased rates of patient survival (Wang et al., 2005; Graff et al., 2009). eIF4E phosphorylation promotes tumor development in the prostate and has been found to be elevated in PCa. Moreover, eIF4E is highly phosphorylated in hormone-refractory PCa, which correlates with poor clinical outcome (Furic et al., 2010). Using knock-in mice expressing a non-phosphorylatable version of eIF4E, Furic et al. (2010) also demonstrated that eIF4E phosphorylation is required for translational upregulation of several mRNAs, and that increased phospho-eIF4E levels correlate with disease progression in patients with PCa.

Jaiswal et al. (2018) have observed that eIF4G1 protein levels are increased in PCa tumors as compared to normal tissues, and that gene expression of this protein positively correlates with the tumor grade and stage. Accordingly, eIF4G1 silencing impaired cell viability, proliferation, and migration and downregulated genes involved in the epithelial-mesenchymal transition, such as N-cadherin and Snail-1 (Jaiswal et al., 2018). Renner et al. (2007) have observed that eIF4G phosphorylation was increased in the prostate of transgenic mice expressing a constitutively active p110-alpha catalytic subunit of PI3K. Consistently, inhibition of PI3K activity with the drug LY294002 inhibited eIF4G phosphorylation. Thus, eIF4G phosphorylation has been proposed as a new marker for PI3K activity in PCa (Renner et al., 2007). Finally, both meta-analysis and immunoblot of tissue extracts showed that eIF4G is overexpressed in human PCa epithelial tissue (Wang et al., 2005).

eIF4B activity has also been shown to affect PCa development (Oh et al., 2007). Accordingly, the protein level of the serine/threonine kinase proviral integration site for Moloney murine leukemia virus 2 (Pim-2) was found to significantly correlate with eIF4B phosphorylation both in PCa samples and in cell lines (Ren et al., 2013). Pim-2 is a potent anti-apoptotic factor and its upregulation is associated with prostatic carcinoma tumorigenesis, suggesting that Pim-2 overexpression may cause direct eIF4B phosphorylation during PCa tumorigenesis (Ren et al., 2013).

### DYSREGULATION OF THE MAJOR SIGNALING CASCADES CONTROLLING THE TRANSLATION MACHINERY IN PCa

#### PI3K/Akt/mTOR Pathway

In response to different stimuli, PI3K phosphorylates phosphatidylinositol-4,5-bisphosphate (PIP2), yielding phosphatidylinositol-3,4,5-trisphosphate (PIP3). This reaction is balanced by PTEN, which catalyzes the reverse reaction. PIP3 acts as a second messenger, propagating intracellular signals and resulting in AKT activation. Upon activation, AKT phosphorylates several proteins, including mTORC1.

The PI3K/Akt/mTOR signaling pathway is hyperactivated in most human cancers, and inactivation of tumor suppressors such as PTEN, LKB1, and TSC1/2, which antagonize the PI3K/AKT/mTORC1 pathway, may drive tumorigenesis (Fonseca et al., 2016; Proud, 2018; Roux and Topisirovic, 2018). Activation of the PI3K pathway is associated with resistance to androgen deprivation therapy and with poor outcomes in PCa (Jiao et al., 2007; Reid et al., 2010; Bitting and Armstrong, 2013; Liu and Dong, 2014). Aberrations in PI3K/AKT/mTORC1 signaling have been identified in approximately 40% of early PCa cases and in 70–100% of advanced cases and metastatic tumors (Taylor et al., 2010; Carver et al., 2011). In particular, overactivation of this pathway via PTEN loss significantly favors initiation of PCa (Di Cristofano et al., 1998; Suzuki et al., 1998; Podsypanina et al., 1999), and leads to constitutive activation of the PI3K pathway in 60% of CRPCs (Vivanco and Sawyers, 2002). Pre-clinical data indicated that some PTEN-deficient neoplasms, including PCa, activate the PI3K pathway via the p110beta isoform of the PI3K catalytic subunit (Jia et al., 2008; Wee et al., 2008; Ni et al., 2012). Moreover, AKT mutations or gene amplification have also been observed in different PCa cases (Sadeghi and Gerber, 2012). Genetic studies in mouse models have implicated mTOR hyperactivation in triggering PCa in vivo (Guertin et al., 2009; Nardella et al., 2009). It has also been shown that 4E-BP1 may regulate tumor initiation and progression through mTOR signaling in PCa (Hsieh et al., 2015).

#### MAPK/ERK Pathway

MAPK signaling is divided into three subtypes, namely, extracellular signal-regulated protein kinase (ERK), p38 MAPK, and c-Jun N-terminal kinase/stress-activated protein kinase (JNK/SAPK), which play a key role in modulating intracellular responses, including translation. Whereas JNK/SAPK and p38 have been generally linked to cell death and tumor suppression, ERK plays a prominent role in cell survival and tumor promotion in response to a broad range of stimuli (Zhuang et al., 2005; Proud, 2018; Roux and Topisirovic, 2018).

In most cancer types, including PCa, the MAPK signaling cascades are found hyperactivated and also play a role in tumor growth, castration-resistant development, and metastasis (Wagner and Nebreda, 2009; Mulholland et al., 2012; Rodriguez-Berriguete et al., 2012; Proud, 2018; Roux and Topisirovic, 2018). Their inhibition prevents PCa cell growth (Gioeli et al., 1999; Kinkade et al., 2008), and in Pten-null;Ras activated PCa cells, the RAS/MAPK pathway was observed to play a significant role in metastasis (Mulholland et al., 2012).

### TARGETING TRANSLATION INITIATION IN PCa

A summary of the translation initiation factors as well as the components of the PI3K/AKT/mTOR pathway used as therapeutic targets in PCa is depicted in **Figure 1** (Kinkade et al., 2008; Edlind and Hsieh, 2014). PI3K is a common therapeutic target with existing drugs such as GDC-0941 and LY294002 which are reported to inhibit proliferation in human (Raynaud et al., 2009) and mouse transgenic (Renner et al., 2007) PCa cell lines. NVP-BKM120, a pan-class I PI3K inhibitor, showed antiproliferative activity in xenograft animal models and the PCa cell line PC3 (Maira et al., 2012). The use of the AKT inhibitor GSK690693 has also demonstrated antitumoral activity in PCa xenograft animal models (Rhodes et al., 2008).

mTOR is perhaps the most targeted molecule in PCa. mTOR inhibition by the drug INK128 was described to prevent PCa cell invasion and metastasis in vivo (Guertin et al., 2009; Nardella et al., 2009). In combination with the AR inhibitor bicalutamide (not depicted), Everolimus (also termed RAD001) inhibits mTORC1 and leads to growth arrest in some castration-resistant PCa models (D'Abronzo et al., 2017). The drug MLN0128 (also known as INK128) has been reported to inhibit both the mTORC1 and mTORC2 complexes in PCa cells, preventing metastasis and inducing apoptosis (Sarbassov et al., 2006; Hsieh et al., 2012; Bitting and Armstrong, 2013). However, a recent study suggested that the clinical efficacy of MLN0128 is limited (Graham et al., 2018). Fenofibrate is a drug widely used for its lipid-lowering activity, and some reports have described its inhibitory effect on the growth of different PCa cell lines, such as LN and PC-3. It induces apoptosis mediated by oxidative stress (LN cells) (Zhao et al., 2013), or by the caspase-3 and the apoptosis-inducing factor (AIF) signaling pathways (PC-3 cells) (Lian et al., 2017). In PC-3 cells, Fenofibrate inhibits the activation of the mTOR pathway independently of the PI3K/AKT, MAPK, and AMPK pathways, but the mechanism underlying this effect remains unclear (Lian et al., 2017). Carver et al. (2011) reported that the combined use of the dual PI3K/mTOR inhibitor BEZ235 (also known as Dactolisib) and enzalutamide induces cell death in a Pten-deficient PCa mouse model, which results in ∼80% tumor regression.

Stimulation of PCa cells with dihydrotestosterone has been described to induce eIF2α phosphorylation at Ser-51 in an AR-dependent way, thus shutting down global protein synthesis (Overcash et al., 2013). eIF4F assembly and activity have also been targeted by different drugs in prostate tumors or cells. In PC3 cultured cells, Cencic et al. (2009) showed that silvestrol impaired ribosome recruitment by affecting eIF4A activity and the composition of the eIF4F complex. Silvestrol exhibits strong anticancer effects, such as increased apoptosis, decreased proliferation, and inhibition of angiogenesis in PCa xenograft animal models. It mediates its effects by preferentially inhibiting translation of malignancy-related mRNAs (Cencic et al., 2009). Jaiswal et al. (2018) have shown that treatment of CRPC C4-2B cells with 4EGI-1, an inhibitor of eIF4G/eIF4E complex formation, impairs prostate tumor progression. They also showed that treatment with 4EGI-1 sensitized CRPC cells to enzalutamide and bicalutamide, two antiandrogen agents currently used to treat PCa (Jaiswal et al., 2018). Using a mouse model of PCa, Hsieh et al. (2015) found that diminishing 4E-BP1 expression decreased resistance to the PI3K pathway inhibitor BKM120 in CaP cells. Additionally, PCa patients treated with BKM120 displayed increased 4E-BP1 abundance, indicating that 4E-BP1 may be associated with PCa progression and drug resistance (Hsieh et al., 2015).

### OUTLOOK

Understanding the molecular processes underlying PCa will provide novel tools for both its timely detection and the development of improved therapeutic strategies. To date, the United States Food and Drug Administration (FDA) has approved more than 20 pharmacological compounds for PCa treatment, most of which are hormonal modulators targeting the AR pathway. However, in most patients, advanced PCa develops resistance to androgen-deprivation therapies (Khemlina et al., 2015; Nevedomskaya et al., 2018). Owing to the many studies on dysregulation of translation in PCa, new molecules that can be chemically targeted are being rapidly identified. Most drugs tested in PCa models so far act on the signaling cascades controlling translation. Interestingly, other molecules targeting eIFs have shown activity in a myriad of different cancers but have not yet been tested in PCa. These include 4Ei-I, antisense eIF4E oligos, Hippuristanol, Pateamine A, 4E1R-Cat, 4E2R-Cat, and Ribavirin, among others (Parsyan et al., 2012; Bhat et al., 2015; Siddiqui and Sonenberg, 2015; Malka-Mahieu et al., 2017). The next step should be to test these compounds in PCa.

Currently, although measurement of PSA in blood is the routine test for detection of possible PCa, the predictive value of PSA is under debate. In some studies, PSA has demonstrated a positive effect in the detection of potentially fatal cancer, but its use as a population screening tool can lead to poor diagnosis and treatment (Herget et al., 2016). In another study, with a follow-up of 10 years, Martin et al. (2018) found that a single PSA screening intervention detected more PCa cases but had no significant predictive power for PCa mortality. Thus, there is a need to find more reliable markers that can complement the PSA test.

Genomics and epigenomics studies have led to the discovery of novel putative PCa biomarkers (Goh et al., 2014; Ngollo et al., 2014; Peng et al., 2014). Among these, the most promising molecule is the PCa antigen 3 (PCA3), a long non-coding RNA found overexpressed in more than 90% of prostate tumors (Kok et al., 2002). Unlike the PSA marker, PCA3 has been detected neither in normal prostate tissues nor in prostatic hyperplasias, and PCA3 can be detected in urine samples from PCa patients with high certainty (Auprich et al., 2012). Another promising marker is the TMPRSS2-ERG fusion (Tomlins et al., 2005; Rubin et al., 2011), which is specific for PCa and can even be detected in precursor lesions such as prostate intraepithelial neoplasia (Mehra et al., 2008; Han et al., 2009). In the near future, this knowledge will be translated to pre-clinical and clinical phases, with multidisciplinary approaches for rigorous validation and future applications in PCa patients. As we have discussed here, the predictive value of the hyperphosphorylated factors eIF4G, eIF4B, and eIF2α in PCa should also be validated soon. The next generation of markers should aim to efficiently detect PCa-specific circulating DNAs or microRNAs in fluids such as saliva or urine. They should also aim to detect early stages of this malady.

### AUTHOR CONTRIBUTIONS

GH conceived, gathered information for the manuscript, and wrote most of the manuscript. JR, AP-T, LH, and MJ-R contributed to the writing as well as gathered information for the manuscript. AP-T and JR assembled **Table 1** and **Figure 1**.

### REFERENCES


### FUNDING

GH was supported by CONACyT (Mexico), Grant No. 273116, "Identification of new mutations in the Androgen Receptor gene specific of Mexican men and their clinic impact on prostate cancer," and by internal funding from the National Institute of Cancer (INCAN, Mexico). JR was supported by a Master in Sciences Fellowship of CONACyT (Mexico). AP-T was supported by a fellowship of the Cátedra-CONACyT Program (Mexico). LH and MJ-R were supported by internal funding from the National Institute of Cancer (INCAN, Mexico).

### ACKNOWLEDGMENTS

We thank the two reviewers and our Editor CG for their valuable criticism and comments, which significantly improved this manuscript. We also thank Sean Kristian Fritchley for proofreading the manuscript.







**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Hernández, Ramírez, Pedroza-Torres, Herrera and Jiménez-Ríos. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# RNA-Binding Proteins in the Control of LPS-Induced Macrophage Response

#### Dirk H. Ostareck and Antje Ostareck-Lederer\*

Department of Intensive Care Medicine, University Hospital RWTH Aachen, Aachen, Germany

The innate immune response is triggered by pathogen components, like lipopolysaccharides (LPS) of gram-negative bacteria. LPS initiates Toll-like receptor 4 (TLR4) signaling, which involves mitogen activated protein kinases (MAPK) and nuclear factor kappa B (NFκB) in different pathway branches and ultimately induces inflammatory cytokine and chemokine expression, macrophage migration and phagocytosis. Timely gene transcription and post-transcriptional control of gene expression ensure the adequate synthesis of signaling molecules. As trans-acting factors, RNA binding proteins (RBPs) contribute significantly to the surveillance of gene expression. RBPs are involved in the regulation of mRNA processing, localization, stability and translation. Thereby they enable rapid cellular responses to inflammatory mediators and facilitate a coordinated systemic immune response. Specific RBP binding to conserved sequence motifs in their target mRNAs is mediated by RNA binding domains, like zinc-finger domains, RNA recognition motifs (RRM), and hnRNP K homology domains (KH), often arranged in modular arrays. In this review, we focus on the RBPs Tristetraprolin (TTP), human antigen R (HUR), T-cell intracellular antigen 1 related protein (TIAR), and heterogeneous nuclear ribonucleoprotein K (hnRNP K) in LPS-induced macrophages as primary responding immune cells. We discuss recent experiments employing RNA immunoprecipitation and microarray analysis (RIP-Chip) and newly developed individual-nucleotide resolution crosslinking and immunoprecipitation (iCLIP), photoactivatable ribonucleoside-enhanced crosslinking (PAR-iCLIP) and RNA sequencing techniques (RNA-Seq). The global mRNA interaction profile analysis of TTP, HUR, TIAR, and hnRNP K yielded valuable information about the post-transcriptional control of inflammation-related gene expression, with a broad impact on intracellular signaling and temporal cytokine expression.

Keywords: RNA-binding proteins, post-transcriptional regulation, inflammation, bacterial lipopolysaccharides, macrophage activation

#### Edited by:

Maritza Jaramillo, National Institute of Scientific Research, University of Quebec, Canada

#### Reviewed by:

Martina Schroeder, Maynooth University, Ireland

Roberto Gherzi, University of California, San Diego, United States

#### \*Correspondence:

Antje Ostareck-Lederer, aostareck@ukaachen.de

#### Specialty section:

This article was submitted to RNA, a section of the journal Frontiers in Genetics

Received: 24 September 2018 Accepted: 17 January 2019 Published: 04 February 2019

#### Citation:

Ostareck DH and Ostareck-Lederer A (2019) RNA-Binding Proteins in the Control of LPS-Induced Macrophage Response. Front. Genet. 10:31. doi: 10.3389/fgene.2019.00031

### INTRODUCTION

The immune responses against bacteria, viruses and parasites require tight regulation, because uncontrolled, excessive or persisting immune reactions provoke inflammatory diseases (Zanotti et al., 2002). As a central component of the innate immune response, macrophages sense pathogen components such as lipopolysaccharides (LPS), an essential constituent of the outer membrane of gram-negative bacteria. Recognition of LPS by TLR4 on the macrophage surface results in the activation of MAPK and NFκB dependent signaling pathways, which activate inflammation related genes encoding pro- and anti-inflammatory cytokines and chemokines (Medzhitov and Horng, 2009; Takeuchi and Akira, 2010; Smale, 2012; Vaure and Liu, 2014). The underlying genome-wide changes in macrophage gene expression (Reynier et al., 2012; Rutledge et al., 2012) require downstream posttranscriptional checkpoints, which are critical for the appropriate modulation of immune reactions (Carpenter et al., 2014; Kafasla et al., 2014). Emerging experimental evidence highlights the impact of RNA binding proteins (RBPs) on the posttranscriptional control of the immune response (Fu and Blackshear, 2017; Garcia-Maurino et al., 2017; Diaz-Munoz and Turner, 2018; Mino and Takeuchi, 2018; Turner and Diaz-Munoz, 2018).

By analyzing RNA-protein interaction profiling and RNA sequencing experiments with TTP, HUR, TIAR, and hnRNP K we provide an overview on their target mRNAs, which are regulated at the level of mRNA stability and translation in LPS activated macrophages.

### ZINC-FINGER PROTEIN TTP CONTROLS TARGET MRNA DECAY IN INFLAMMATION

Tristetraprolin, encoded by the gene Zfp36, has been characterized as a critical mRNA-destabilizing protein in immune cells (Blackshear, 2002; Brooks and Blackshear, 2013). To initiate target mRNA decay, TTP mediates the recruitment of deadenylation and decapping complexes to the mRNA 3′ untranslated region (3′UTR) and 5′UTR, respectively (Fenger-Gron et al., 2005; Fabian et al., 2013). Tandem CCCH-type zinc-finger domains of TTP interact with AU-rich elements (ARE) that are mainly located in mRNA 3′UTRs (Lai et al., 1999; Worthington et al., 2002). In macrophages, target mRNAs primarily encode proteins related to the inflammatory response, among them cytokines and chemokines (Carballo et al., 1998; Lai et al., 1999; Stoecklin et al., 2008; Kratochvill et al., 2011; Sedlyarov et al., 2016; Tiedje et al., 2016). Under steady state conditions TTP is ubiquitously expressed at a basal level. Inflammatory stimuli like LPS and cytokines mediate transcriptional and post-transcriptional induction of TTP expression (Mahtani et al., 2001; Schaljo et al., 2009; Sedlyarov et al., 2016). AREs in the TTP mRNA 3′UTR represent bona fide functional TTP binding sites. An auto-inhibitory feedback regulation established through the interaction of TTP with these AREs secures a decrease in TTP expression when inflammatory stimuli decline. Thereby TTP contributes to regulatory circuits, which prevent the development of chronic inflammation (Tchen et al., 2004; Schott et al., 2014). TTP deficiency in mice causes a systemic inflammatory syndrome, which is in part attributable to the absence of TTP-controlled tumor necrosis factor (TNFα) mRNA destabilization (Taylor et al., 1996; Carballo et al., 1997). In macrophages, LPS-triggered TLR4 signaling leads to the stabilization of TTP target mRNAs and their enhanced translation (Tiedje et al., 2012, 2016). TTP serine phosphorylation catalyzed by the TLR4 downstream kinase MK2 induces its sequestration by 14-3-3 proteins and target mRNA release (Chrestensen et al., 2004; Stoecklin et al., 2004). Hence, CCR4-Not1 deadenylation complex recruitment is abrogated (Marchese et al., 2010; Clement et al., 2011; Sandler et al., 2011), target mRNAs are stabilized (Brook et al., 2006; Hitti et al., 2006) and translation-promoting factors replace TTP (Tiedje et al., 2012; **Figure 1A**).

Four studies examined the impact of TTP on inflammation-related pathways: (I) Stoecklin et al. (2008) identified TTP bound mRNAs in untreated and LPS induced murine RAW264.7 macrophages by TTP co-immunoprecipitation and RIP-Chip analysis (**Table 1**). (II) To investigate TTP-driven mRNA decay, Kratochvill et al. (2011) treated bone marrow derived macrophages (BMDM) with actinomycin D for different times prior to RIP-Chip (**Table 1**). (III) Employing iCLIP, Tiedje et al. (2016) identified TTP bound mRNAs in LPS treated BMDM from mice expressing GFP-TTP or the non-MK2 substrate variant (**Table 1**). The impact of TTP phosphorylation on global mRNA stability and mRNA translation was examined by integrating iCLIP, RNASeq and ribosome profiling (RiboSeq) (Tiedje et al., 2016). (IV) To map mRNA binding sites of endogenous TTP precisely and to unveil its role in inflammation resolution, Sedlyarov et al. (2016) applied PAR-iCLIP and RNASeq in BMDM of TTP(wt) and TTP(−/−) mice exposed to LPS for different times (**Table 1**).

Detailed inspection of sequence motifs in TTP bound mRNAs, which were detected in the different studies (**Table 1**), revealed AU-rich TTP binding sites represented by AUUUA pentamers and UUAUUUAUU nonamers (Stoecklin et al., 2008). In target mRNA 3′UTRs, UAUUUAU heptamers (Sedlyarov et al., 2016) are highly enriched as well. mRNAs encoding checkpoint regulators of the LPS induced inflammation response, e.g., TNFα, IL-10, IL-15, CXCL2, and CCL2, were identified with all applied experimental and data validation strategies. Interestingly, Kratochvill et al. (2011) reported that 25% of LPS induced transcripts were unstable. Among those displaying a TTP dependent decay were TNFα, IL-6, IL-10, TTP, CXCL1, CXCL2, CSF2, and IER3 encoding transcripts (Kratochvill et al., 2011), which were also identified in the study of Tiedje et al. (2016). These data support the hypothesis that TTP functions in the elimination of inflammation related mRNAs, the maintenance of a balanced LPS response and the resolution of inflammation. Related pathways corresponding to enriched mRNAs include TNFα-, NFκB-, Wnt- and chemokine signaling, the formation of focal adhesions, apoptosis and mRNA processing. They were also covered by mRNAs detected in RiboSeq experiments (Tiedje et al., 2016). The top 25 mRNAs, which were differentially bound by GFP-TTP and GFP-TTP(S52,178A) upon LPS stimulation, include not only TNFα and NFκB-related signaling molecules like TNF, CXCL2, CXCL3, but also IER3 and DUSP1, which encode feedback inhibitors of the inflammatory response (Tiedje et al., 2016). These findings emphasize the importance of the MK2 dependent TTP release from target mRNAs to safeguard accurate feedback regulation of the inflammatory response. Remarkably, in the study of Sedlyarov et al. (2016) 343 TTP target mRNAs were identified through intron sequences bound by TTP. Of this group only 1% exhibited TTP dependent destabilization, suggesting that TTP binding to intron sequences does not affect intron processing. To characterize the impact of TTP on the early and late LPS response, Sedlyarov et al. (2016) applied LPS treatment for 3 and 6 h. In the early phase only a few drivers of inflammation, such as TNFα, which activates central transcription inducers, e.g., NFκB, are strongly controlled by TTP. In early and late response, the GO term coverage of target mRNAs mostly overlapped. However, the GO terms taxis and chemotaxis, which characterize the perpetuation of inflammation, were only represented at the late response time point (Sedlyarov et al., 2016). From their analysis the authors conclude that TTP supports a switch to inflammation resolution by destabilizing mRNAs that encode migration-associated proteins, thereby impeding chronic inflammation.

from the binding site following c-Src catalyzed tyrosine phosphorylation that is initiated in response to LPS dependent macrophage activation (Liepelt et al., 2014). Thereby a rapid LPS response facilitated by straight signaling molecule synthesis can be established.

Information related to experimental tools and conditions as provided in the original papers.
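The AU-rich motifs named in this section (AUUUA pentamer, UAUUUAU heptamer, UUAUUUAUU nonamer) can be located in a sequence with a simple scan. The following Python sketch is illustrative only and is not the pipeline used in any of the cited studies; the function name `find_motifs`, the motif labels and the toy sequence are assumptions introduced here.

```python
import re

# Motifs quoted in the text (RNA alphabet); the dictionary labels are ours.
ARE_MOTIFS = {
    "pentamer": "AUUUA",
    "heptamer": "UAUUUAU",
    "nonamer": "UUAUUUAUU",
}


def find_motifs(utr, motifs=ARE_MOTIFS):
    """Return {label: [0-based start positions]}, including overlapping hits."""
    # Accept DNA or RNA input by converting T to U.
    seq = utr.upper().replace("T", "U")
    hits = {}
    for label, motif in motifs.items():
        # A zero-width lookahead reports overlapping occurrences as well.
        pattern = re.compile("(?=" + re.escape(motif) + ")")
        hits[label] = [m.start() for m in pattern.finditer(seq)]
    return hits


# Hypothetical toy sequence, not a real 3'UTR.
toy_utr = "GGCUUAUUUAUUCCAUUUAGG"
print(find_motifs(toy_utr))
```

Because the pentamer is contained in the heptamer and the heptamer in the nonamer, a single nonamer site is also reported under the shorter motifs; whether such nested hits should be counted once or several times is a choice that depends on the analysis.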

### HUR, A VERSATILE RRM DOMAIN PROTEIN, MODULATES MRNA STABILITY

The ubiquitously expressed protein HUR, which is encoded by Elavl1, consists of two consecutive N-terminal RNA recognition motifs (RRM), a central, less conserved basic hinge region and a third C-terminal RRM (Ma et al., 1996). Whereas RRM1 and RRM2 function in RNA binding, RRM3 contributes to RNA-protein complex stabilization and protein-protein interactions, including HUR dimerization (Pabis et al., 2018). The basic hinge region includes a shuttling domain (Fan and Steitz, 1998), which, in response to stress and mitogen signaling, facilitates nuclear-cytoplasmic shuttling of the predominantly nuclear HUR (Keene, 1999). Cytoplasmic HUR accumulation, which is induced by p38 and MK2 dependent T<sup>118</sup> phosphorylation in response to γ-irradiation and oxidative stress, augments its binding to p21, urokinase and urokinase receptor mRNAs and their stabilization (Tran et al., 2003; Lafarga et al., 2009). HUR binds AREs, mostly located in the target mRNA 3′UTR (Fan and Steitz, 1998; Lopez de Silanes et al., 2004), but also in intron sequences. HUR binding can contribute to alternative pre-mRNA splicing for specific genes like ZNF207, GANAB, DST and PTBP2 (Lebedeva et al., 2011) and the differential stabilization of 3′UTR ARE-containing c-Fos and c-Jun mRNAs (Peng et al., 1998). Systematic mapping and functional evaluation of HUR-RNA interactions by PAR-CLIP and RIP-Chip experiments employing HEK293 cells confirmed that HUR mediates the modulation of nuclear pre-mRNA processing and stabilizes cytoplasmic mRNAs, which bear both intronic and 3′UTR binding sites (Mukherjee et al., 2011). Interestingly, as shown for miR-7 that is encoded in the last HNRNPK intron, HUR binding to specific intronic miRNA precursors is implicated in their processing. HUR depletion from HeLa cells results in upregulation of miR-7, whereas hnRNP K expression remains unaffected, suggesting that HUR controls miR-7 precursor processing (Lebedeva et al., 2011).
Implementing a refined digestion optimized RIP-seq protocol (DO-RIP-seq), Nicholson et al. (2017a,b) were able to quantify HUR binding sites transcriptome-wide. Since HUR target mRNAs encode proteins implicated in cell cycle control, cell death and differentiation, post-translational HUR modifications and dysregulated functions are associated with a broad range of pathologic conditions (Srikantan and Gorospe, 2012; Grammatikakis et al., 2017). Disease-linked HUR phosphorylation, methylation and proteolytic cleavage not only regulate the subcellular localization of HUR, but also affect its RNA binding (reviewed in Grammatikakis et al., 2017). Notably, in HeLa cells, fibroblasts and carcinoma tissues HUR controls the stability and interactions of lncRNAs, such as HOTAIR (Yoon et al., 2013), LincRNA-p21 (Yoon et al., 2012), and 7SL (Abdelmohsen et al., 2014) and adjusts their function in gene expression control. Furthermore, HUR and specific miRNAs cooperate or compete in mRNA regulation (Srikantan et al., 2012). Modulation of miRNA binding by HUR has been reported in human MCF-7 epithelial and Huh7 liver cells (Poria et al., 2016), as well as in murine BMDM (Lu et al., 2014). Remarkably, in BMDM LPS induced MK2 catalyzed TTP phosphorylation causes a shift of the competitive binding equilibrium between HUR and TTP toward HUR, which stabilizes TNFα mRNA and stimulates its translation (Tiedje et al., 2012; **Figure 1A**). This finding corroborates a functional relevance of a regulated crosstalk between HUR and TTP in the LPS induced macrophage immune response. In their comprehensive PAR-iCLIP and RNASeq analysis of the BMDM response to LPS, Sedlyarov et al. (2016) mapped HUR and TTP mRNA binding sites comparatively. The study revealed that a UUUUUUUUU nonamer is the most overrepresented HUR binding motif. With 78%, the majority of HUR binding sites was located in 3′UTRs, which exceeds two times the number of TTP 3′UTR sites, whereas in intron sequences only 17% of the HUR sites were identified. Binding sites for both TTP and HUR were determined in 59 target mRNAs.
552 and 120 binding sites for HUR and TTP, respectively, were not overlapping and 118 sites did overlap by at least 1 nt (Sedlyarov et al., 2016). This overlap applied to 40 targets, including TNFα and CXCL2 mRNA, for which simultaneous TTP and HUR binding were confirmed experimentally. Stability and expression of mRNAs bearing solely TTP binding sites did not significantly differ from mRNAs with overlapping motifs, suggesting no co-regulation of mRNA stability (Sedlyarov et al., 2016) in macrophage inflammatory response, but possibly at the level of mRNA translation as shown for TNFα mRNA (Tiedje et al., 2012).
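The criterion "overlap by at least 1 nt" used to separate shared from non-overlapping HUR and TTP sites can be made explicit with interval arithmetic. The sketch below is a minimal illustration under stated assumptions: sites are half-open `(start, end)` intervals on the same transcript, and the function names and coordinates are hypothetical rather than taken from Sedlyarov et al. (2016).

```python
def overlaps(a, b, min_nt=1):
    """True if half-open intervals a=(start, end) and b share >= min_nt positions."""
    shared = min(a[1], b[1]) - max(a[0], b[0])
    return shared >= min_nt


def classify_sites(hur_sites, ttp_sites, min_nt=1):
    """Split sites on one transcript into overlapping pairs and protein-specific sites."""
    pairs = [(h, t) for h in hur_sites for t in ttp_sites if overlaps(h, t, min_nt)]
    hur_only = [h for h in hur_sites
                if not any(overlaps(h, t, min_nt) for t in ttp_sites)]
    ttp_only = [t for t in ttp_sites
                if not any(overlaps(h, t, min_nt) for h in hur_sites)]
    return pairs, hur_only, ttp_only


# Hypothetical coordinates for illustration only.
hur = [(100, 109), (200, 209)]
ttp = [(108, 115), (300, 307)]
pairs, hur_only, ttp_only = classify_sites(hur, ttp)
print(pairs)      # sites sharing at least one nucleotide
print(hur_only)   # HUR sites without any TTP overlap
print(ttp_only)   # TTP sites without any HUR overlap
```

With half-open intervals, two sites that merely touch, such as `(100, 109)` and `(109, 115)`, share no position and are therefore counted as non-overlapping.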

### TIAR, AN RRM DOMAIN PROTEIN, CONTRIBUTES TO MRNA TRANSLATION CONTROL

The two closely related DNA/RNA-binding proteins, T-cell intracellular antigen 1 (TIA-1) (Anderson et al., 1990) and TIA-1 related protein (TIAR), contain three N-terminal RRMs, which mediate oligonucleotide binding and a C-terminal Q-rich prion-related domain that enables participation in stress granule formation (Waris et al., 2014). TIAR RRM1 preferentially interacts with T-rich ssDNA and functions in transcription activation (Suswam et al., 2005). RRM2 displays affinity for U- and RRM3 for C-rich motifs (Dember et al., 1996; Cruz-Gallardo et al., 2014), whereas the RRM23-tandem domain binds mainly UC-rich sequences (Waris et al., 2017).

RRM2 and RRM3 contribute to nuclear accumulation of TIA proteins and nuclear export, respectively (Zhang et al., 2005). Both TIA proteins interact with U-rich stretches near mRNA 5′-splice sites (Del Gatto-Konczak et al., 2000) and modulate alternative splicing of mRNAs encoding FAS in murine fibroblasts (Forch et al., 2000), NF1 in rat neuronal cells (Zhu et al., 2008), human chondrocyte COL2A1 (McAlinden et al., 2007), liver CFTR (Zuccato et al., 2004), and CGRP in HeLa cells (Zhu et al., 2003). Furthermore, TIA proteins control TIAR and TIA-1 isoform expression in a tissue- and cell type-specific manner (Izquierdo and Valcarcel, 2007).

In the cytoplasm, TIA proteins interact with 3′UTR AREs of mRNAs encoding inflammation related proteins. TIAR was shown to bind to the TNFα mRNA 3′UTR in RAW264.7 cells (Lewis et al., 1998; Gueydan et al., 1999) and peritoneal macrophages (Piecyk et al., 2000) in an LPS dependent manner. Enhanced TNFα synthesis in macrophages of TIAR(−/−) mice (Piecyk et al., 2000) suggests that in unstimulated cells TIAR impedes TNFα mRNA translation, which can be activated to drive inflammatory cytokine expression upon TLR4 mediated recognition of bacterial LPS (**Figure 1B**). TIA protein mediated control of TNFα expression is demonstrated by impaired TNFα mRNA regulation in TNFα(ΔARE) mice, where it is implicated in chronic inflammation (Kontoyiannis et al., 1999). Furthermore, mRNAs encoding the inflammation related proteins COX-2 (Cok et al., 2003; Dixon et al., 2003) and HMMP-13 (Yu et al., 2003) are TIAR targets in primary murine fibroblasts and human mesangial cells, respectively.

Interestingly, TIA proteins have also been shown to contribute to global translation regulation under amino acid starvation in HEK293S cells. TIA proteins bind to the 5′-oligopyrimidine tract of 5′-TOP mRNAs, which encode critical components of the translational apparatus, like ribosomal proteins and PABP-C1, and induce the release of these target mRNAs from actively translating polysomes (Damgaard and Lykke-Andersen, 2011). Besides that, TIA proteins are involved in the formation of stress granules, which sequester mRNAs that are translationally stalled by specific mRNPs under starvation-induced stress (Damgaard and Lykke-Andersen, 2011), heat shock and arsenite stress in fibroblasts (Kedersha et al., 1999), in LPS activated B-cells (Diaz-Munoz et al., 2017) and under other adverse conditions including hypoxia and viral infection (Anderson and Kedersha, 2002; Waris et al., 2014).

RIP-Chip experiments were performed by Kharraz et al. (2016) to identify mRNAs specifically bound by TIAR in unstimulated and LPS induced murine RAW264.7 macrophages stably expressing TIAR(wt)-FLAG and TIAR(ΔRRM2)-FLAG. RRM2 of TIAR, which is required for high affinity mRNA binding (Dember et al., 1996; Kim et al., 2013), was deleted to discard all mRNAs that bind with low affinity. The analysis revealed that 351 mRNAs were bound by TIAR in unstimulated macrophages and 779 in LPS induced cells, with 8 transcripts exclusively bound in unstimulated and 436 in LPS induced cells, respectively. Binding of TNFα mRNA could be validated, as could the binding of the mRNA that encodes MAPK phosphatase 1 (MKP-1 also termed CL100, VHV1, 3CH134, and DUSP1), for which an interaction with TIAR has been shown before in HeLa cells (Kuwano et al., 2008). The mRNAs encoding TLR4 and the serine/threonine phosphatase 2A catalytic subunit 2β could be identified as new targets (Kharraz et al., 2016). GO term analysis of TIAR target mRNAs shows that under both conditions the GO terms catabolic process, cell cycle and regulation of apoptosis were enriched, which cover proteins involved in inflammatory response, cell proliferation, cell death and metabolism. However, exclusively in LPS induced cells mRNAs bound by TIAR encoded proteins within the GO term category positive regulation of IkB kinase/NFκB cascade (Kharraz et al., 2016).
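The four counts reported for the TIAR RIP-Chip analysis are mutually consistent, which a short calculation makes explicit. The Python sketch below only restates the numbers given in the text; the function and variable names are ours, and the derived totals (shared and combined targets) are arithmetic consequences rather than values quoted from Kharraz et al. (2016).

```python
def shared_and_total(bound_a, bound_b, only_a, only_b):
    """Derive shared and combined target counts from two overlapping sets."""
    shared_from_a = bound_a - only_a
    shared_from_b = bound_b - only_b
    # Both routes must give the same intersection size.
    if shared_from_a != shared_from_b:
        raise ValueError("inconsistent counts")
    total = only_a + shared_from_a + only_b
    return shared_from_a, total


# Counts from the text: unstimulated vs. LPS induced RAW264.7 macrophages.
shared, total = shared_and_total(bound_a=351, bound_b=779, only_a=8, only_b=436)
print(shared)  # mRNAs bound under both conditions
print(total)   # distinct mRNAs bound under at least one condition
```

Both subtractions, 351 − 8 and 779 − 436, give 343 transcripts bound under both conditions, so 787 distinct mRNAs were bound in at least one condition.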

T-cell intracellular antigen 1 related protein ARE binding specificity was lower than that of TTP (Stoecklin et al., 2008), but not affected by LPS treatment (Kharraz et al., 2016). The high number of mRNAs bound in response to LPS suggests that LPS directly modulates TIAR mRNA binding and that TIAR interacts, mediated by RRM1 and RRM3, with additional ARE-independent sequence motifs (Kharraz et al., 2016). TIAR mediated regulation of alternative mRNA splicing and inhibition of mRNA translation, which were shown for inflammation related proteins (Gueydan et al., 1999; Piecyk et al., 2000; Cok et al., 2003; Suswam et al., 2005), indicate that TIAR modulates the inflammatory response and contributes to its rapid decline when the stimulus disappears.

### KH DOMAIN PROTEIN HNRNP K MODULATES MRNA TRANSLATION

Heterogeneous nuclear ribonucleoprotein K (hnRNP K) was first described as a structural component of nuclear ribonucleoprotein complexes associated with heterogeneous nuclear RNA (Pinol-Roma et al., 1988; Matunis et al., 1992). The protein contains three KH domains consisting of 65–70 amino acids (Gibson et al., 1993; Siomi et al., 1993; Dejgaard and Leffers, 1996), which occur with two distinct folding topologies (Baber et al., 1999; Grishin, 2001). In SELEX experiments UC<sub>3–4</sub> RNA motifs were determined as optimal ligands for KH3 (Thisted et al., 2001). Binding of hnRNP K KH domains to ssDNA (Braddock et al., 2002; Backe et al., 2005) and RNA (Messias et al., 2006; Moritz et al., 2014) has been analyzed systematically. Quantitative evaluation indicated that the KH domains of hnRNP K contribute differentially to RNA binding, with KH1-KH2 acting as a tandem domain and KH3 as an individual binding domain (Moritz et al., 2014). The affinity of full-length hnRNP K is in the nanomolar range, while K<sub>D</sub> values for the isolated domain KH3 are micromolar (Backe et al., 2005; Moritz et al., 2014). Two U/CCC motifs within 19 nts confer hnRNP K binding, whereas four U/CCC motifs within 38 nts are necessary and sufficient for translational regulation (Ostareck et al., 1997). Bidirectional nuclear-cytoplasmic transport of hnRNP K is mediated by an N-terminal nuclear localization motif and a hnRNP K-specific shuttling domain (Michael et al., 1995, 1997).
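The motif-density rule stated above (two U/CCC motifs within 19 nts, four within 38 nts) can be expressed as a window test on a sequence. The following sketch rests on assumptions that are ours, not the authors': "U/CCC" is read as UCCC or CCCC, motifs are counted as non-overlapping matches, and a cluster qualifies when the span from the first motif start to the last motif end does not exceed the window. All names and the toy sequence are hypothetical.

```python
import re

# "U/CCC" interpreted as UCCC or CCCC (assumption).
MOTIF = re.compile(r"[UC]CCC")
MOTIF_LEN = 4


def motif_starts(seq):
    """0-based start positions of non-overlapping U/CCC matches, scanned left to right."""
    seq = seq.upper().replace("T", "U")
    return [m.start() for m in MOTIF.finditer(seq)]


def has_cluster(seq, n_motifs, window):
    """True if n_motifs consecutive matches fit entirely inside `window` nt."""
    starts = motif_starts(seq)
    for i in range(len(starts) - n_motifs + 1):
        span = starts[i + n_motifs - 1] + MOTIF_LEN - starts[i]
        if span <= window:
            return True
    return False


# Hypothetical toy sequence, not a real hnRNP K target.
toy = "AAUCCCAAAACCCCAAGG"
print(motif_starts(toy))
print(has_cluster(toy, n_motifs=2, window=19))  # binding-type criterion
print(has_cluster(toy, n_motifs=4, window=38))  # regulation-type criterion
```

A left-to-right non-overlapping scan is a simplification; in long pyrimidine tracts a different choice of motif positions could change the count, so the result should be read as a rough screen rather than a prediction of binding.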

As a multifunctional protein, hnRNP K is associated with transcription activation (Moumen et al., 2005), pre-mRNA splicing (Expert-Bezancon et al., 2002), mRNA stability control (Skalweit et al., 2003) and regulation of mRNA translation (Ostareck et al., 1997, 2001; Collier et al., 1998; Naarmann et al., 2008, 2010; Naarmann-de Vries et al., 2016). HnRNP K functions are modulated by mRNA specific associated mRNP components (Ostareck-Lederer and Ostareck, 2012) and by posttranslational modifications. ERK dependent phosphorylation of S<sup>284,353</sup> drives its cytoplasmic accumulation as a prerequisite for hnRNP K-mediated mRNA translation regulation (Habelhah et al., 2001). Phosphorylation of KH3 Y<sup>458</sup> by c-Src (Ostareck-Lederer et al., 2002; Messias et al., 2006; Adolph et al., 2007) and caspase-3 catalyzed cleavage (Naarmann-de Vries et al., 2013) control hnRNP K release from translationally regulated target mRNAs, and site-specific arginine methylation by PRMT1 regulates hnRNP K protein-protein interactions (Ostareck-Lederer et al., 2006).

In human THP-1 monocytes, hnRNP K was shown to be associated with the COX-2 promoter and to control cytoplasmic COX-2 mRNA stability by modulating miRNA binding to the 3′UTR (Shanmugam et al., 2008).

RIP-Chip analysis of differential mRNA binding in untreated and LPS induced RAW264.7 macrophages demonstrated that 1901 mRNAs were differentially bound by hnRNP K in response to LPS. GO term annotation allocated them to biological processes related to metabolism, cell communication, transport, cell cycle, development and immune response (Liepelt et al., 2014). Strikingly, whereas cytokines and chemokines were underrepresented among the 163 mRNAs related to immune response, 21 mRNAs encoded kinases and modulators in TLR4 signaling, of which 7 equally expressed mRNAs encoding IRAK4, IRAK1BP1, ERC1, CARM1/PRMT4, PI3KCA, AKT3, and TAK1 were specifically enriched in non-induced cells (Liepelt et al., 2014). A detailed analysis of differential hnRNP K association with the mRNA of transforming growth factor-β-activated kinase 1 (TAK1), a central kinase in TLR4 signaling, revealed that KH domain 3 interacts with U/CCC elements in the TAK1 mRNA 3′UTR. Silencing of hnRNP K expression in macrophages and BMDM had no impact on the level of TAK1 mRNA, but endogenous TAK1 mRNA accumulated in actively translating polysomal fractions, resulting in an increased TAK1 expression. Through the regulation of TAK1 mRNA translation hnRNP K affects the phosphorylation of the TAK1 downstream target p38 and finally inflammatory cytokine gene transcription, i.e., TNFα, IL-1β, and IL-10 (Liepelt et al., 2014). Thereby hnRNP K modulates the LPS response of macrophages (**Figure 1C**).

## CONCLUSION AND PERSPECTIVES

As primary responding cells of the innate immune system, macrophages recognize pathogens and become activated to initiate and coordinate the organism-wide systemic immune response by cytokine and chemokine secretion, migration and phagocytosis. These processes require highly coordinated gene expression, which is achieved at the post-transcriptional level by regulated functional RBP-RNA interactions. Specific RBPs, TTP, HUR, TIAR, and hnRNP K, regulate the fate of their cellular target RNAs from synthesis to turnover and translation, thereby contributing to the coordination of rapid and purposeful immune cell responses. LPS molecules of Gram-negative bacteria are abundant and specific ligands that activate macrophages through TLR4 receptor signaling. Systematic analyses of RBP-RNA interactions in untreated and LPS-induced macrophages employing RIP-Chip, iCLIP, PAR-iCLIP, RNASeq, and RiboSeq provided a first insight into the complex protein-mRNA networks that are established by RBPs. Based on their individual domain composition, these RBPs bind mRNAs with different specificities: TTP, HUR, and TIAR recognize AREs and U-rich elements, whereas hnRNP K recognizes pyrimidine-rich sequence motifs. These specific interactions ensure the simultaneous modulation of various target mRNAs, which encode proteins that function in defined biological processes related to the induction and resolution of the immune response. Whereas TIAR and hnRNP K mainly modulate mRNA translation, which enables the direct regulation of signaling protein synthesis to initiate the immune response, TTP and HUR are primarily involved in the control of mRNA decay and stability that is required to balance and resolve immune reactions.

These processes have been studied so far only for a limited number of mRNAs differentially bound by RBPs in response to LPS. It will be essential and informative to investigate the regulation of further target mRNAs of TTP, HUR, TIAR, and hnRNP K discovered in these studies to gain more insight into regulatory feedback mechanisms that coordinate the balanced immune response and its dysregulation in chronic inflammation.

To this end it is interesting to note that a number of unconventional RBPs have been identified recently, for which RNA related functions that will expand our understanding of post-transcriptional gene regulation still need to be elucidated (Castello et al., 2015; Albihlal and Gerber, 2018; Hentze et al., 2018).

In macrophages, 19 new putative RBPs, which lack well characterized RNA binding domains, were identified by RNA interactome capture (Liepelt et al., 2016). Panther protein class annotation revealed that they are involved in signaling, enzymatic functions and cytoskeletal remodeling. It will be interesting to identify their target mRNAs and to study their potential functions in the LPS induced macrophage response.

In addition, post-transcriptional RNA modifications (Nachtergaele and He, 2018) might add a further layer of regulation in LPS-induced macrophages, affecting RBP binding and thereby the fate of their target mRNAs.

#### AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

#### FUNDING

Financial support was provided by a grant from the Deutsche Forschungsgemeinschaft (DFG) SPP 1935, OS 290/6-1.

#### ACKNOWLEDGMENTS

We apologize to colleagues whose work was not cited due to length constraints.

#### REFERENCES


and HuR-dependent stabilization of p21(Cip1) mRNA mediates the G(1)/S checkpoint. Mol. Cell. Biol. 29, 4341–4351. doi: 10.1128/MCB.00210-09


tumor suppressor gene programmed cell death 4. Oncogene 35, 1703–1715. doi: 10.1038/onc.2015.235


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Ostareck and Ostareck-Lederer. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Insights Into Non-coding RNAs as Novel Antimicrobial Drugs

Gisela Parmeciano Di Noto†, María Carolina Molina† and Cecilia Quiroga\*

Universidad de Buenos Aires, Consejo Nacional de Investigaciones Científicas y Tecnológicas, Instituto de Investigaciones en Microbiología y Parasitología Médica (IMPAM), Facultad de Medicina, Buenos Aires, Argentina

Multidrug resistant bacteria are a serious worldwide problem, especially carbapenem-resistant Enterobacteriaceae (such as Klebsiella pneumoniae and Escherichia coli), Acinetobacter baumannii and Pseudomonas aeruginosa. Since the emergence of extensive and pan-drug resistant bacteria, few antibiotics are left to treat patients; thus, novel RNA-based strategies are being considered. Here, we examine the current situation of different non-coding RNAs found in bacteria as well as their function and potential application as antimicrobial agents. Furthermore, we discuss the factors that may contribute to the efficient development of RNA-based drugs, the limitations for their implementation and the use of nanocarriers for delivery.

#### Edited by:

Chiara Gamberi, Concordia University, Canada

#### Reviewed by:

Terrence Chi-Kong Lau, City University of Hong Kong, Hong Kong; Scott A. Tenenbaum, University at Albany, United States

#### \*Correspondence:

Cecilia Quiroga ceciliaquiroga@conicet.gov.ar

†These authors have contributed equally to this work

#### Specialty section:

This article was submitted to RNA, a section of the journal Frontiers in Genetics

Received: 22 September 2018 Accepted: 24 January 2019 Published: 22 February 2019

#### Citation:

Parmeciano Di Noto G, Molina MC and Quiroga C (2019) Insights Into Non-coding RNAs as Novel Antimicrobial Drugs. Front. Genet. 10:57. doi: 10.3389/fgene.2019.00057

Keywords: sRNA, CRISPR-Cas, antimicrobial, RNA, delivery

#### INTRODUCTION

In the year 2014, the World Health Organization reported the critical problem of antibiotic resistant bacteria (World Health Organization [WHO], 2014). The global resistance levels of bacterial isolates have climbed unrelentingly in the last decades regardless of their source, i.e., clinical settings, in-patients, community, food-related or environmental niches. This led to the increase in the overall morbidity and mortality due to multidrug resistant bacteria (MDR) infections (Baquero et al., 2015; Woolhouse et al., 2016). Throughout the years, misused and abused antimicrobial drugs have led to the selection of resistant strains difficult to eradicate (Baquero et al., 2015). As a result, bacteria have evolved into extensive- (XDR) or pan-drug resistant (PDR) phenotypes.

The Centers for Disease Control and Prevention has classified some gram-negative bacteria as urgent or serious threats for public health. Among them, Enterobacteriaceae that are resistant to carbapenems (CRE) or produce extended-spectrum beta-lactamases (ESBL), as well as multidrug resistant Acinetobacter and Pseudomonas species, present serious hazards. The lack of novel antimicrobial drugs available on the market or in the drug development pipeline to combat these pathogens, the high cost of discovering and developing new compounds and the fast evolution of bacterial populations toward resistant phenotypes are particularly worrisome. Therefore, novel approaches to battle these pathogens are currently encouraged (World Health Organization [WHO], 2014). One promising strategy is the use of RNA-based therapies. This review examines the current situation of non-coding RNA (ncRNA) elements as antimicrobial agents and discusses some strategies and limitations for their implementation.

### NON-CODING RNAs AS THERAPEUTIC AGENTS

For several decades, RNA molecules have been regarded as potential drugs against pathogens. With the characterization of novel ncRNAs in bacteria, this strategy seems more plausible. Among the ncRNA molecules studied for their therapeutic potential are the ribozymes hammerhead, group II introns, glmS, and RNAse P (**Figure 1**) (Cui and Davis, 2007; Ferré-D'Amaré, 2010; Lambowitz and Zimmerly, 2011; Hammann et al., 2012; Altman, 2014; Khan et al., 2016). One of the most studied ribozymes is RNAse P. Its activity and its interaction with external guide sequences (EGSs) as therapeutics against MDR bacteria have been extensively reviewed elsewhere, and interesting advances in the field have been reported (Forster and Altman, 1990; Kirsebom and Svärd, 1992; Svärd and Kirsebom, 1993; Altman, 2014; Davies-Sala et al., 2015). The approach for the use of this ribozyme is based on the delivery of nuclease-resistant analogs, such as locked nucleic acids/DNA co-oligomers or phosphorodiamidate morpholino oligonucleotide EGSs conjugated to a permeabilizer peptide (PPMO), that induce RNAse P-mediated degradation of the target mRNA once introduced in the host. Further advances using this strategy will most likely provide interesting results that will contribute to the development of novel RNA-based drugs.

Hammerhead ribozymes have been used to develop antiviral compounds; however, their use against bacteria has not been considered yet (Hammann et al., 2012). Group II introns are self-splicing elements that in the presence of its cofactor can retrotranspose to novel target sites within a genome (Lambowitz and Zimmerly, 2011). Several attempts were made to use these ribozymes as vehicles for the delivery of cargo genes to inhibit cell growth or promote cell death (Plante and Cousineau, 2006; Mohr et al., 2013). One particular subclass of group II introns, C-attC, has the peculiar ability to insert downstream of DNA secondary structures adjacent to antimicrobial gene cassettes located in integron platforms (Centrón and Roy, 2002; Quiroga et al., 2008). The ability exhibited by C-attC group II introns to selectively insert within gene cassettes suggests that they could be employed as vectors to deliver genetic material at specific target sites. Last, the glmS ribozyme has also been a subject of study as an antimicrobial drug. It has been reported that in the presence of carba-α-D-glucosamine it can promote mRNA degradation and inhibit cell growth (Ferré-D'Amaré, 2010; Schüller et al., 2017). Although all these RNA elements have promising features that could be adapted to engineer RNA based drugs, further advances in their delivery are necessary.

The recent upsurge of other functional ncRNAs in bacteria has revealed their essential role in the regulation of different processes, such as cell physiology, defense, horizontal gene transfer and virulence (Gottesman and Storz, 2011; Storz et al., 2011; Caldelari et al., 2013; Fröhlich and Papenfort, 2016). Since many ncRNAs are key regulatory elements, they are currently considered for designing novel therapeutic strategies. These RNAs are commonly small in size (<500 nt), and can either act in cis of the target messenger RNA (thermoregulators, riboswitches) or in trans [small RNAs, antisense RNAs, clustered regularly interspaced short palindromic repeats (CRISPRs)] (**Figure 1**). Riboswitches and thermoregulators control the expression of an adjacent mRNA upon sensing physical or chemical signals (Winkler et al., 2002; Chowdhury et al., 2006). The environmental effect or the presence of specific molecules leads to structural modifications in the 5′-UTR of a target mRNA that can either release or sequester the ribosome binding site, resulting in the activation or repression of translation. While thermoregulators are mostly temperature-sensitive RNAs that respond to heat or cold shock, riboswitches are more complex elements that regulate a wide variety of genes. Some riboswitches, such as the guanine riboswitch, have shown promising results as targets for novel antimicrobial compounds against the pathogen Clostridioides difficile (Yan et al., 2018). Riboswitches can also regulate the expression of aminoglycoside antibiotic resistance genes (Jia et al., 2013; Rekand and Brenk, 2017). Mechanistic insight into these RNA sensors and their use as antimicrobials can be found in comprehensive reviews (Chowdhury et al., 2006; Rekand and Brenk, 2017). Two additional ncRNA elements, sRNA and CRISPRs, have lately drawn more attention as potential RNA-based antimicrobial drugs. In the following sections, we will focus on their use, strengths and limitations.

## SMALL NON-CODING RNAs IN BACTERIA

Small non-coding RNAs (sRNA) are short RNAs that regulate gene expression post-transcriptionally. These RNAs can be encoded in the opposite strand of the target mRNA (known as antisense or cis sRNA), or encoded in trans to the target mRNA. The trans acting sRNAs, or simply sRNAs, are RNA regulators frequently found in bacteria that interact by imperfect base pairing with their target mRNAs. Their regulation process usually involves the chaperone protein Hfq, as well as ProQ and CsrA (Wagner and Romby, 2015; Olejniczak and Storz, 2017), although interactions with other chaperones and cis sRNAs have also been reported (Opdyke et al., 2004; Ross et al., 2013; Ellis et al., 2015). These proteins participate in the interaction between the sRNA and its target mRNA, in mRNA translation or during RNA decay. As a result, sRNAs can repress translation by binding to the initiation target site, by sequestration of the ribosome standby site, or by facilitating mRNA degradation with ribonucleases; they can also activate translation by exposing a sequestered ribosome binding site or protect an mRNA by masking a ribonuclease cleavage site (Gottesman and Storz, 2011; Storz et al., 2011; Caldelari et al., 2013).

Several studies have shown that sRNAs regulate a wide variety of genes that code for proteins involved in processes related to physiology, metabolism, stress responses or quorum sensing (reviewed in Gottesman and Storz, 2011; Storz et al., 2011; Caldelari et al., 2013; Fröhlich and Papenfort, 2016). Many of them are capable of regulating more than one target mRNA, which reveals a complex sRNA-based network (Storz et al., 2011). Furthermore, recent studies have suggested that approximately half of the mRNAs are regulated by sRNAs (Hör and Vogel, 2017), which showcases their important role in post-transcriptional control. sRNA regulators provide different benefits to the host, such as reduced metabolic cost and tighter and faster gene regulation, that help bacteria to adapt to new environments (Beisel and Storz, 2010). Thus, sRNA-mediated regulation is currently regarded as a source of RNA-based drug targets. In this regard, Na et al. (2013) designed several synthetic sRNAs targeting the RBS of various mRNAs, which modulate gene expression in different Escherichia coli strains. Since then, several studies on the application of sRNAs in metabolic engineering and synthetic biology have been published (reviewed in Villa et al., 2019).

Virulence and resistance genes, as well as mobile elements, have also become appealing target candidates. In this regard, it has been reported that some sRNAs are involved in antibiotic uptake (GcvB, RyhB, MicF, ErsA), drug efflux (DsrA, RydC, SdsR, NrrF), biofilm formation (RprA, OmrA/B, McaS, RybB, RydC), and modification of lipopolysaccharide and cell wall synthesis (MgrR, MicA, Sr006). While most of these sRNAs have been extensively studied in E. coli and Salmonella strains (reviewed in Dersch et al., 2017), there is scarce information about their activity in other bacteria. The identification of sRNAs related to antimicrobial resistance genes and their mechanisms of dissemination opens up a new strategy for the delivery of synthetic sRNAs to XDR and PDR bacteria.

## THE CRISPR-Cas SYSTEMS IN BACTERIA

CRISPR-Cas systems are part of the immune system of bacteria and provide protection against mobile genetic elements. This immunity is based on the sequence-specific recognition of foreign DNA or RNA by base pairing with short guide RNAs (32–35 nt), followed by the cleavage of the target sequence by CRISPR-associated proteins (encoded by the cas genes). There are two classes and several types of CRISPR-Cas systems, which are usually composed of a cas operon adjacent to a CRISPR array (Koonin et al., 2017). Such an array consists of direct repeats interspaced by guide sequences derived from invading DNA, which anneal with the exogenous material (Jackson et al., 2017; Hille et al., 2018 and references within). In recent years, the CRISPR-Cas machinery has been repurposed for gene editing and interference. The highly sequence-specific targeting ability of these systems has inspired the research community to use them as novel antimicrobial agents. Their unique activity makes them suitable for attacking either resistance genes or populations of unwanted pathogenic bacteria, while sparing bacteria that might be beneficial (Bikard and Barrangou, 2017; Goren et al., 2017; Greene, 2018).

To date, a few CRISPR guide RNAs have been designed to target virulence factors, antimicrobial resistance determinants or essential chromosomal genes from specific pathogens, such as E. coli or Staphylococcus aureus (Bikard et al., 2014; Citorik et al., 2014; Gomaa et al., 2014). These systems were employed to efficiently target a particular DNA sequence, resulting in the introduction of chromosomal deletions in different pathogens, which consequently led to cell death or to a reduction in the population of unwanted bacteria (Vercoe et al., 2013; Bikard et al., 2014; Citorik et al., 2014; Gomaa et al., 2014; Hampton et al., 2016). Vercoe et al. (2013) observed that a guide or CRISPR RNA (crRNA) programmed to target a large horizontally acquired island in Pectobacterium atrosepticum activated the endogenous CRISPR-Cas system and promoted the loss of both islands and the accessory genes encoded within. Moreover, the double-stranded DNA breaks caused when the Cas machinery targeted the bacterial chromosome resulted in the inhibition of cell growth and a filamentation phenotype (Vercoe et al., 2013). Although it has been confirmed that resistance genes can be eliminated using this technique (Bikard et al., 2014; Citorik et al., 2014), spontaneous point mutations in bacterial genomes might affect the action of synthetic guide CRISPR RNAs or endogenous CRISPR-Cas systems. Therefore, measures to counteract these effects should be contemplated during new drug development.
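The sensitivity of guide RNAs to point mutations can be illustrated with a minimal Python sketch. It assumes a Cas9-like system that requires a 5′-NGG-3′ protospacer-adjacent motif (PAM) immediately downstream of a 20-nt target, as for Streptococcus pyogenes Cas9; other CRISPR-Cas types use different PAMs and spacer lengths. The sequences and function names are hypothetical and serve only as illustration.

```python
import re


def find_protospacers(dna, spacer_len=20, pam="NGG"):
    """Return (start, protospacer) pairs for sites immediately followed by the PAM.

    Only the given strand is scanned; a fuller search would also scan the
    reverse complement.
    """
    pam_regex = pam.upper().replace("N", "[ACGT]")
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(rf"(?=([ACGT]{{{spacer_len}}}){pam_regex})")
    return [(m.start(), m.group(1)) for m in pattern.finditer(dna.upper())]


def count_mismatches(guide, site):
    """Number of positions at which the guide and the target site differ."""
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    return sum(a != b for a, b in zip(guide, site))


# Hypothetical 20-nt target followed by an AGG PAM (made-up sequence).
spacer = "ACGTACGTACGTACGTACGT"
wild_type = "TT" + spacer + "AGG" + "TT"

# Point mutation inside the protospacer: the site is still found,
# but it no longer matches the guide perfectly.
spacer_mutant = "TT" + spacer[:10] + "A" + spacer[11:] + "AGG" + "TT"

# Point mutation in the PAM: the site is no longer recognized at all.
pam_mutant = "TT" + spacer + "AGT" + "TT"

for name, seq in [("wild type", wild_type),
                  ("spacer mutant", spacer_mutant),
                  ("PAM mutant", pam_mutant)]:
    sites = find_protospacers(seq)
    report = [(pos, count_mismatches(spacer, site)) for pos, site in sites]
    print(name, report)
```

In this toy model a single substitution either introduces a mismatch with the guide or removes the PAM entirely, which is one way to picture how spontaneous mutations could let a target escape. Real escape frequencies depend on the CRISPR-Cas type and on mismatch tolerance, neither of which is modeled here.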

#### CONSIDERATIONS ON THE DESIGN OF RNA-BASED ANTIMICROBIAL STRATEGIES

The development of RNA-based antimicrobial strategies requires an understanding of the factors involved in the mechanisms and activities of each RNA element, the determination of their specificity to ascertain that no off-target or unexpected events occur, and the evaluation of the impact that introducing these RNAs may have on the host. Most studies have been limited to reference strains, such as E. coli MG1655, whereas only a few have been done using clinical isolates (Bikard et al., 2014; Citorik et al., 2014; Gomaa et al., 2014; Chan et al., 2017; Dersch et al., 2017). The extensive genome sequencing projects in antimicrobial-resistant pathogens revealed that clinical isolates have large, versatile and plastic genomes that encode an assortment of cellular factors. The process of selecting a target mRNA and designing RNA-based drugs, either using sRNAs or CRISPR guide RNAs, will most likely require subsequent validation in different bacteria (**Figure 1**).

Special consideration should be given to the selection of the target mRNAs (**Figure 1**). Most mRNAs are good candidates for RNA-based antimicrobials; however, current approaches for developing drugs are aiming for specific targets that have little or no effect on the host microbiota (Langdon et al., 2016; Lichtman et al., 2016). To overcome this problem, a safe approach involves directing the attack to specific genes that will only have an impact on pathogenic bacteria. Therefore, virulence genes, antimicrobial resistance determinants, mobile genetic elements or genes involved in horizontal transfer are ideal candidates. Designing sRNAs or guide RNAs that hybridize specifically with those genes will limit the effect on the microbial flora even if they are introduced into other host cells.
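As a rough illustration of such a specificity check, the Python sketch below scores a candidate guide (or sRNA seed) against a set of non-target sequences by the smallest number of mismatches found on either strand. The sequences, the function names and the threshold of three mismatches are arbitrary assumptions chosen for the example, not values taken from the studies cited here.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Reverse complement of a DNA sequence written with A, C, G and T."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def min_mismatches(guide, sequence):
    """Smallest Hamming distance between the guide and any same-length
    window of the sequence, scanning both strands."""
    guide = guide.upper()
    best = len(guide)  # worst case: every position differs (or no window fits)
    for strand in (sequence.upper(), reverse_complement(sequence)):
        for i in range(len(strand) - len(guide) + 1):
            window = strand[i:i + len(guide)]
            distance = sum(a != b for a, b in zip(guide, window))
            best = min(best, distance)
    return best


def is_specific(guide, target, non_targets, min_off_target_mismatches=3):
    """True if the guide matches the target exactly and every non-target
    sequence differs by at least the (arbitrary) mismatch threshold."""
    on_target = min_mismatches(guide, target) == 0
    off_target_ok = all(
        min_mismatches(guide, seq) >= min_off_target_mismatches
        for seq in non_targets.values()
    )
    return on_target and off_target_ok


# Hypothetical toy sequences (far shorter than real genes or genomes).
guide = "ACGTTGCA"
resistance_gene = "GGG" + guide + "CCC"
commensals = {
    "commensal_A": "TTTTTTTTTTTTTTTT",
    "commensal_B": "GGGACGTTGCTCCC",  # near-identical site, one mismatch
}

print(is_specific(guide, resistance_gene, {"commensal_A": commensals["commensal_A"]}))
print(is_specific(guide, resistance_gene, commensals))
```

A practical screen would use complete genomes of representative microbiota members and an alignment tool rather than this exhaustive window scan, and it would weight mismatches by position, since mismatch tolerance is not uniform along a guide.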

Furthermore, the design of synthetic RNAs should take into consideration their stability in the cell, as well as their folding into proper structures (**Figure 1**). Previous studies have shown that single-stranded RNAs are more stable when their extremities are protected by stem-loop structures, which improves their survival in the cell (Majdalani et al., 1998). Although this increases their stability, they are not exempt from the effects of the host degradation machinery. In this regard, RNAs that bind to specific proteins (e.g., Hfq or Cas) can be protected from the action of RNases, which will increase RNA survival in the cell and favor the execution of the desired tasks. Therefore, functional and structural studies on the interaction of Hfq with synthetic sRNAs, or of Cas proteins with guide RNAs, will help to optimize their activity and reduce undesired degradation.

Although chaperones and cofactors can provide stability to the candidate RNAs, delivery of RNPs may prove difficult in bacterial cells. Alternatively, some studies have suggested the use of endogenous CRISPR-Cas systems against XDR and PDR bacteria. A caveat of this strategy is that CRISPR-Cas systems are not conserved across bacterial species (Koonin et al., 2017), so prior confirmation of their presence in the host will be necessary.

## RNA DELIVERY IN BACTERIA

There is a need to explore new delivery systems capable of overcoming the challenges of specificity, targeting selectivity and efficiency. Transport of genetic material from the extracellular environment into the cytosolic compartment is a complex task, especially when it involves crossing the bacterial barriers: the outer membrane (in gram-negative bacteria), the cell wall and the cytoplasmic membrane (Chen and Dubnau, 2004). Synthetic nanocarriers and bioinspired vehicles, such as bacteriophages, have been investigated for their use in drug and gene delivery systems (**Figure 1**). Bacteriophages are viruses with a highly efficient ability for compressing and wrapping DNA to form compact particles of 28 nm (MS2), 200 nm (T4) or 890 nm (M13) (Karimi et al., 2016). Based on the potential of these viruses to naturally act as carriers, they have been employed in the transfer of genetic information. Phage therapy has been revisited as an alternative to antibiotics for treating bacterial infections in different models, and has been implemented in phase I and II clinical trials (reviewed in Lin et al., 2017). Non-lytic bacterial cell death was reported using phagemid constructs that can carry different antimicrobial compounds and target specific bacteria (Krom et al., 2015). The authors showed that this approach led to a significant reduction in bacterial cell viability in vitro and an 80% survival rate in a murine peritonitis infection model, which are promising results.

Toward ncRNA-based antimicrobial therapeutics, Na et al. (2013) showed that, with custom sRNA cassettes carrying the antisense sequence of a target mRNA and an Hfq-binding motif, it is possible to modulate gene expression in different E. coli strains. Based on these findings, Bernheim et al. (2016) developed a protocol for synthetic sRNA delivery in E. coli cells using a phagemid construct and a non-lytic M13 phage that upon encapsulation can infect a population.
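The basic layout of such a cassette, an antisense region followed by a protein-binding scaffold, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the mRNA fragment is invented, the 12-nt window starting at the start codon is an arbitrary choice, and `SCAFFOLD_PLACEHOLDER` is a made-up stand-in for an Hfq-binding scaffold, not a real sequence.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")

# Made-up stand-in for an Hfq-binding scaffold; NOT a real scaffold sequence.
SCAFFOLD_PLACEHOLDER = "NNNNNNNNNN"


def reverse_complement_rna(seq):
    """Reverse complement of an RNA sequence (DNA input is converted to RNA)."""
    seq = seq.upper().replace("T", "U")
    if set(seq) - set("ACGU"):
        raise ValueError("sequence contains characters other than A, C, G, U/T")
    return seq.translate(COMPLEMENT)[::-1]


def design_antisense(mrna, start, length):
    """Antisense RNA complementary to `length` nt of the mRNA from `start`."""
    target = mrna[start:start + length]
    if len(target) < length:
        raise ValueError("target window runs past the end of the mRNA")
    return reverse_complement_rna(target)


def build_srna(antisense, scaffold=SCAFFOLD_PLACEHOLDER):
    """Concatenate the target-binding region and the scaffold (5' to 3')."""
    return antisense + scaffold


# Hypothetical mRNA fragment: an RBS-like stretch followed by a start codon.
mrna = "GGAGGUAACAUAUGAAACGCAUUAGCACCACC"
start_codon = mrna.find("AUG")

antisense = design_antisense(mrna, start_codon, length=12)
print(antisense)
print(build_srna(antisense))
```

The sketch covers only sequence complementarity. Binding energy, secondary structure of the resulting sRNA and off-target pairing would all need to be evaluated before a candidate is worth testing.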

On the other hand, three research groups have assessed the delivery of CRISPR-Cas systems using phage particles as vectors, which exploits the specificity of phages for their hosts (Bikard et al., 2014; Citorik et al., 2014; Yosef et al., 2015). Citorik et al. (2014) used CRISPR-Cas technology and created RNA-guided nucleases targeting antibiotic resistance and virulence determinants in carbapenem-resistant Enterobacteriaceae and enterohemorrhagic E. coli. This strategy involved the delivery of RNA-guided nucleases using a bacteriophage or a conjugative plasmid. Bikard et al. (2014) used a phage-encoded CRISPR-Cas9 to target antibiotic resistance genes in strains of Staphylococcus aureus. Both groups confirmed their results with in vivo experiments, in a Galleria mellonella infection model and a mouse skin colonization model (Bikard et al., 2014; Citorik et al., 2014). Lastly, Yosef et al. (2015) improved the delivery model by combining the use of a λ prophage and the lytic phage T7. They used E. coli as a host and delivered the CRISPR cascade genes and cas3 of a type I-E CRISPR-Cas system along with the guide crRNAs designed to target the beta-lactam resistance genes blaNDM-1 and blaCTX-M-15. They proposed to sensitize E. coli cells to β-lactam antibiotics while simultaneously conferring a selective advantage to sensitized bacteria by protecting them from lytic phages with an engineered CRISPR-Cas system delivered by a λ prophage. Therefore, when E. coli cells were infected with a T7 phage, only bacteria that were sensitized and had an active CRISPR-Cas system were able to resist the infection. The authors stated that the use of this technology would reduce multi-drug resistant populations, overcome the resistance problem and repurpose several antibiotics that are no longer used. However, some limitations regarding conjugation efficiency, host range and phage resistance suggest that new delivery vehicles need to be tested.
In this regard, nanotechnology offers promising nanocarrier options that should be explored for antimicrobial delivery systems, including a wide variety of materials and the possibility of improving targeting so that carriers specifically reach bacterial cells. Of note, extracellular vesicles (EVs) derived from phage-sensitive bacteria have also been proposed as an additional opportunity in phage therapy. EVs can be administered prior to the phages to enhance the targeting of bacteria and even enable the infection of novel bacterial host targets (Liu et al., 2018).

Non-viral nanoparticles have been tested as nanocarriers to achieve the incorporation of genetic material in bacteria. For instance, encapsulation of plasmid DNA with different molecular weights of chitosan (chitosan-pDNA NPs) resulted in different NP sizes (457 to 820 nm) that greatly enhanced transformation efficiency in E. coli cells compared to naked DNA (Bozkir and Saka, 2004). Further demonstrating the potential of nanoparticles and chitosan for introducing genetic material into bacterial cells, other research groups have evaluated the efficiency of plasmid DNA delivery using electrospray of chitosan-pDNA NPs into non-competent vs. competent E. coli (Abyadeh et al., 2017), electrospray of gold NPs (GNPs) in non-competent E. coli (Lee et al., 2011), and transformation of GNP-pDNA conjugates by high temperature and friction forces of the Yoshida effect in gram-positive and gram-negative bacteria (Kumari et al., 2017). However, to the best of our knowledge, these approaches have not yet been tested using ncRNAs as cargo.

Although the progress in the field is promising, there are still many questions to be answered. For instance, which nanoparticle will efficiently deliver sRNAs without compromising their activity? How functional and adaptable must a synthetic system be in order to battle the evolution of bacteria toward antimicrobial resistance?

In the particular case of CRISPR-Cas systems, is it suitable to use the endogenous machinery of pathogens and deliver only CRISPR RNAs, or is it better to deliver the entire CRISPR-Cas machinery? Which type of CRISPR-Cas is more efficient? How efficient is the delivery of these systems with bacteriophages? In this regard, it is well known that bacteria can resist phage infections using other strategies besides CRISPR-Cas, i.e., by spontaneous mutations of sensitive cells independently of the action of the virus, with restriction and modification systems, by masking of membrane receptors or with toxin/antitoxin systems. Moreover, recent reports have revealed that bacteria can encode anti-CRISPR proteins in prophages, which could affect the efficiency of the CRISPR-Cas system (Labrie et al., 2010; Seed, 2015; van Houte et al., 2016; Borges et al., 2017; Oechslin, 2018; Pawluk et al., 2018). There are no studies yet on how these mechanisms would work in the face of these therapies.

#### CONCLUDING REMARKS

The antimicrobial resistance problem is a crucial global issue that needs to be addressed. The development of alternative strategies to battle bacterial pathogens is of utmost importance. RNA-based therapies, such as synthetic sRNAs or CRISPR guide RNAs, are attractive strategies to tackle this problem. Both approaches can target the accessory genome of pathogenic bacteria, in particular extended-spectrum beta-lactam, carbapenem or colistin resistance genes. However, it is important to develop systems that not only deliver highly effective RNA elements successfully but can also be rapidly modified upon bacterial acquisition of novel resistances, and that limit the selection of MDR bacteria. Furthermore, a combined system targeting several mRNAs in a coordinated manner would ideally be more robust. In this regard, the CRISPR-Cas systems have revolutionized the world of microbiology, and their use in the fight against antibiotic multiresistance will without a doubt be a powerful tool. Nevertheless, more studies are necessary to be able to deliver these RNAs with high specificity and achieve clinically relevant efficacy. The advances in understanding the activity of sRNAs and CRISPR-Cas systems have raised the question of their use as antimicrobial drugs; further progress in the RNA and nanotechnology fields is necessary to answer all these questions.

#### AUTHOR CONTRIBUTIONS

CQ, GPDN, and MCM wrote the manuscript. All the authors discussed the content, contributed to manuscript revision, and read and approved the submitted version.

## FUNDING

GPDN and MCM are recipients of doctoral scholarships from Consejo Nacional de Investigaciones en Ciencia y Tecnología (CONICET). CQ is a career investigator from CONICET. This work was supported by grants BID/OC ANPCyT (2013–1978) and PUE-0085 from National Research Council from Argentina, to CQ.

### REFERENCES



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Parmeciano Di Noto, Molina and Quiroga. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Drosophila mRNA Localization During Later Development: Past, Present, and Future

Sarah C. Hughes<sup>1,2</sup> and Andrew J. Simmonds<sup>2</sup>\*

*<sup>1</sup> Department of Medical Genetics, Faculty of Medicine and Dentistry, University of Alberta, Edmonton, AB, Canada, <sup>2</sup> Department of Cell Biology, Faculty of Medicine and Dentistry, University of Alberta, Edmonton, AB, Canada*

#### Edited by:

*Chiara Gamberi, Concordia University, Canada*

#### Reviewed by:

*Eugenia Olesnicky Killian, University of Colorado Colorado Springs, United States; Dorothy Lerit, Emory University School of Medicine, United States*

> \*Correspondence: *Andrew J. Simmonds andrew.simmonds@ualberta.ca*

#### Specialty section:

*This article was submitted to RNA, a section of the journal Frontiers in Genetics*

Received: *21 August 2018* Accepted: *11 February 2019* Published: *07 March 2019*

#### Citation:

*Hughes SC and Simmonds AJ (2019) Drosophila mRNA Localization During Later Development: Past, Present, and Future. Front. Genet. 10:135. doi: 10.3389/fgene.2019.00135*

Multiple mechanisms tightly regulate mRNAs during their transcription, translation, and degradation. Of these, the physical localization of mRNAs to specific cytoplasmic regions is relatively easy to detect; however, linking localization to functional regulatory roles has been more difficult. Historically, *Drosophila melanogaster* has been a highly effective model for identifying localized mRNAs and has helped establish roles for this process in regulating various cell activities. The majority of the well-characterized functional roles for localizing mRNAs to sub-regions of the cytoplasm have come from the *Drosophila* oocyte and early syncytial embryo. At present, relatively few functional roles have been established for mRNA localization within the relatively smaller, differentiated somatic cell lineages characteristic of later development, beginning with the cellular blastoderm, and the multiple cell lineages that make up the gastrulating embryo, larva, and adult. This review is divided into three parts. The first outlines past evidence for cytoplasmic mRNA localization affecting aspects of cellular activity during post-blastoderm development in *Drosophila*. The majority of these known examples come from highly polarized cell lineages such as differentiating neurons. The second part considers the present state of affairs, where we now know that many, if not most, mRNAs are localized to discrete cytoplasmic regions in one or more somatic cell lineages of cellularized embryos, larvae or adults. Assuming that the phenomenon of cytoplasmic mRNA localization represents an underlying functional activity, correlation with the encoded proteins suggests that mRNA localization is involved in far more than neuronal differentiation. Thus, it seems highly likely that past-identified examples represent only a small fraction of localization-based mRNA regulation in somatic cells. The last part highlights recent technological advances that now provide an opportunity for probing the role of mRNA localization in *Drosophila*, moving beyond cataloging the diversity of localized mRNAs to an understanding of how localization affects mRNA activity.

Keywords: Drosophila melanogaster, mRNA localization, organelle, neuronal differentiation, epithelial differentiation

## INTRODUCTION

Following transcription, mRNAs are regulated at multiple points during their lifetime. This begins in the nucleus, where mRNAs undergo selective pre-mRNA splicing, base modification, sequence editing, and directed transport from the nucleus (reviewed in Stapleton et al., 2006; Maas, 2012; Rosenthal, 2015; Meier et al., 2016; Wei et al., 2017; Krestel and Meier, 2018; Schmid and Jensen, 2018; Wegener and Müller-Mcnicoll, 2018). Once mRNAs are exported to the cytoplasm, their translation is not at all guaranteed, as many are sequestered away from ribosomes in a non-translating pool (Patel et al., 2016; Standart and Weil, 2018). In terms of the mechanisms that regulate mRNAs in the cytoplasm, microRNAs (miRNAs), RNA interference (RNAi), and similar pathways are relatively well-characterized (Chandra et al., 2017; Noh et al., 2018). Regulated mRNA transcripts are often associated with cytoplasmic ribonucleoprotein complexes (RNPs), which can regulate translation, e.g., Stress Granules (Buchan and Parker, 2009), or degradation (Towler and Newbury, 2018), or both, e.g., RNA processing (P)-bodies (Standart and Weil, 2018). It is becoming clear that there are multiple examples of mRNA regulation at the level of translation via direct regulation of ribosome binding or processivity (reviewed in Abaza and Gebauer, 2008). All of these different mRNA regulatory pathways are found in Drosophila melanogaster. Among these, the phenomenon of directed cytoplasmic restriction of mRNA localization is commonly observed and has been studied during oogenesis and in the early embryo. However, the underlying roles for this process, beyond the formation of the cellularized blastoderm, remain poorly understood.

#### Identifying Subcellular mRNA Localization in Drosophila

Much of the initial study of mRNA localization in the cytoplasm of Drosophila cells was driven by direct observation of transcript location. Visualization of specific mRNAs first became possible with the adaptation of in situ hybridization (ISH) techniques for Drosophila, in which anti-sense probes (either DNA or RNA) hybridize to mRNA targets in fixed cells or tissues (Singer and Ward, 1982). The first examples of detection of RNAs in fly cells used radiolabeled antisense ISH probes on sectioned ovaries or late-stage embryos (Brennan et al., 1982; Hafen et al., 1983; Levine et al., 1983). Non-radioactive methods followed, using digoxigenin, biotin, or other hapten UTP conjugates to synthesize ISH probes recognized by antibodies conjugated to alkaline phosphatase or peroxidase (Tautz and Pfeifle, 1989; O'neill and Bier, 1994). The development of practical methodologies for fluorescent in situ hybridization (FISH) for Drosophila tissues expanded the utility of ISH, allowing visualization and three-dimensional spatial reconstruction of mRNA localization within the cell by confocal microscopy (Hughes et al., 1996; Hughes and Krause, 1998, 1999). Later enhancements to FISH protocols, including signal amplification techniques, provided brighter signals, facilitating high-throughput screens (Lécuyer et al., 2008; Wilk et al., 2010; Jandura et al., 2017). The utility of FISH was extended again with the development of single molecule (sm) FISH, which allows detection at approximately the resolution of a single mRNA (Femino et al., 2003). Recently, smFISH has been successfully adapted to Drosophila cells and tissues (Bayer et al., 2015; Little and Gregor, 2018; Titlow et al., 2018).

However, our ability to observe the phenomenon of mRNA localization has traditionally exceeded our ability to probe its functional role in regulating the mRNA, in terms of translation or stability. Two types of cells feature prominently in past studies of mRNA localization in Drosophila. The first is the oocyte, which develops as a cyst of 16 germline cells, surrounded by epithelium consisting of somatic follicle cells. The second is the fertilized embryo, a coenocyte with multiple nuclei until 2:10 h of development, when membranes enclose individual nuclei into individual cells, forming a cellularized blastoderm. In the syncytial embryo, the best functionally understood example of mRNA localization is that of anterior bicoid (bcd), which helps establish body polarity, although there are many other known roles in the early embryo and germ cells (reviewed in Cho et al., 2006; Lasko, 2011, 2012; Weil, 2014, 2015; Laver et al., 2015; Yamashita, 2018).

The widespread prevalence of examples of localized mRNA regulation events in oocytes and the early syncytial embryo prompts two alternative viewpoints. Either the nature of egg formation and syncytial development has selected for these events, or cytoplasmic mRNA localization is a widespread event in all cell types and was merely easier to detect in relatively large cells like the early embryo. The preponderance of examples of localized mRNAs and conserved functional requirements for mRNA regulatory proteins during later development suggests that the latter scenario is more likely. Proteins known to regulate mRNA localization in early Drosophila embryo development (e.g., Staufen, Stau) are conserved in metazoans (reviewed in Heraud-Farlow and Kiebler, 2014; Piccolo et al., 2014) or are required in somatic lineages such as neuroblasts that form post-cellularization (St. Johnston et al., 1991; Li et al., 1997; Matsuzaki et al., 1998). Similarly, some mRNAs localized in germ cells or the early embryo, such as Cyclin B; oo18 RNA-binding protein; Protein kinase, cAMP-dependent, catalytic subunit 1; nanos (nos); or Heat Shock Protein 83 (Hsp83), are expressed during later development or conserved in organisms without a syncytial embryo (Raff et al., 1990; Gavis and Lehmann, 1992; Ding et al., 1993; Lantz and Schedl, 1994; Dubowy and Macdonald, 1998; Subramaniam and Seydoux, 1999; Tsuda et al., 2003). As described below, there has been some past evidence showing that mRNA localization is essential in regulating aspects of specific lineages such as differentiating neuroblasts (Knoblich et al., 1995; Broadus et al., 1998; reviewed in Martin and Ephrussi, 2009; Medioni et al., 2012). Additional support for a more widespread role for mRNA localization during later development, which in this review refers to the somatic lineages formed post-cellular blastoderm, comes from ongoing FISH screens (Jambor et al., 2015; Wilk et al., 2016).
These have now enumerated hundreds of mRNAs with specific localization patterns in a wide variety of cell lineages.

To outline the potential scope of mRNA regulation during later Drosophila development, we first describe the known examples. We then speculatively extrapolate potential roles for the large number of mRNAs that have been directly observed as subcellularly localized at these stages. Finally, we highlight new methods that promise to enable the future determination of the functional roles for subcellular mRNA localization in the smaller, somatic cells that form the various tissues of the post-blastula embryo, larvae, and the adult.

#### PAST IDENTIFIED ROLES FOR mRNA LOCALIZATION DURING LATER DROSOPHILA DEVELOPMENT

Currently, the best-characterized examples of functional roles for localized mRNAs during later Drosophila development come from highly polarized cells such as neurons and epithelia. Like the oocyte and early embryo, these cells have a highly polarized morphology, which likely facilitates observation of subcellular localization.

#### Localized mRNAs Direct Neural Stem Cell Differentiation

Embryonic neuroblasts (NBs) are neural stem cells that delaminate stereotypically from the ventral nerve cord during later (stage 9) embryonic development (Hartenstein and Campos-Ortega, 1984). NBs divide asymmetrically from stages 9 to 11, producing one self-renewing daughter and a smaller daughter called a ganglion mother cell (GMC). GMCs differentiate at stage 13 into neuronal and glial lineages. During late embryogenesis, a portion of the NBs become quiescent. During early larval stages, NBs re-enter the cell cycle and begin the second wave of neurogenesis, undergoing multiple rounds of asymmetric cell division before exiting the cell cycle in pupal stages (Homem and Knoblich, 2012). In these cells, mRNA localization is coupled with cell division to direct the asymmetric inheritance of transcription factors that direct differentiation.

The bazooka (baz) mRNA encodes the Drosophila Par-3 homolog and is localized to an apical cytoplasmic crescent in embryonic NBs, reviewed in Homem and Knoblich (2012). Baz protein is also localized in an apical crescent, but specifically in metaphase NBs. Apical Baz is required for proper orientation of the spindle in mitotic NB cells, and localization failure leads to misorientation of the spindle relative to the apical/basal pole, resulting in mispositioning of the GMCs and defects in a portion of GMC fates (Kuchinke et al., 1998). Prospero protein is asymmetrically localized in NBs and is portioned to the GMCs (Hirata et al., 1995; Knoblich et al., 1995). The prospero (pros) mRNA encoding a transcription factor that defines GMC identity is asymmetrically localized, initially at the apical cortex and then to the basal cell cortex during NB cell division (Broadus et al., 1998). The localization of baz and pros requires Stau and Inscuteable (Insc). Stau binds the pros mRNA 3′ UTR directly. Binding of Stau is required for the basal localization of pros mRNA, but not Pros protein (Broadus et al., 1998). Stau localizes to an apical crescent in interphase NB cells, but during mitosis, Stau is found at the basal cortex. Another basally localized protein, Miranda (Mira), is also required for both Pros protein and pros mRNA localization via interaction with Stau (Schuldt et al., 1998). Insc regulates pros mRNA relocalization from the apical to the basal cortex in late interphase to prophase cells (Li et al., 1997). Notably, Insc mRNA is cortical during interphase yet is found throughout the cytoplasm during mitosis, whereas Insc protein is always localized at the apical cortex of NB cells. In embryonic NBs, Egalitarian (Egl) is required for Insc localization (Mach and Lehmann, 1997). Egl, Bicaudal-D (Bic-D) and the Dynein transport complex function during oogenesis and embryogenesis and in embryonic NBs to localize Insc mRNA (Hughes et al., 2004). 
Other Insc regulators have also been identified in NBs, including DEAD-box RNA dependent ATPases that control many aspects of RNA metabolism (reviewed in Putnam and Jankowsky, 2013) and Abstrakt (Abs) required for translation of Insc protein but not for Insc mRNA localization in embryonic neural stem cells or NBs (Irion et al., 2004). Ultimately, despite a relatively well-developed mechanistic knowledge of how pros mRNA is localized, the functional role for this localization remains unclear as pros mRNA and protein are localized independently, and the two pathways may redundantly direct GMC fate (Broadus et al., 1998).

Localization of mRNAs and their encoded proteins is also required for establishing NB polarity during larval neural differentiation. Subcellular localization of mira mRNA is required for this process (Bertrand et al., 1998). Using a combination of an MS2 RNA labeling system and nanobody expression to detect protein, together with misdirection of mira mRNA to nuclear, apical or basal regions, Ramat et al. (2017) identified two pools of mira mRNA during mitosis. One pool localized to the spindle, and the other localized at the basal pole of the NB. When mira mRNA was directed away from the basal pole, there were defects in mitosis (Ramat et al., 2017). Mira protein is co-localized to the basal pole via interaction with mira mRNA, either directly or through recruitment of further factors. This effect is reminiscent of how oskar (osk) mRNA localization in the oocyte is required for localized translation of the Oskar protein, which is then required to maintain osk mRNA localization (Rongo et al., 1995).

Subcellular mRNA localization directs several aspects of embryonic and larval NB development by supporting the establishment of polarity that is essential for NB self-renewal and correct differentiation of GMCs into neurons or glia. The same process is also essential for the correct differentiation of the adult nervous system. As the investigation into the molecular machinery required for NB polarity and asymmetric division continues, it is likely that additional contributions of mRNA localization will be identified as essential. For example, mRNA localization events could regulate how proteins interact with the actomyosin skeleton to direct spindle orientation, which is required for proper NB division and differentiation. New protein players in these processes (e.g., Moesin, Moe) have been identified (Abeysundara et al., 2018), but a similar role for localization of the encoding mRNAs in these processes has not yet been examined.

### LOCALIZED mRNAs REGULATE DEVELOPMENT AND PLASTICITY OF DENDRITES AND AXONS

Neuronal axons can be extremely long, and in many different organisms localized mRNAs have been identified that direct local translation of the corresponding proteins (reviewed in Rodriguez-Boulan and Powell, 1992; Piper and Holt, 2004; Yoo et al., 2010; Jung et al., 2012; Sahoo et al., 2018). Local translation is thought to facilitate rapid cellular responses for events like neuronal circuit-based local remodeling of dendrites and synapse numbers, as the time it takes mRNAs to emerge from the nucleus would drastically slow down a remodeling response (Medioni et al., 2012). Previous ex vivo and in vivo studies in growing Xenopus and mouse axons have demonstrated a clear link between axonal mRNA localization, local translation and the direction of axon growth (Medioni et al., 2012). Remodeling of neurons in terms of pruning, regrowth and branching of axons is required for the refinement of neural circuits governing larval and adult behaviors (Medioni et al., 2012). While it has been assumed that localized mRNAs regulating local protein translation are also conserved in Drosophila axons and dendrites, to date there have been relatively few studies confirming this, some of which are highlighted below (Macdonald and Struhl, 1988; Brechbiel and Gavis, 2008; Misra et al., 2016; reviewed in Rodriguez-Boulan and Powell, 1992).

One Drosophila cell type in which spatiotemporal mRNA localization has been shown to regulate changes in differentiated axons is the mushroom body γ neurons of the larval brain. Mushroom bodies play a role in olfactory learning and memory (Heisenberg et al., 1985; Heisenberg, 2003). During larval development and pupal metamorphosis, mushroom body axonal branches are pruned selectively. These subsequently regrow to form adult-specific branches (Lee et al., 1999; Watts et al., 2003). IGF-II mRNA-binding protein (Imp) was identified by mutagenesis as important for axonal remodeling, or regrowth of axons that have been pruned, but not for their initial growth (Medioni et al., 2014). Using live imaging of pupal brains, it was observed that GFP–Imp is localized to specific RNP particles that move actively via microtubule-dependent transport within axons undergoing remodeling (Medioni et al., 2014). Imp selectively associates with the 3′UTR of chickadee (chic) mRNA (encoding the fly Profilin homolog), which localizes to growing γ neurites (Medioni et al., 2014).

The translational repressor proteins Nos and Pumilio (Pum) are required for germline development and establishing abdominal polarity in the early embryo (Asaoka-Taguchi et al., 1999; Cho et al., 2006). Gain- and loss-of-function studies during later development show that both Nos and Pum are required to regulate dendrite branching in specific subsets of larval dendritic arborization (da) neurons, including class IV (but not class I or II), in the peripheral nervous system (Ye et al., 2004). The shape, branching patterns and growth of the dendrites are correlated with the activity of the neuron. During development, these neurons undergo morphogenesis to form extensive arborization trees, providing easily observed phenotypes, and thus are used extensively for forward-genetic screens in flies. Direct imaging showed that nos mRNA is localized not only in the cell body but also in RNPs that are distributed along the dendrite and axon processes of class IV da neurons, in a process mediated by recognition of sequences in the 3′ UTR (Brechbiel and Gavis, 2008). Live cell imaging of the nos mRNA showed that dynein machinery components are required for transport of nos RNP particles in the dendrites (Xu et al., 2013). In addition, the RBPs Rumpelstiltskin (Rump) and Osk, known to be required for localization of nos mRNA in oocytes, are required for the formation and transport of nos RNP particles in dendrites (Xu et al., 2013).

These known examples confirm that mRNA localization is required both in early development and later, during morphogenesis of differentiated neurons. Intriguingly, many of the localized mRNAs and their localization factors appear to be the same in these two systems. Supporting the conclusion that a common regulatory strategy may be shared between early development and neurogenesis, an RNA interference screen for RNA regulatory proteins that affect dendrite morphogenesis in class IV da neurons identified some proteins and a translation factor previously shown to regulate maternal mRNA localization in embryos and oocytes (Olesnicky et al., 2014). Which mRNAs are implicated in dendrite morphogenesis, and how, will be an area of future study.

Most of the currently known examples of localized mRNA translation in Drosophila neurons mirror those in mammals, supporting the assumption that these events are conserved (reviewed in Medioni et al., 2012). The localization of mRNAs is essential for proper axon guidance and for the formation and remodeling of dendrites to form neural circuits throughout development, as discussed above. The proper localization of mRNAs is also required for memory and learning in both flies and humans (reviewed in Greenspan, 2003; Agnès and Perron, 2004; Puthanveettil, 2013; Olesnicky and Wright, 2018). In addition, genetic screens in flies are starting to identify further functional roles for proteins that likely have a role in mRNA localization (for example Song et al., 2007; Martin and Ephrussi, 2009; Hayashi et al., 2014; Misra et al., 2016).

### LOCALIZED mRNAs ARE REQUIRED AT NEUROMUSCULAR JUNCTIONS

The neuromuscular junction (NMJ) is a highly specialized region where motor neurons synapse to specific muscle targets (Menon et al., 2013). Formation of new synapses is required during early neuronal development, and synapse growth requires targeting of specific mRNAs to the NMJ in addition to the localized recruitment of proteins and organelles (Medioni et al., 2012). It is also thought that the localized translation of mRNAs underlies plasticity at synapses (Kindler and Kreienkamp, 2012; reviewed in Jung et al., 2012). Drosophila larval NMJs have emerged as a powerful in vivo model to study the role of localized mRNAs and localized translation in synaptic development and plasticity. In Drosophila larvae there are 32 motor neurons per abdominal hemisegment, and the NMJ is quite large and easily imaged. Larval NMJs are composed of structures called synaptic boutons that are arranged like beads on a string and exhibit developmental and functional plasticity while being stereotypically organized (Keshishian et al., 1996).

Localized mRNAs and localized translation of mRNAs in the motor neuron or NMJ are required for both the development of synapses and the plasticity of the NMJ presynaptically and post-synaptically. This is mediated by RNPs that are transported along neuronal processes in response to stimuli or development. How RNPs reach the correct location at the NMJ after exiting the nucleus remains an open question. Several groups have shown that RNPs generally move on dynein or kinesin motors, and there are also some studies that implicate actin filaments or actin-based motors (Doyle and Kiebler, 2011; Medioni et al., 2012). Studies using genetic or proteomic approaches have identified some mRNA targets and RNA binding proteins at the NMJ (for example Raut et al., 2017; reviewed in Hörnberg and Holt, 2013), but there are likely many more required in this dynamic structure.

The fly NMJ has also been informative in understanding the underlying mechanisms required for localizing mRNAs in neurons, including the role of the actin cytoskeleton (Packard et al., 2015). The actin-binding protein Muscle-specific protein 300 kDa (Msp300, also known as Syne1) is required to localize specific mRNAs post-synaptically. mRNAs including par-6 and magi are enriched at the postsynaptic region of the NMJ, while others such as discs large 1 (dlg1) are not. In Msp300 mutants, there is a loss of localization of par-6 and magi but not dlg1. This is due to defective transport of the par-6 and magi mRNAs, as opposed to a defect in export from the nucleus or in stability of the mRNA transcripts (Packard et al., 2015). Msp300 was demonstrated to be required for maturation of the synaptic boutons. Msp300 protein is organized into long striated filaments termed "railroad tracks" that extend from the nucleus to the edge of the NMJ (Packard et al., 2015). This organization is thought to work in conjunction with an unconventional myosin motor protein, Myosin 31DF (Myo31DF), for proper localization of these postsynaptic mRNAs (Packard et al., 2015).

Similar to what occurs in neurons and neuronal stem cells, the Drosophila NMJ represents an excellent example of the conservation of mRNA localization events between human and Drosophila (Vazquez-Pianzola and Suter, 2012). Additionally, the mRNA localization events in NMJs repeat themes from earlier developmental stages: localized proteins and mRNAs that function in oocytes and early embryogenesis are also active during later developmental stages. Again, because of its relatively large size and polarized morphology, the Drosophila NMJ is an elegant, easily visualized, and genetically amenable system in which both pre- and postsynaptic roles of localized mRNAs and RNA binding proteins can be analyzed. In summary, localization of mRNAs or RNA binding proteins is an essential part of many aspects of neuronal differentiation and function during later Drosophila development. While many localized mRNAs with localized translation are known, the role of this local translation and the mechanisms that recruit the mRNAs to specific cell domains have yet to be discerned. Further investigation into the specific localization and function of these players should provide further insights into the formation and plasticity of the neuronal system.

### KNOWN ROLES FOR mRNA LOCALIZATION IN EPITHELIAL CELL LINEAGES

The role of mRNA localization in later Drosophila development is far less characterized in cell types other than neurons. The numerous examples of neuronal mRNA localization may be an overrepresentation and may reflect critical morphological features of highly polarized neurons and neuronal stem cells, such as large size and polarization, which facilitate the discovery of localized mRNAs. Notably, many proteins regulating mRNAs in oocytes, early embryos and neurons are expressed in multiple lineages during later development (Brown and Celniker, 2015), making it likely that mRNA targets are regulated by cytoplasmic localization in other cell lineages that compose the majority of gastrulating embryos, larvae, and adults.

#### Localized mRNAs Encoding Proteins Involved in Establishing Epithelial Cell Polarity

Many other proteins involved in establishing apical/basal polarity in epithelial cells have localized mRNAs. Cell junctions are multi-protein structures localized to the apical-lateral or lateral membrane that are best characterized in epithelial cell lineages (Tepass et al., 2001; Cavey and Lecuit, 2009; Tepass, 2012). Atypical protein kinase C (aPKC), Crumbs (Crb), Stardust (Sdt), Baz, and Patj help establish the apical plasma membrane domain and have been shown to interact directly in various cells of epithelial lineage (Tepass et al., 1990; Bhat et al., 1999; Bachmann et al., 2001; Hong et al., 2001; Médina et al., 2002; Nam and Choi, 2003; Hutterer et al., 2004; Sen et al., 2015). Similar to what occurs in neural stem cells, baz mRNA is restricted to a narrow apical domain in the cytoplasm of epithelial cells in late-stage embryos (Kuchinke et al., 1998). A similar pattern of apical mRNA localization was observed with sdt mRNA. The mechanism of this apical transport of sdt mRNA includes alternative splicing of sdt to include an exon that directs apical transport in a dynein-dependent manner (Horne-Badovinac and Bilder, 2008). Notably, a dynein-dependent mechanism also targets the crb mRNA to the apical region of the epithelial-lineage somatic cells (follicle cells) that surround the developing oocyte (Li et al., 2008). For polarized epithelial cells, mRNA localization does seem to have a functional role. Work using mammalian cells also suggests that there may be specialized regulation centers that coregulate mRNAs encoding junctional proteins. Recently, localized translation of collections of mRNAs was shown to be restricted to small cytoplasmic regions above nascent adhesion sites in mammalian amoeboid cell lineages. These regions were termed spreading initiation centers (SICs) (Bergeman et al., 2016). 
It will be particularly interesting to see if there is similar co-regulation of mRNAs encoding adhesion complex proteins in Drosophila embryo and larval cells and if these events are conserved in other organisms.

### APICAL LOCALIZATION OF mRNAs ENCODING SECRETED PROTEINS

mRNAs encoding secreted proteins are directed to the ER by both translation-dependent, signal peptide-mediated pathways and translation-independent pathways (reviewed in Hermesh and Jansen, 2013; Cui and Palazzo, 2014). However, there is evidence that mRNA localization is critical for regulating signaling events between epithelial cells, independently of SRP-mediated trafficking to the ER. In Drosophila epithelial cells, the mRNA encoding wingless (wg) is directed to the region just under the apical plasma membrane, within the cytoplasm of ectodermal cells, in stage 4–6 embryos. This mRNA localization is required for the production of an active Wg signaling protein (Simmonds et al., 2001). Notably, wg mRNAs are associated with punctate cytoplasmic RNP particles that are transported to the apical cytoplasm by a dynein-dependent mechanism (Wilkie and Davis, 2001; Najand and Simmonds, 2007). The cis-acting signals for wg mRNA localization and anchoring are found within the 3′UTR of the mRNA and direct aggregation of multiple wg mRNAs, which appear as discrete cytoplasmic foci (Simmonds et al., 2001; Najand and Simmonds, 2007; Dos Santos et al., 2008). The regulation of translation and the localization of wg mRNA are not linked directly, as non-translatable wg mRNAs and reporter genes fused to the wg 3′UTR are localized as well as mRNAs with an intact open reading frame (Simmonds et al., 2001; Najand and Simmonds, 2007). The requirement for apical localization of the wg mRNA also raises the question of where the translated protein enters the ER/Golgi complex for secretion. There are examples of apically localized sub-regions of the ER in highly polarized cells that also have multiple examples of localized mRNAs, such as Drosophila neuroblasts (Smyth et al., 2015; Eritano et al., 2017), but the coincidence of wg mRNA and specialized ER domains has not yet been studied. 
Thus, similar to what has been shown for neurons, there is evidence that mRNA localization has a functional role in polarized epithelia in Drosophila embryos after they cellularize. However, how these mRNA localization events regulate the encoded proteins remains mostly elusive.

### The Present State of Affairs: There Are Many Different Localized mRNAs in Many Different Cell Lineages During Later Drosophila Development

Until recently, roles for mRNA localization had been found in only a few of the somatic cell types that make up the gastrulating embryo, and mRNA localization was not considered to be prevalent in larval and adult tissues. However, in the past few years, the number of known localized mRNAs in later development has increased significantly. Systematic screens have identified localized mRNAs in numerous embryo somatic cell lineages, larval gut, imaginal discs, salivary glands and adults, which has significantly changed how mRNA localization is viewed in terms of Drosophila development. Firstly, most mRNAs manifest some pattern of subcellular localization in one or more cell lineages. Secondly, localized mRNAs encode a wide variety of proteins with diverse functions, far more than the few that had been previously characterized.

### DETERMINING THE EXTENT OF mRNA LOCALIZATION DURING LATER DROSOPHILA DEVELOPMENT

The advent of aptamer tags based on specific mRNA hairpin motifs facilitated the tracking of mRNAs in live cells (reviewed in Weigand and Suess, 2009). Different RNA aptamers recruit specific binding proteins, which are fused to fluorescent proteins that mark the localization of the tagged mRNA within the cytoplasm. The most common of these is MS2 tagging, using an RNA motif bound by the MS2 coat protein (Bertrand et al., 1998; reviewed in Heinrich et al., 2017a). A transgene encoding MS2 coat protein fused to GFP and a nuclear localization signal (MS2-GFP) is coexpressed in cells with the MS2-tagged mRNA. Transgenic mRNAs carrying multiple MS2 aptamers (e.g., 24x) recruit MS2-GFP, which prevents it from entering the nucleus and marks the tagged mRNAs. These techniques have been adapted for Drosophila (reviewed in Abbaszadeh and Gavis, 2016) and have facilitated screening based on direct observation of mRNA localization in live neurons. Misra et al. (2016) performed such a screen in class IV da neurons using semi-random transposon insertion of an MS2 RNA aptamer into the genome to track the encoded mRNAs. A total of 541 lines were screened, and 47 genes had transcripts that were subcellularly enriched in class IV da neuron processes (Misra et al., 2016). Many of the encoded proteins were previously associated with subcellularly localized mRNAs, including CG9922, coracle (cora), fatty acid binding protein (fabp), scheggia (sea), High mobility group protein D (HmgD), and schnurri (shn) (Misra et al., 2016).

An alternative to insertional screens is direct observation of mRNA localization by ISH. In the past few years, several groups have provided a significant resource to the fly community via large-scale FISH screens that assay the localization of thousands of different mRNAs in cells of late-stage embryos, larvae and adults (Olesnicky et al., 2014; Jambor et al., 2015; Wilk et al., 2016; Zhang et al., 2017). Many of these are publicly available in the searchable Fly-FISH database (http://fly-fish.ccbr.utoronto.ca/) and the Dresden Ovary Table (DOT) database (http://tomancak-srv1.mpi-cbg.de/DOT/main.html) (Wilk et al., 2013, 2016; Jambor et al., 2015). Of particular interest, 167 mRNAs have different localization patterns in different cell types or at different times during development, suggesting that localization is dynamic and cell lineage dependent (Wilk et al., 2010). Examination of this relatively unbiased screening data suggests that, rather than being a rare event, restricted distribution within the cell applies to at least half of the mRNAs. For example, the Fly-FISH database reports localization data for approximately 6,800 mRNAs expressed in postsyncytial (past stage 4) embryos and larval tissues via low-magnification FISH images (Wilk et al., 2010). Of these, 3509 (52%) are annotated as having a subcellular localization pattern in embryonic or larval tissues (Wilk et al., 2016) (**Figure 1**). Notably, one of the most commonly annotated patterns of localization was "cytoplasmic foci" (Wilk et al., 2016), which may suggest incorporation into one or more cytoplasmic RNPs or organelles. The localization data from these screens, as well as examples curated from the literature, have been collated in searchable databases such as RNALocate (http://www.rnasociety.org/rnalocate) (Zhang et al., 2017). 
However, the precise number of unique localized mRNAs is hard to determine, as different groups use variable language and ambiguous or nonstandard gene annotations.
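The counting problem described here can be illustrated with a minimal sketch: names reported by different screens are collapsed onto one stable identifier per gene before unique genes are counted. The synonym table, the `ID_000x` identifiers and the screen names below are hypothetical placeholders invented for illustration. They are not real database identifiers and not data from any of the cited screens.

```python
from collections import defaultdict

# Hypothetical synonym table: each reported name -> one stable placeholder ID.
# Drosophila gene symbols are case-sensitive, so names are matched exactly
# (no case folding); every spelling variant needs its own entry.
SYNONYMS = {
    "baz": "ID_0001", "bazooka": "ID_0001", "Bazooka": "ID_0001",
    "sdt": "ID_0002", "stardust": "ID_0002", "Stardust": "ID_0002",
    "crb": "ID_0003", "crumbs": "ID_0003",
}

def merge_screens(screens, synonyms):
    """Collapse per-screen gene names into unique IDs.

    Returns (dict of ID -> set of screens reporting it,
             sorted list of names that could not be mapped).
    """
    hits = defaultdict(set)
    unmapped = set()
    for screen, names in screens.items():
        for name in names:
            gene_id = synonyms.get(name.strip())
            if gene_id is None:
                unmapped.add(name)   # left for manual curation
            else:
                hits[gene_id].add(screen)
    return dict(hits), sorted(unmapped)

# Toy input: two made-up screens reporting overlapping genes under different names.
screens = {
    "screen_A": ["baz", "Stardust", "crb"],
    "screen_B": ["Bazooka", "sdt", "unknown-gene"],
}
unique, unmapped = merge_screens(screens, SYNONYMS)
print(len(unique), "unique genes; unmapped:", unmapped)
```

Names that cannot be mapped are returned separately rather than dropped, so they can be corrected by hand.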

### Many Localized mRNAs Cluster by Protein Function

From the above databases, 4049 unique genes are annotated as "subcellularly localized" in cells of post-syncytial embryos (stage 4+), larvae and adults. To define the scope of mRNA localization during later development, commonalities of the proteins encoded by localized mRNAs were identified. To disambiguate differences in gene names reported by different screens, mRNA lists were validated using the "ID converter" function of FlyBase (http://flybase.org/convert/id) (Gramates et al., 2017). Non-protein-coding genes were eliminated from the combined list and ambiguous gene names were corrected manually. The resulting list of 3549 unique genes was clustered by gene ontology (GO) terms based on the "cellular component" of the proteins encoded by each (Ashburner et al., 2000; Tweedie et al., 2009; Consortium, 2017). A PANTHER Overrepresentation Test (version 13.1) identified GO terms that were enriched among proteins encoded by localized mRNAs, compared to their frequency in the whole genome (**Table 1**). The proportional overrepresentation of GO terms relative to the whole genome was visualized using REVIGO (**Figure 2**) (Supek et al., 2011).
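The logic of an overrepresentation test can be sketched for a single GO term as a one-sided hypergeometric test. This is an illustrative stand-in only: the statistic and multiple-testing correction actually applied by PANTHER may differ. Apart from the list size of 3549 taken from the text, all counts (genome size, term sizes) are hypothetical.

```python
from math import comb

def hypergeom_enrichment(k, n, K, N):
    """One-sided hypergeometric overrepresentation test for one GO term.

    k: genes in the localized list annotated with the term
    n: total genes in the localized list
    K: genes in the reference genome annotated with the term
    N: total genes in the reference genome
    Returns (fold_enrichment, p_value), where p_value = P(X >= k).
    """
    expected = n * K / N                 # count expected by chance
    fold = k / expected
    total = comb(N, n)                   # all ways to draw the list
    tail = sum(comb(K, i) * comb(N - K, n - i)
               for i in range(k, min(n, K) + 1))
    return fold, tail / total

# Hypothetical counts for one term: 200 of the 3549 localized genes carry it,
# versus 400 of an assumed 13900 genes in the reference set.
fold, p = hypergeom_enrichment(k=200, n=3549, K=400, N=13900)
print(f"fold enrichment = {fold:.2f}, p = {p:.3g}")
```

Exact integer arithmetic from `math.comb` avoids floating-point underflow for large lists. In a real analysis the test would be repeated for every GO term and the p-values corrected for multiple testing.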

As expected, over 200 localized mRNAs encode proteins destined for the plasma membrane. Many of these mRNAs encode secreted proteins or proteins that are integral to membranes that are trafficked for translation into the ER. However, the localization patterns of these mRNAs included both "basal" and "apical" as well as "cytoplasmic foci," suggesting that there are different modes of regulation. Mirroring the known roles for mRNA localization in regulating junctional proteins, another overrepresented superset of terms encompassed cytoskeletal elements including "apical cytoskeleton" and "junction." Of particular interest, a significantly overrepresented cluster comprised proteins trafficked to specific organelles including the mitochondria, lysosome, centrosomes/spindles, and peroxisomes.

The role of localized translation at organelles is an emerging area of interest in other organisms and is likely conserved in most cells, including those of later development. The concept of nucleic acids being targeted to specific organelles to direct protein translation has been suggested in several organisms (reviewed in Weis et al., 2013). Except for mRNAs that encode secreted proteins and are directed to the ER, few functional examples of mRNAs targeted to specific organelles are known in Drosophila. Remarkably, the GO-enrichment analysis of mRNAs localized during later Drosophila development indicated the prevalence of two terms not previously strongly associated with mRNA localization events: peroxisomes and centrosomes. The association of localized mRNAs with centrosomes is particularly interesting in light of how mRNA localization contributes to defining spindle orientation and differentiation of neuronal stem cells, as described above. In some cases, the observed localization pattern of the mRNA correlates directly with the organelle, suggesting local regulation of translation. In other cases, the lack of correlation of the mRNA location with the organelle suggests other regulatory events. Below we consider three examples of potential roles for mRNA localization in regulating cell cortex/junction-mediated polarity, peroxisomes or centrosomes during later Drosophila development.

#### MANY LOCALIZED mRNAs ENCODE CELL CORTEX AND JUNCTION PROTEINS

Given the previous demonstrations of a functional role for transcript localization for the junctional proteins baz, sdt, and crb, it is perhaps not particularly surprising that the mRNAs encoding other proteins involved in the establishment or maintenance of cellular junctions are also localized. The mRNA encoding Moe is concentrated apically in follicle cells at stage 10 (Jambor et al., 2015). Moe is the single fly ortholog of Ezrin-Radixin-Moesin (ERM) proteins that link the apical membrane to the cortical actin cytoskeleton (Solinet et al., 2013). There is some evidence that Moe may also regulate mRNA export from the nucleus (Kristó et al., 2017). More notably, Moe interacts directly with other proteins encoded by mRNAs localized apically in cellularized embryos, including Crb and Patj (Médina et al., 2002). The Moe protein has also been reported to interact with Eb1, Chic (Profilin) (Medioni et al., 2014), and Chd64, a transgelin 2 ortholog (Guruharsha et al., 2011). The mRNAs encoding each of these proteins are localized apically in follicle cells (Jambor et al., 2015). Notably, Chd64 mRNA shows basal enrichment in stage 6–7 embryos (Wilk et al., 2016), implying a potential regulatory mechanism sequestering translation away from the apical membrane. The Patj mRNA is localized adjacent to the apical cell junction in the epithelial cells in stage 4–17 embryos (Wilk et al., 2016). Patj interacts with Par-6, Sdt, and aPKC, again all encoded by mRNAs localized apically. Similarly, the expanded (ex) mRNA is localized apically in stage 2–10 follicle cells (Jambor et al., 2015). Ex is an EPB41/Protein 4.1+ERM (FERM) domain protein that localizes to apical cell-cell junctions (McCartney et al., 2000). Merlin (Mer) is an apically localized regulator of the adherens junction (Lajeunesse et al., 1998) and the Mer mRNA is localized apically in follicle cells (Jambor et al., 2015). The mRNA encoding Dlg1 described earlier is also localized at the membrane in late-stage embryos (Wilk et al., 2016).

#### TABLE 1 | PANTHER analysis of enrichment of GO terms for 3549 mRNAs shown to be localized post-syncytial-stage (+2:10) of embryo development compared to the distribution of the same GO terms over the entire *Drosophila* genome.

Many mRNAs that have roles in embryo epithelial cell or follicle cell polarity, including the organization of the cellular cortex, are localized in Drosophila. The mRNA encoding Drosophila Shroom is localized apically in stage 2–8 follicle cells and cellularized embryos (stage 6–9) (Jambor et al., 2015; Wilk et al., 2016). Shroom encodes two isoforms, both of which localize apically, one to the adherens junction and one to the apical membrane (Bolinger et al., 2010). Shroom proteins regulate cell morphology in animals by acting on the actin/myosin network during gastrulation (Lee et al., 2009). Similarly, the Dystrophin (Dys) mRNA is restricted to the cortex of embryo epithelia, and similar cortical enrichment is seen in follicle cells, somatic cells and border cells (stage 9–10) (Jambor et al., 2015; Wilk et al., 2016). Dys is best known for its role in anchoring membrane/cytoskeletal elements in contractile muscle (Constantin, 2014). However, Drosophila Dys is also involved in establishing cellular polarity in imaginal discs and oocytes (Dekkers et al., 2004). Notably, a similar role for Dystrophin has also been shown for mammalian muscle stem cells (Keefe and Kardon, 2015). Another mRNA with cortical localization in the embryo is Tropomyosin 1 (Tm1), encoding a protein involved in muscle contraction, oogenesis, and regulation of osk mRNA localization (Erdélyi et al., 1995; Veeranan-Karmegam et al., 2016; Gáspár et al., 2017). Several other actin-interacting proteins have mRNAs with cortical localization in follicle cells, including Jitterbug (Jbug, Drosophila filamin) (Jambor et al., 2015). However, unlike what has been shown previously, localization of the mRNAs encoding these junctional proteins does not always appear to facilitate local protein translation. In the case of Jbug and short stop (shot, encoding Drosophila Spectraplakin), both mRNAs are localized basally in stage 6–9 embryo ectodermal cells (Wilk et al., 2016), away from the region where the protein is localized. Other apically localized mRNAs that encode proteins that interact with cytoskeletal elements include Scraps (Scra, orthologous to Anillin), Singed (Sn, the Drosophila ortholog of Fascin actin-bundling protein 1) and α-actinin (Actn) (Jambor et al., 2015; Wilk et al., 2016).

#### PEROXISOMES

Peroxisomes are cytosolic organelles involved in lipid metabolism and detoxifying reactive oxygen species (Smith and Aitchison, 2013). The Peroxin (Pex) genes encode proteins involved in peroxisome biogenesis, a process that includes budding of vesicles from the ER that fuse and mature by importing peroxisome enzymes (Pex1 and Pex5), or peroxisome fission (Pex11); these genes are conserved in Drosophila (Baron et al., 2016). It has been proposed that localization of Pex mRNAs to peroxisomes may direct translation into the peroxisomal membrane, tethering them in a fashion reminiscent of ER-protein targeting (Haimovich et al., 2016a). Pex proteins are highly conserved among yeast, humans and Drosophila (Mast et al., 2011; Faust et al., 2014; Baron et al., 2016). Traditional FISH screens identified multiple mRNAs encoding peroxisome-associated proteins as localized in various cell lineages in late-stage Drosophila embryos (**Table 2**). However, correlating a functional role between the location of mRNAs and their product is difficult, as mRNAs encoding proteins involved in the same process are trafficked to different cytoplasmic regions (e.g., Pex16 and Pex19). Particularly interesting is the pattern of Pex5 mRNA, encoding the cytoplasmic transporter that directs proteins to the peroxisome. Pex5 mRNA was observed in foci surrounding the nucleus, while the mRNA encoding the Pex5 recycling protein Pex1 is restricted apically. In addition to the Peroxins, there are also 20 localized mRNAs encoding peroxisome resident enzymes, including Catalase (Cat, **Table 2**) (Wilk et al., 2016).

Studies in yeast have also shown that a significant number of peroxin mRNAs localize to peroxisomes or other peroxisome-associated organelles (e.g., Pex3 and the ER). In yeast, the Puf5 RNA binding protein, related to Drosophila Pum, is required for Pex14 localization to the peroxisome (Zipor et al., 2009). Puf5p also binds the yeast-specific Pex22 mRNA (Gerber et al., 2004). It has been proposed that the association of mRNAs encoding cytoplasmic Pex proteins with peroxisomes fosters local translation and insertion of peroxisomal membrane proteins (Weis et al., 2013; Haimovich et al., 2016a). However, other mRNAs associated with the exterior membrane of peroxisomes isolated from mouse liver were also identified (Yarmishyn et al., 2016). These included mRNAs encoding Pex6, Pex11a/b, Pex19, and Peroxisomal Membrane Protein 70 kDa (Pmp70), as well as mRNAs encoding homologs of several peroxisome-localized enzymes, including Hmgcs1, Acaa1a/b, Hsd17b4, Paox, Nudt7, Acox, Baat, and Acsl5 (Yarmishyn et al., 2016). However, FISH was used to confirm peroxisome localization of only one of these mRNAs, Hmgcs1 (Yarmishyn et al., 2016). While the prevalence of mRNA localization of peroxisome mRNAs is striking and suggests a functional role, the localization patterns of these mRNAs encompass apical and basal restriction, perinuclear patterns and cytoplasmic foci in embryo ectoderm and various larval tissues (Wilk et al., 2016). The conservation of Pex mRNA localization in Drosophila supports the idea that this event may have functional consequences during peroxisome biogenesis, fission or steady state homeostasis.

#### TABLE 2 | Localized mRNAs encoding peroxisome proteins (Wilk et al., 2013).

#### CENTROSOMES

As described above, there are several known roles for mRNA localization in defining the orientation of the mitotic spindle in Drosophila neural stem cells. However, systematic FISH assays suggest that mRNAs encoding several components of the centrosome or spindle are themselves localized, suggesting a more direct role for mRNA localization. Centrosomes are found exclusively in metazoan cells (Bornens, 2012). Centrosomes encapsulate centrioles in an electron-dense pericentriolar material (PCM) of dynamic composition and size (Brito et al., 2012). During interphase, the centrosome acts as the primary cellular microtubule-organizing center (MTOC) involved in cellular trafficking, motility, adhesion, and polarity, while during mitosis, it helps establish the spindle. Following mitosis, the centrosome contains both a mature centriole (mother) and a newly formed immature centriole assembled during the previous cell cycle (daughter). Assembly of the daughter centrosome occurs during S-phase, through recruitment of PCM proteins to the daughter centrioles (Gogendeau and Basto, 2010; Nigg and Stearns, 2011; Brito et al., 2012; Habermann and Lange, 2012; Mahen and Venkitaraman, 2012).

A functional role for recruitment of mRNAs to the centrosome/spindle has been posited and then subsequently discounted several times. The first general reports of nucleic acids within the centrosome were made in the mid-twentieth century (Stich, 1954; Ota and Shimamura, 1956; Rustad, 1959; Zimmerman, 1960; Ackerman, 1961; Hartman et al., 1974; Dippell, 1976; Moyne and Garrido, 1976; Zackroff et al., 1976; Rieder, 1979; Snyder, 1980). RNA as a potential spindle or centrosome component was first described 40 years ago (Heidemann et al., 1977; Peterson and Berns, 1978). However, lacking functional data, these findings were discounted as contamination. The most compelling empirical support for a functional role for mRNA localized at the spindle comes from Ilyanassa (snail) embryos (Lambert and Nagy, 2002; Kingsley et al., 2007). Similarly, localization of mRNAs encoded by centrosome genes was observed in Spisula solidissima (clam) oocytes (Alliegro et al., 2006, 2010; Alliegro and Alliegro, 2008). Finally, cytoplasmic mRNA regulation is required for normal spindle pole body function in yeast, although it is not known if this is required generally or locally at the MTOC (Unger, 1977; Volpe et al., 2003; Sezen et al., 2009).

Seventy-one Drosophila mRNAs encoding centrosome-, spindle-, or centriole-associated proteins were localized to the spindle/centrosome in late-stage embryos or follicle cells. Notably, some were annotated (**Table 3**) with a centrosome or spindle localization pattern (e.g., Centrocortin [Cen] and Girdin), or a perinuclear pattern which would encompass centrosomes (α-Tubulin at 84B [α-Tub84B], pavarotti [pav], centrosomin, Grip91, spindle defective 2 [spd-2], and γ-Tubulin at 37C). However, some of these mRNAs (α-Tub84B, Cen, Girdin, pav, spd-2), as well as others, localized to cytoplasmic foci (Gamma tubulin ring protein 91, Kinesin-like protein at 10A, non-claret disjunctional, Spindle assembly abnormal 6, and scrambled). The annotation of cytoplasmic foci could encompass centrosomes but could equally encompass other destinations, including regulatory RNPs. Notably, the stages or tissues where these were annotated vary considerably (**Table 3**), implying the potential for developmental regulation as well. The Drosophila centrosome proteome is well-characterized, facilitating direct correlation of localized mRNAs with centrosomally localized proteins. Müller et al. identified 24 known and 227 previously unknown centrosome-associated proteins via mass spectrometry (Muller et al., 2010). Notably, the mRNAs encoding Aurora A and Polo kinases, involved in regulating spindle formation/mitosis, were localized to cytoplasmic foci and, for Polo, also in a perinuclear pattern (Jambor et al., 2015; Wilk et al., 2016). Given the high degree of correlation between centrosome proteins and centrosome/perinuclear localized mRNAs, a role for local regulation of translation in centrosome-related functions in Drosophila is an attractive hypothesis.

### Looking to the Future: Drosophila Is Well-Positioned to Advance Understanding of the Role of mRNA Localization During Later Development

The Drosophila oocyte and early embryo have provided a wealth of knowledge regarding the prevalence, regulation and functional roles of localized mRNAs during early development. What remains to be determined is whether this regulatory event is similarly functionally prevalent during later development. It is known that localized mRNAs regulate polarized neural lineages. It will be interesting to see what roles localized mRNAs have in other stem-cell populations and the various other polarized cell lineages required for development into the adult form. The examples considered above are only a sample of the thousands of mRNAs annotated as localized in one or more cell types during later-stage Drosophila development (**Figure 1**), yet they are indicative of the existence of yet-to-be-discovered regulatory examples. The current challenge in the field is now to determine which of the long list of transcripts that show subcellular localization represent functional regulatory events.

Unfortunately, in most cases, the phenomenon of localization has not been linked to an effect on mRNA translation or degradation. It is possible to assay the translation state of a specific mRNA by determining the presence or absence of ribosomes. Traditionally, the state of translation has been tested by profiling the mRNAs associated with purified polysomes (reviewed in Chassé et al., 2017; Seimetz et al., 2018), but this runs the risk of not detecting specific local differences in translation. Recently, several new methods have been reported that link single molecule RNA (smRNA) imaging to detection of ribosomes on that specific mRNA. The first of these, "translating RNA imaging by coat protein knockoff" (TRICK), was shown to be viable in Drosophila oocytes (Halstead et al., 2015). TRICK detects the initial passage of ribosomes along the ORF of an mRNA expressed from a specially constructed reporter. These mRNAs can be individually tracked in live cells via 24x MS2 aptamers in the 3′UTR. Passage of the ribosome along the ORF of the reporter displaces the fluorescent PP7 binding proteins associated with 6xPP7 aptamer sequences cloned in frame with the protein sequence. In 2016, several groups published methods combining aptamer-based detection of single mRNAs in living cells with different approaches to detect ongoing protein translation (Morisaki et al., 2016; Pichon et al., 2016; Wang et al., 2016; Wu et al., 2016; Yan et al., 2016). However, the utility of these methods to study gastrulating embryos or dissected larval tissues has not yet been established, as they all rely on single-molecule imaging using microscope techniques developed for relatively thin tissues or individual cells. These aptamer-tagging methods should also be approached with caution: several groups have recently urged care in interpreting the localization or degradation of mRNAs that include these tags. There has been considerable back-and-forth regarding whether introducing aptamer tags that recruit a large protein complex (e.g., multiple MS2-GFP) affects mRNA stability and localization (Garcia and Parker, 2015, 2016; Haimovich et al., 2016b; Heinrich et al., 2017b). Recently, the Singer laboratory has developed a modified form of the MS2 aptamer that should allow for more "normal" recruitment to mRNA processing bodies or mRNA localization (Tutucci et al., 2018). These live-cell methods to track mRNAs can be coupled to complementary methods that image newly synthesized proteins, including fluorescent non-canonical amino acid tagging (FUNCAT) using a modified methionine analog (Tom Dieck et al., 2015) or tetracysteine (TC) motifs that bind biarsenical fluorescent dyes (Rodriguez et al., 2006).

#### TABLE 3 | Localized mRNAs encoding centrosome proteins (Wilk et al., 2013).

Alternative approaches to tracking individual mRNAs in live cells, which employ aptamers that bind and induce fluorescence of various chemicals (e.g., Broccoli or RNA Mango), have also been developed. These would not suffer from the effect of recruiting additional protein complexes to an mRNA (Paige et al., 2011; Dolgosheina et al., 2014; Filonov et al., 2014; Autour et al., 2018). One of the newest of these chemical/aptamer systems that shows promise is Riboglow, a riboswitch-based system that recruits a Cobalamin + fluorophore combination (Braselmann et al., 2018). Cobalamin effectively quenches the fluorophore until bound by the RNA target. While Riboglow has shown promise with respect to signal-to-noise, it requires bead loading to get the detection reagent into cells, which may limit its use in Drosophila late-stage embryos and larval tissues. Overall, the potential advantage of chemical/aptamer systems over aptamer/protein-FP combinations is that a chemical/aptamer complex is significantly smaller than a fluorescent protein/aptamer complex. A second advantage is that with chemical/aptamer pairs, fluorescence is only induced upon target binding. The major disadvantage is that, unlike MS2 or similar systems, the fluorescent signal is relatively weak, especially when single aptamers are used. Thus, these work well for relatively highly expressed RNAs like rRNAs but have not yet been shown to be practical for less abundant mRNAs, nor would they be sufficient for imaging mRNA localization within complex tissues. As with aptamer-based live-cell mRNA detection, these methods can also be coupled with complementary detection of newly translated protein.

All of the methods described above for correlating translation with a localized mRNA depend on expressing transgenes that produce highly modified mRNAs with multiple different inserted motifs, making them relatively impractical for high-throughput approaches. However, a method to correlate ribosomes and mRNAs expressed from endogenous, unmodified genes in fixed cells, based on smFISH to ribosomal RNAs, has recently been described that would be practical for high-throughput use in whole fly embryos, analogous to the FISH screens described above. The FLuorescence Assay to detect Ribosome Interactions with mRNA (FLARIM) (Burke et al., 2017) is an extension of the single molecule hybridization chain reaction (smHCR) (Shah et al., 2016). This method uses two different smFISH probe sets detecting the ORF and the 18S rRNA marking the ribosome (Burke et al., 2017). FLARIM uses smHCR for smRNA detection of the target and 18S RNA detection, but other smFISH methods such as quantitative Forced InTeraction (qFIT) or other low-cost FISH-based procedures should be similarly effective (Gaspar et al., 2017). Detection of localized mRNA decay is also possible, via co-localization of specific transcripts with known cytoplasmic RNPs (reviewed in Towler and Newbury, 2018). Additionally, smFISH approaches have been shown to be able to measure mRNA decay in yeast or trypanosomes (Kramer, 2017; Trcek et al., 2018), although these have not yet been adapted to Drosophila.
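As a purely illustrative sketch of how two smFISH spot sets (an ORF probe set and an 18S rRNA probe set) might be scored for co-occurrence, the snippet below counts the fraction of mRNA spots that have a ribosome spot within a cutoff distance. The function name, the coordinate format and the 200 nm cutoff are assumptions made for this example; they are not taken from Burke et al. (2017) and do not reproduce the published FLARIM analysis.

```python
import math

def fraction_ribosome_associated(mrna_spots, rrna_spots, max_dist_nm=200.0):
    """Fraction of mRNA spots with at least one rRNA spot within max_dist_nm.

    Spots are (x, y, z) tuples in nanometres. The cutoff is a hypothetical
    parameter chosen for illustration, not a published value.
    """
    if not mrna_spots:
        return 0.0
    associated = sum(
        1
        for m in mrna_spots
        if any(math.dist(m, r) <= max_dist_nm for r in rrna_spots)
    )
    return associated / len(mrna_spots)

# Hypothetical coordinates: one mRNA spot lies near a ribosome signal, one does not.
mrna = [(0.0, 0.0, 0.0), (1000.0, 0.0, 0.0)]
rrna = [(50.0, 0.0, 0.0)]
print(fraction_ribosome_associated(mrna, rrna))
```

In practice the cutoff would need to reflect the localization precision of the microscope, and because rRNA signal is abundant, chance co-occurrence would have to be estimated separately.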

The other recent advance that will facilitate examination of the functional roles of mRNA localization in the relatively smaller cells that comprise the bulk of later embryo, larval and adult development is super resolution (SR) microscopy. These techniques have now been adapted to Drosophila (reviewed in Rodal et al., 2015). Recently, 3D-Structured Illumination Microscopy (SIM), using the Deltavision OMX system, was used to detect smFISH signals at the Drosophila NMJ (Titlow et al., 2018). Another SR technique that shows promise for live imaging in later-stage Drosophila embryos is lattice light-sheet microscopy (LLSM) (Planchon et al., 2011; Chen et al., 2014). Other LSM methods that have recently been shown to be suitable for imaging later Drosophila embryos (e.g., +15 h) include reflected LSM (R-LSM) (Greiss et al., 2016) and Tilted LSM (TLSM) (Fadero et al., 2018; Gustavsson et al., 2018). An alternative method to improve resolution within cells of later-stage Drosophila embryos or tissues is expansion microscopy (ExM). ExM effectively makes cells/tissues larger via treatment with polymer hydrogels (Chen et al., 2015). ExM has been shown to be compatible with smFISH methods including HCR FISH, facilitating imaging of mRNA localization in cells deep within tissues (Chen et al., 2016). ExM has recently been shown to be feasible for examining later stages of fly development, including tissues such as the adult brain (Mosca et al., 2017; Gao et al., 2019). Most importantly for the study of late-stage fly development, ExM can be performed in tandem with enzymatic digestion of the embryo cuticle, facilitating tissue expansion in late-stage embryos (Jiang et al., 2018).

#### Identifying the Mechanisms That Localize mRNAs During Later Development

While outside the scope of this review, the other major challenge that remains is to determine the conservation of mechanisms that direct mRNA to specific cytoplasmic regions in the various cells of the late-stage embryo, larva or adult, and what roles this localization plays in regulating the encoded protein. As of FlyBase release FB2018\_03 (June 16, 2018), there are 913 known RNA binding proteins encoded by the Drosophila genome (Gramates et al., 2017). Many of the proteins known to localize during oocyte or early embryo development are expressed in one or more lineages in later development. Notably, there have been several advances in identifying protein binding to specific RNAs that would be compatible with use in complex tissues (Lee et al., 2013; Di Tomasso et al., 2016; Autour et al., 2018; reviewed in Faoro and Ataide, 2014).

### The Future Is Bright for Studying mRNA Localization During Later Drosophila Development

The study of mRNA localization in Drosophila, especially during later development, is at an exciting crossroads. The wealth of data from traditional FISH-based screens provides a valuable resource outlining the full scope of localized mRNAs, encoding proteins involved in multiple cellular processes, and the possibility that these processes may be linked to localized transcripts in other organisms. The advent of contemporary smFISH techniques, including those that can also locally detect translational states, provides viable avenues to correlate existing phenomenological observations with functional roles. Recent improvements in the microscope resolution available for fluorescent imaging, as well as the advent of workflows for robust and practical three-dimensional electron-microscope imaging (Xu et al., 2017), will improve the capacity to observe subcellular mRNA restriction and specific organelle localization, especially in the relatively smaller-sized epithelial cell lineages that form developing tissues. These methods will help link localization of mRNAs during later-stage Drosophila development to what is undoubtedly an equally broad spectrum of functional consequences of the proteins they encode.

#### AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

#### REFERENCES


#### ACKNOWLEDGMENTS

We thank Sophie Keegan for critical reading of this manuscript. This work was supported by a Natural Sciences and Engineering Research Council of Canada grant (RGPIN-2017-05885) to AS and a Canadian Institutes of Health Research grant MOP142212 to SH. During the writing of this review, we came to realize that the body of literature on mRNA regulation during later Drosophila development is incredibly large. We apologize to all those whose research was not included due to space constraints.




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Hughes and Simmonds. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Differential Regulation of the Three Eukaryotic mRNA Translation Initiation Factor (eIF) 4Gs by the Proteasome

Amandine Alard<sup>1</sup> , Catherine Marboeuf<sup>1</sup> , Bertrand Fabre<sup>1</sup>† , Christine Jean<sup>1</sup> , Yvan Martineau<sup>1</sup> , Frédéric Lopez<sup>1</sup> , Patrice Vende<sup>2</sup> , Didier Poncet<sup>2</sup> , Robert J. Schneider<sup>3</sup> , Corinne Bousquet<sup>1</sup> and Stéphane Pyronnet<sup>1</sup> \*

<sup>1</sup> INSERM UMR1037, Centre de Recherche en Cancérologie de Toulouse, Equipe Labellisée Ligue Contre le Cancer and Laboratoire d'Excellence Toulouse Cancer, Université de Toulouse, Toulouse, France, <sup>2</sup> UMR9198 CEA, Institut de Biologie Intégrative de la Cellule (I2BC), Centre National de la Recherche Scientifique, Université Paris-Sud, Gif-sur-Yvette, France, <sup>3</sup> School of Medicine, New York University, New York, NY, United States

#### Edited by:

Maritza Jaramillo, University of Quebec, Canada

#### Reviewed by:

Motoaki Wakiyama, RIKEN, Japan Jerry Pelletier, McGill University, Canada

#### \*Correspondence:

Stéphane Pyronnet stephane.pyronnet@inserm.fr

†Present address:

Bertrand Fabre, Technion Integrated Cancer Center, The Rappaport Faculty of Medicine and Research Institute, Haifa, Israel

#### Specialty section:

This article was submitted to RNA, a section of the journal Frontiers in Genetics

Received: 07 August 2018 Accepted: 07 March 2019 Published: 29 March 2019

#### Citation:

Alard A, Marboeuf C, Fabre B, Jean C, Martineau Y, Lopez F, Vende P, Poncet D, Schneider RJ, Bousquet C and Pyronnet S (2019) Differential Regulation of the Three Eukaryotic mRNA Translation Initiation Factor (eIF) 4Gs by the Proteasome. Front. Genet. 10:254. doi: 10.3389/fgene.2019.00254

The 4G family of eukaryotic mRNA translation initiation factors is composed of three members (eIF4GI, eIF4GII, and DAP5). Their specific roles in translation initiation are under intense investigation, but how their respective intracellular amounts are controlled remains poorly understood. Here we show that eIF4GI and eIF4GII exhibit much shorter half-lives than that of DAP5. Both eIF4GI and eIF4GII proteins, but not DAP5, contain computer-predicted PEST motifs in their N-termini conserved across the animal kingdom. They are both sensitive to degradation by the proteasome. Under normal conditions, eIF4GI and eIF4GII are protected from proteasomal destruction through binding to the detoxifying enzyme NQO1 [NAD(P)H:quinone oxidoreductase]. However, when cells are exposed to oxidative stress both eIF4GI and eIF4GII, but not DAP5, are degraded by the proteasome in an N-terminal-dependent manner, and cell viability is more compromised upon silencing of DAP5. These findings indicate that the three eIF4G proteins are differentially regulated by the proteasome and that persistent DAP5 plays a role in cell survival upon oxidative stress.

Keywords: mRNA translation, eIF4G, DAP5, PEST, NQO1, NRF2, proteasome, oxidative stress

### INTRODUCTION

In eukaryotes, most nuclear encoded mRNAs are modified at their 5′ end with a cap structure (m7GpppN, where N is any nucleotide). Once mRNA has been exported to the cytoplasm, one function of the cap is to facilitate mRNA translation into protein by the ribosome. Ribosomes are recruited at the mRNA 5′ cap by eIF4F (reviewed in Merrick, 2015), a complex composed of three proteins: the cap-binding protein eIF4E, the RNA-helicase eIF4A and the scaffolding protein eIF4G which interacts with both eIF4E and eIF4A. eIF4G also interacts directly with eIF3, a translation initiation factor bound to the small ribosomal subunit, and with the poly(A) binding protein (PABP). Thus, through multiple interactions, eIF4G plays a central role in cap-dependent translation initiation by bridging the mRNA 5′ cap structure (via eIF4E) to the poly(A) tail (via PABP), and to the ribosome (via eIF3).

Two eIF4G protein homologs have been characterized: eIF4GI and eIF4GII (Gradi et al., 1998). Although both clearly function in translation initiation, they differ in various aspects. Distinct phosphorylation sites targeted by different signaling pathways and with specific biological functions have been mapped in both amino-acid sequences. For instance, both eIF4GI and eIF4GII interact with the MAPK-interacting protein kinases MNK1 (Pyronnet et al., 1999) or MNK2 (Scheper et al., 2001), but only eIF4GI has been described as an MNK1/2 substrate (Orton et al., 2004). A third more distant protein homolog termed DAP5 (also called NAT1, eIF4GIII, or p97) has been identified (Imataka et al., 1997). The DAP5 polypeptide is devoid of N-terminal PABP- and eIF4E-binding sites but possesses domains interacting with eIF4A, eIF3 (Imataka et al., 1997) and MNK1/2 (Pyronnet et al., 1999). Consistently, DAP5 has been implicated in the specific translational regulation of a subset of mRNAs (Lee and McCormick, 2006) and in eIF4E-independent translation when cap-dependent translation is altered such as upon exposure to different stresses (Nevins et al., 2003). The three members of the eIF4G family thus appear to serve as fine regulators of translation initiation under various physiological or stress conditions. However, how the steady state level of each protein is controlled and whether they can be differentially targeted to degradation upon stress remain poorly understood.

### MATERIALS AND METHODS

#### Cell Culture and Compounds

NIH-3T3 cells were grown in standard conditions as described previously (Galés et al., 2003). To obtain cells expressing significantly reduced levels of DAP5, HEK-293 cells grown in standard conditions were first used to produce viruses upon transfection of the packaging plasmids pPAX2 and pMD2, and a pTRIPZ vector containing a tetracycline-inducible promoter driving the expression of the TurboRFP fluorescent reporter (GE Dharmacon Technology). ShRNAs directed against DAP5 (sh1-DAP5: 5′-TACCTCTAGTAATGGGCTTTA-3′ and sh2-DAP5: 5′-AACCAGCCAAAGCCTTAAATT-3′) or a non-silencing scrambled sequence (shNS: 5′-AATTCTCCGAACGTGTCACGT-3′) were cloned into pTRIPZ using EcoRI and XhoI restriction sites. NIH-3T3 cells were transduced with cell-free virus-containing supernatants and selected with 4 µg mL<sup>−1</sup> puromycin for 48 h. RFP-positive and puromycin-resistant cells were sorted (MoFlo XDP, Beckman) to establish derivative pools with stable expression of the shRNA constructs. As compared to shNS, the concentration of doxycycline producing an optimal down-regulation of DAP5 in both sh1-DAP5 and sh2-DAP5 cells was 4 µg mL<sup>−1</sup>. Puromycin, doxycycline, cycloheximide, lactacystin, MG-132, dicumarol and H<sub>2</sub>O<sub>2</sub> were from Sigma and were dissolved as recommended by the manufacturer.

#### MTT Assay

For each condition, triplicates of native or stably transfected NIH-3T3 cells seeded in 96-well plates were allowed to grow for 24 h, treated with or without 4 µg mL<sup>−1</sup> doxycycline for 48 h and incubated in the absence or presence of H<sub>2</sub>O<sub>2</sub> for 16 or 24 h. Cell survival was monitored by measuring absorbance at 570 nm with a microplate reader (Mithras-LB-940, Berthold) following incubation with MTT (Euromedex).
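A common way to express MTT readings is as percent viability relative to untreated control wells. The sketch below illustrates that arithmetic for triplicate wells; the function name, the optional blank subtraction and the absorbance values are assumptions for illustration only and are not taken from this study.

```python
from statistics import mean

def percent_viability(treated_a570, control_a570, blank_a570=0.0):
    """Mean treated absorbance as a percentage of the mean control absorbance.

    blank_a570 is an optional background reading (hypothetical here) that is
    subtracted from both means before the ratio is taken.
    """
    control = mean(control_a570) - blank_a570
    if control <= 0:
        raise ValueError("control absorbance must exceed the blank")
    return 100.0 * (mean(treated_a570) - blank_a570) / control

# Hypothetical triplicate absorbance readings at 570 nm.
untreated = [1.00, 0.98, 1.02]
stressed = [0.52, 0.49, 0.50]
print(round(percent_viability(stressed, untreated), 1))
```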

### Plasmids and Transient Transfections

Plasmids (depicted in **Figure 2A**) used for transient transfections of NIH-3T3 cells were described earlier (Pyronnet et al., 1999) and are as follows: pcDNA3-HA-eIF4GI, pcDNA3-HA-eIF4GII and pcDNA3-HA-DAP5 (encoding full-length proteins); pcDNA3-HA-4GI-N and pcDNA3-HA-4GII-N (encoding N-terminal fragments of eIF4GI and eIF4GII). pcDNA3-HA-4GII-C, containing the C-terminal two-thirds of eIF4GII, was constructed by insertion of an EcoRI-XhoI PCR fragment amplified from pcDNA3-HA-eIF4GII using the forward catgacGAATTCcgactttacaccagcctttgct and reverse catgacCTCGAGttttagttatcctcagactcctc primers. The NQO1 expression plasmid was as described previously (Alard et al., 2009). Transient transfections of NIH-3T3 cells were carried out with GeneJet™ (SignaGen Laboratories) according to the manufacturer's instructions. Proteins were expressed for 36 h and cells were either processed immediately for immunoblotting or processed following treatment with various compounds (where indicated).

### Co-immunoprecipitation and Immunoblot Analyses

Preparation of cell extracts, co-immunoprecipitation and immunoblotting were carried out as previously described (Pyronnet et al., 1999). The antibodies used were as follows: anti-eIF4GI and anti-eIF4GII (gifts of Prof. Nahum Sonenberg); anti-DAP5 (CliniSciences #610742); anti-HA-7 (Sigma); anti-β-tubulin (GeneTex #6288022); anti-4E-BP1, anti-NRF2 and anti-p53 (Cell Signaling Technologies #9452, #12721, and #1C12, respectively); anti-Core 20S (Enzo Life Sciences #PW8155); and anti-NQO1 (Santa Cruz #C19).

### Protein Sequences Analyses

To test for the presence of potential PEST motifs, amino-acid sequences of eIF4GI, eIF4GII and DAP5 from different animal species (described in **Table 1**) were uploaded into the ePESTfind software at EMBOSS explorer<sup>1</sup>. After running ePESTfind, only the potential PEST motifs displaying a score >5.0 were taken into account in this study. The alignment showing conservation between eIF4GI and eIF4GII sequences was performed using the LALIGN program<sup>2</sup>.
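The score cutoff described above amounts to a simple filter over predicted motifs. The sketch below shows that step on a hand-made list; the `PestHit` record, its fields and the example coordinates and scores are hypothetical, and the code does not parse real ePESTfind output.

```python
from typing import NamedTuple

class PestHit(NamedTuple):
    """One predicted PEST motif (hypothetical record layout)."""
    start: int    # first residue of the motif
    end: int      # last residue of the motif
    score: float  # prediction score assigned to the motif

def retained_motifs(hits, threshold=5.0):
    """Keep only motifs whose score is strictly above the threshold."""
    return [h for h in hits if h.score > threshold]

# Hypothetical predictions for a single protein sequence.
hits = [PestHit(12, 40, 9.3), PestHit(210, 228, 5.0), PestHit(900, 915, -1.2)]
for hit in retained_motifs(hits):
    print(hit.start, hit.end, hit.score)
```

The comparison is strict, matching the ">5.0" wording, so a motif scoring exactly 5.0 is excluded.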

## RESULTS

### eIF4GI and eIF4GII Exhibit Shorter Half-Lives Than DAP5

To explore how the amounts of the three different eIF4G family members are controlled, their respective half-lives were first monitored in non-transformed NIH-3T3 fibroblasts. Cells were treated with cycloheximide (CHX) to arrest protein synthesis and time-dependent decreases in the cellular contents of eIF4GI, eIF4GII, and DAP5 were visualized by western blotting. Protein synthesis was efficiently blocked by CHX, as attested by metabolic labeling of cells with puromycin and its subsequent detection in nascent polypeptides by western blotting (**Figure 1A**, left). Both eIF4GI and eIF4GII exhibited short half-lives (∼5 and ∼3 h, respectively) as compared to DAP5, which remained unaltered after 8 h of treatment with CHX (**Figure 1A**, middle and right). One probable explanation for the relatively short half-lives of eIF4GI and eIF4GII in cells is that they are rapidly degraded by the proteasome. Consistently, eIF4GI has been shown earlier to be a proteasomal substrate with a relatively short half-life (Baugh and Pilipenko, 2004). We therefore tested this possibility by incubating cells with MG-132, a specific proteasome inhibitor. Both eIF4GI and eIF4GII markedly accumulated in cells following treatment with MG-132, while the amount of DAP5 increased only slightly over the same period (**Figure 1B**). Furthermore, the fast decrease in eIF4GI or eIF4GII amount observed upon inhibition of protein synthesis by CHX was less pronounced when cells were co-incubated with MG-132 (**Figure 1C**). These data indicate that eIF4GI and eIF4GII exhibit faster turnover than DAP5 in growing fibroblasts, likely due to more rapid degradation by the proteasome (and probably higher rates of synthesis).

<sup>1</sup>http://emboss.bioinformatics.nl/cgi-bin/emboss/epestfind

<sup>2</sup>https://embnet.vital-it.ch/software/LALIGN\_form.html
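Half-lives from a CHX chase such as the one described in this section are often estimated by assuming first-order decay, so that the fraction remaining is exp(−kt) and the half-life is ln 2 / k. The sketch below fits k by least squares on log-transformed band intensities; the first-order model, the function name and the densitometry values are assumptions for illustration and are not the authors' analysis or data.

```python
import math

def estimate_half_life(times_h, fractions_remaining):
    """Estimate a half-life (hours) assuming first-order decay.

    Fits ln(fraction) = -k * t by least squares through the origin and
    returns ln(2) / k. Fractions must be positive and normalised so that
    the value at t = 0 is 1.
    """
    num = sum(t * math.log(f) for t, f in zip(times_h, fractions_remaining))
    den = sum(t * t for t in times_h)
    if den == 0:
        raise ValueError("need at least one time point after t = 0")
    k = -num / den
    if k <= 0:
        return math.inf  # no measurable decay over the time course
    return math.log(2) / k

# Hypothetical normalised densitometry values (not data from this study).
times = [0, 2, 4, 8]
short_lived = [1.0, 0.63, 0.40, 0.16]
stable = [1.0, 1.0, 1.0, 1.0]
print(round(estimate_half_life(times, short_lived), 1))
print(estimate_half_life(times, stable))
```

A protein whose signal does not decline over the chase returns an infinite half-life here, which simply means that the time course was too short to measure its decay.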

#### Involvement of PEST-Containing eIF4GI or eIF4GII N-Terminus in Proteasomal Degradation

We then searched how eIF4GI, eIF4GII, and DAP5 are differentially targeted to the proteasome for degradation. eIF4GI

FIGURE 1 | Differential half-lives of eIF4GI, eIF4GII and DAP5. (A) NIH-3T3 cells were treated at different times with 50 µg mL−<sup>1</sup> cycloheximide (CHX) and incubated with 10 µg mL−<sup>1</sup> puromycin (puro) 10 min before lysis. Proteins were visualized by western-blotting with the indicated antibodies (left and middle) and signals were quantified by densitometric analysis (right). Data are the means ± SD of three separate experiments. (B) NIH-3T3 cells were treated at different times with 20 µM MG-132 and proteins visualized by western-blotting as indicated. (C) NIH-3T3 cells were co-treated at different times with 50 µg mL−<sup>1</sup> CHX and 20 mM MG-132 and proteins visualized by western-blotting as indicated.

(bottom) cDNAs, NIH-3T3 cells were untreated or treated at different times with 50 µg mL−<sup>1</sup> CHX and proteins visualized by western-blotting as indicated.

and eIF4GII are considered as two functional homologs because they both contain domains interacting with key translation initiation factors (including PABP, eIF4E, eIF3, and eIF4A) and with the translation regulators MNK1 and MNK2 kinases. The more distant homolog DAP5 shows similarities only with the C-terminal two thirds of eIF4GI or eIF4GII as it lacks the N-terminal third and as a consequence cannot interact with PABP or eIF4E (**Figure 2A**, top). We therefore suspected that N-terminal features shared only by eIF4GI and eIF4GII could be responsible for their more rapid turnovers. In silico predictions (ePESTfind, EMBOSS) revealed the existence of five putative PEST-motifs (sequences enriched in proline, glutamate, serine and threonine) with variable scores in each eIF4GI and eIF4GII polypeptides. Similar computer-predicted PEST motifs were identified earlier in the eIF4GI amino acid sequence (Anand and Gruppuso, 2005). Five and four PEST motifs were detected in the N-terminal thirds of eIF4GI and eIF4GII, respectively, while one with a low score was found in the C-terminal two-thirds of eIF4GII (**Figure 2A**, middle and bottom). In contrast, no sequences reaching computer-predicted PEST requirements could be detected in the DAP5 polypeptide. PEST-motifs are known to target proteins for degradation by the proteasome either dependently on or independently of ubiquitination (Mathes et al., 2008). They have been found and validated in other key short-lived proteins including c-MYC (Gregory and Hann, 2000), members of the I-kappaB family (Lin et al., 1996; Park et al., 2014) and ornithine decarboxylase (ODC) (Ghoda et al., 1992). 
To test whether these motifs contribute to the fast turnover of eIF4GI and eIF4GII, half-life experiments similar to those described in **Figure 1** were repeated using HA-tagged full-length eIF4GI, eIF4GII, and DAP5 (named HA-4GI, HA-4GII, and HA-DAP5; **Figure 2A**, middle), or HA-tagged N-terminal segments of eIF4GI and eIF4GII containing 5 and 4 PEST motifs, respectively (named HA-4GI-N and HA-4GII-N; **Figure 2A**, middle). Following transfection and CHX treatment, HA-4GI and HA-4GII exhibited half-lives as short as those observed for the endogenous proteins, while HA-DAP5, which is devoid of PEST motifs, remained unaltered (**Figure 2B**, left). In addition, HA-4GI-N and HA-4GII-N were similarly short-lived upon inhibition of protein synthesis by CHX, but accumulated upon inhibition of proteasomal activity by MG-132 (**Figure 2B**, right). In contrast, HA-4GII-C, which contains the C-terminal two-thirds of eIF4GII but lacks its N-terminal third, was more resistant to degradation upon CHX treatment and showed a stability similar to that of HA-DAP5 (**Figure 2C**). These data suggest that the PEST-containing N-terminal thirds of eIF4GI and eIF4GII form signals sufficient for targeting them to proteasomal degradation.
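As an illustration of how a half-life could be derived from a CHX chase, the Python sketch below fits a straight line to log-transformed band intensities and converts the slope into a half-life. It assumes first-order decay, which the text does not state, and it is not the quantification method used in this study. The function name `half_life` and all numbers are hypothetical.

```python
import math


def half_life(times_h, intensities):
    """Estimate a half-life (in hours) assuming first-order decay.

    Fits ln(intensity) = ln(I0) - k * t by ordinary least squares and
    returns ln(2) / k. Intensities must be positive.
    """
    if len(times_h) != len(intensities) or len(times_h) < 2:
        raise ValueError("need at least two matched time points")
    logs = [math.log(value) for value in intensities]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, logs))
    slope = sxy / sxx  # equals -k for a decaying signal
    if slope >= 0:
        raise ValueError("signal does not decay over the time course")
    return math.log(2) / -slope


# Hypothetical band intensities (arbitrary units) at 0, 2, 4 and 6 h of CHX.
print(half_life([0, 2, 4, 6], [100.0, 50.0, 25.0, 12.5]))
```

A log-linear fit is used here because it needs only the standard library. With real densitometry data, a weighted or non-linear fit may be more appropriate, and a stable protein such as HA-DAP5 would raise the error for a non-decaying signal.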

### DAP5 Is Resistant to Degradation Under Oxidative Stress

eIF4GI can be degraded directly by the 20S proteasome (Baugh and Pilipenko, 2004), and we (Alard et al., 2009) and others (Attar-Schneider et al., 2014) have shown that it is protected from degradation through its binding to NQO1, an observation made initially for two other short-lived proteins, p53 (Asher et al., 2003) and ODC (Asher et al., 2005), and more recently extended to other key proteins including HIF-1α (Oh et al., 2016). NQO1 protects candidate proteins from degradation by the proteasome through its direct interaction with the 20S proteasome (Moscovitz et al., 2012). However, upon oxidative stress, which recruits NQO1 and its quinone oxidoreductase activity for detoxifying reactive oxygen species (ROS), eIF4GI (and other protected proteins) no longer binds to NQO1 and is more rapidly degraded by the proteasome (Alard et al., 2009). To check whether a similar degradation of eIF4GII or DAP5 occurs, the fate of both proteins was monitored under oxidative stress. Increasing concentrations of H<sub>2</sub>O<sub>2</sub> resulted in a degradation of eIF4GII that was sensitive to the proteasome inhibitor lactacystin, while the amount of DAP5 remained unchanged (**Figure 3A**). This suggested that eIF4GII is subject to a mechanism of deregulation similar to that of eIF4GI under oxidative stress. We therefore tested whether eIF4GII also interacts with NQO1. Co-immunoprecipitation experiments confirmed the interaction between eIF4GI and NQO1 (**Figure 3B**, left), and revealed that eIF4GII similarly interacts with NQO1 when co-immunoprecipitation is performed with either anti-eIF4GII (**Figure 3B**, middle) or anti-NQO1 (**Figure 3B**, right) antibodies.

The protective binding of NQO1 to eIF4GI (Alard et al., 2009), p53 (Asher et al., 2003), and ODC (Asher et al., 2005) is disrupted by dicumarol, an NQO1-specific inhibitor that provokes the accumulation of intracellularly produced ROS. Consistently, incubation of cells with dicumarol provoked the degradation of eIF4GI and eIF4GII, while the amount of DAP5 was not affected (**Figure 3C**). p53 was used here as a positive control of dicumarol-induced protein degradation (**Figure 3C**), although its degradation and recovery followed faster kinetics than those of eIF4GI or eIF4GII. In addition, HA-tagged full-length proteins as well as N-terminal thirds of both eIF4GI and eIF4GII were all degraded following dicumarol treatment (**Figure 3D**), indicating that the PEST-containing N-terminal domains of the two homologs are sufficient to mediate proteasomal degradation under oxidative stress.

### DAP5 Is Involved in Cell Survival Under Oxidative Stress

Because DAP5, unlike eIF4GI and eIF4GII, was unaffected by oxidative stress, it seemed likely that this translation initiation factor is involved in the cellular response to oxidative stress.

This hypothesis was tested using three pools of NIH-3T3 fibroblasts, two engineered to express distinct doxycycline-inducible shRNAs directed against DAP5 (named sh1-DAP5 and sh2-DAP5) and one to express scrambled shRNAs (named shNS). Compared with the shNS pool, treatment of cells with doxycycline for 48 h efficiently down-regulated DAP5 protein expression in the sh1-DAP5 and sh2-DAP5 pools (**Figure 4A**). Before testing the possible involvement of DAP5, survival of the three pools of NIH-3T3 cells under oxidative stress was first monitored in the absence of doxycycline to ensure that antibiotic selection and cell sorting (see section "Materials and Methods") had not generated a pool of cells with an intrinsically distinct (i.e., independent of the amount of DAP5) response to oxidative stress. A 24 h dose-response to H<sub>2</sub>O<sub>2</sub> confirmed that survival of the three pools of cells was similarly affected by oxidative stress (**Figure 4B**). This experiment was then repeated in the presence or absence of doxycycline for 48 h to down-regulate DAP5 expression, followed by treatment with H<sub>2</sub>O<sub>2</sub> for 16 or 24 h. The data clearly showed that DAP5 down-regulation altered cell survival under oxidative stress at both times tested (**Figure 4C**).

### NRF2 and NQO1 Expression Is Independent of DAP5 Under Oxidative Stress

One important factor induced by and required for the response to oxidative stress is NRF2 (recently reviewed in Bellezza et al., 2018). NRF2 induces the transcriptional activation of genes capable of detoxifying intracellular ROS, including NQO1, which acts through its quinone oxidoreductase activity (Venugopal and Jaiswal, 1996). We therefore hypothesized that NRF2 and/or NQO1 expression could be altered at the translational level upon down-regulation of DAP5. This assumption was supported by the fact that oxidative stress is known to inhibit general cap-dependent translation initiation while DAP5 is believed to play a role in cap-independent translation under stress (Nevins et al., 2003), and that a cap-independent mode of NRF2 mRNA translation has been described upon oxidative stress (Li et al., 2010). The impact of oxidative stress on cap-dependent translation was first examined in shNS and sh2-DAP5 cells in the absence or presence of doxycycline. In both cell pools, H<sub>2</sub>O<sub>2</sub> treatment provoked a significant dephosphorylation of 4E-BP1, as shown by accumulation of its hypophosphorylated α isoform (**Figure 5**, left). As hypophosphorylated 4E-BP1 sequesters the cap-binding translation initiation factor eIF4E, this supported the notion that cap-dependent translation is indeed inhibited in our cell models, and that NRF2 (and likely NQO1) expression could be controlled by DAP5 in a cap-independent manner. The experiment was therefore repeated with the three pools of cells, untreated or treated with doxycycline. Expression of both NRF2 and NQO1 was indeed induced upon treatment with H<sub>2</sub>O<sub>2</sub>, but this induction was not affected by DAP5 down-regulation in either the sh1-DAP5 or the sh2-DAP5 stable cell line (**Figure 5**, right). These data revealed that, although required for cell survival, DAP5 is not involved in NRF2 or NQO1 protein induction under oxidative stress.

### DISCUSSION

These data indicate that the intracellular amounts of the three eIF4G family members are differentially regulated by the proteasome. The DAP5 polypeptide, which is devoid of N-terminal PEST motifs, is more stable than the eIF4GI and eIF4GII proteins. Curiously, the N-terminal segments of eIF4GI and eIF4GII containing the functional PEST motifs are the least conserved portions between the two proteins (**Supplementary Figure S1**). The domain structures of eIF4GI and eIF4GII have been extensively studied. In contrast to their well-characterized C-terminal two-thirds, no folded domains have been identified in their N-terminal thirds, although individual shorter stretches of amino acids may fold upon binding to their respective partners such as PABP and eIF4E (Marintchev and Wagner, 2005), and likely NQO1 (**Figure 3B**). The N-terminal third of eIF4GI or eIF4GII can therefore be viewed as an intrinsically disordered and flexible segment whose changes in conformational state allow eIF4GI or eIF4GII to make numerous contacts with different proteins involved in protein synthesis. Intrinsically disordered portions of proteins have often been considered probable signals for proteasomal degradation (Suskiewicz et al., 2011). It is thus possible that the apparently non-conserved but PEST-containing N-terminal segment of eIF4GI or eIF4GII has nonetheless evolved to serve a dual function: (i) it provides the flexibility necessary for assembly of translation initiation complexes and (ii) it forms a signal for proteasomal degradation when not protected by binding partners. Consistently, proteasomal degradation of eIF4GI and eIF4GII coincides with disruption of their binding to NQO1 and to eIF4E (following 4E-BP1 hypophosphorylation; **Figure 5** and Alard et al., 2009), and likely with disruption of their binding to PABP, as oxidative stress leads to nuclear re-localization of PABP, where it is not expected to interact with eIF4GI or eIF4GII (Salaun et al., 2010). Interestingly, the presence of PEST sequences in the N-terminal third of eIF4GI or eIF4GII is a feature conserved in animal species belonging to different branches of the animal kingdom. Indeed, while only anecdotal (i.e., non-conserved among species) and low-score PEST motifs were detected in the C-terminal two-thirds of eIF4GI or eIF4GII or in DAP5 polypeptides, the N-terminal thirds of eIF4GI or eIF4GII in all species contain PEST motifs, including those with the highest scores (**Figure 6**).

FIGURE 5 | Induction of NRF2 and NQO1 proteins under oxidative stress is independent of DAP5. Protein extracts of stably transfected NIH-3T3 cells grown in the absence or presence of doxycycline (Dox) for 48 h and untreated or treated with 1 mM H<sub>2</sub>O<sub>2</sub> for 4 h were subjected to western blotting with the indicated antibodies. The bottom-to-top α–β–γ symbols denote hypo- to hyperphosphorylated 4E-BP1 isoforms.

… mammalian protein sequences have been copy/pasted in protein sequences of other species even if they have not been always experimentally confirmed.

Our data also indicate that DAP5, which persists under oxidative stress, is involved in cell survival, although neither NRF2 nor NQO1 expression is affected by DAP5 knock-down. Whether DAP5 still plays a role in the expression of a subset of genes through a selective translational mechanism under oxidative stress remains to be elucidated. If this is the case, how could DAP5 function under oxidative stress? Clues may arise from what happens when cells are exposed to other stresses such as hypoxia. We (Azar et al., 2013) and others (Koritzinsky et al., 2006) have shown that hypoxia blocks cap-dependent mRNA translation through eIF4E sequestration by the hypophosphorylated forms of 4E-BP1 and blocks global mRNA translation through eIF2α phosphorylation. However, it has been shown recently that DAP5 selectively recruits the ribosome to target mRNAs under hypoxia via its direct interaction with eIF2β (Liberman et al., 2015; Bryant et al., 2018), thus circumventing the inhibitory effect of eIF4E sequestration on cap-dependent translation and the inhibitory effect of eIF2α phosphorylation on global translation. Together with eIF2β and eIF2γ, eIF2α belongs to the eIF2 trimeric translation initiation complex, whose function in translation (i.e., binding of the charged tRNA to the small ribosomal subunit) is inhibited following eIF2α phosphorylation (reviewed in Lasfargues et al., 2013). Since H<sub>2</sub>O<sub>2</sub>-induced oxidative stress also provokes eIF2α phosphorylation (MacCallum et al., 2006) and eIF4E sequestration while sparing DAP5 (our data), a similar DAP5-eIF2β-dependent selective translational mechanism may occur. Additionally, DAP5 may stimulate eIF4E-independent translation initiation of a specific subset of mRNAs by recruiting the ribosome through its very recently described direct interaction with eIF3 (de la Parra et al., 2018).

#### REFERENCES


### AUTHOR CONTRIBUTIONS

AA and SP designed the experiments and wrote the manuscript. AA, BF, CM, FL, and PV performed the experiments. CB, CJ, DP, RS, and YM helped in writing and edited the manuscript. SP conceived and supervised the project. All authors have reviewed and approved the final manuscript.

#### FUNDING

This work was supported by grants from INSERM, La Ligue Contre le Cancer (programme Equipes Labellisées), and Investissements d'Avenir via the programs CAPTOR and LABEX TOUCAN to SP.

### ACKNOWLEDGMENTS

We are grateful to Prof. Nahum Sonenberg for eIF4GI and eIF4GII antibodies and cDNAs.

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fgene.2019.00254/full#supplementary-material

FIGURE S1 | The PEST motifs reside in the least conserved regions of eIF4GI and eIF4GII. The PABP and eIF4E binding domains are highlighted in green and red, respectively. The PEST motifs are highlighted in light blue. The values appearing next to accession numbers refer to the first and last amino acids (aa) of the eIF4GI or eIF4GII protein sequences used to create the alignment. "." stands for similar amino acids; ":" stands for identical amino acids.



cellular stress is mediated by apoptotic fragments of eIF4G translation initiation factor family members eIF4GI and p97/DAP5/NAT1. J. Biol. Chem. 278, 3572–3579. doi: 10.1074/jbc.m206781200


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Alard, Marboeuf, Fabre, Jean, Martineau, Lopez, Vende, Poncet, Schneider, Bousquet and Pyronnet. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
