Functional Information Stored in the Conserved Structural RNA Domains of Flavivirus Genomes

The genus Flavivirus comprises a large number of small, positive-sense single-stranded, RNA viruses able to replicate in the cytoplasm of certain arthropod and/or vertebrate host cells. The genus, which has some 70 member species, includes a number of emerging and re-emerging pathogens responsible for outbreaks of human disease around the world, such as the West Nile, dengue, Zika, yellow fever, Japanese encephalitis, St. Louis encephalitis, and tick-borne encephalitis viruses. Like other RNA viruses, flaviviruses have a compact RNA genome that efficiently stores all the information required for the completion of the infectious cycle. The efficiency of this storage system is attributable to supracoding elements, i.e., discrete, structural units with essential functions. This information storage system overlaps and complements the protein coding sequence and is highly conserved across the genus. It therefore offers interesting potential targets for novel therapeutic strategies. This review summarizes our knowledge of the features of flavivirus genome functional RNA domains. It also provides a brief overview of the main achievements reported in the design of antiviral nucleic acid-based drugs targeting functional genomic RNA elements.


INTRODUCTION
The great plasticity of RNA virus genomes allows them to perform different functions during the infectious cycle, helping viral populations adapt to novel molecular and cellular contexts, and to escape host defenses. It also contributes toward the development of resistance to antiviral drugs. These feats are achieved by the genome preserving a degree of variability while avoiding challenges to viral fitness. Genome variability can become a threat to viral survival if it reaches the error catastrophe limit (Schuster, 1993;Eigen, 2002), but RNA viruses have overcome this by storing information required for essential functions in discrete, highly conserved, genomic RNA structural domains. These complexly folded regions may overlap the nucleotide sequence coding for viral proteins. They play out their different biological roles (e.g., in replication, translation, or encapsidation) by directly recruiting viral and/or cellular factors, or by forming high-order regulatory structures via the establishment of long-range RNA-RNA interaction networks resulting in the formation of the complex global structures required for correct viral functioning. By means of this dynamic folding, the RNA genome can perform functions during the viral cycle other than simply coding for proteins .
Flavivirus spp. (from now on flaviviruses) belong to the family Flaviviridae. They are small (40-65 nm diameter), enveloped viruses with an icosahedral nucleocapsid and a positive-sense, single-stranded RNA genome. The genus includes important human pathogens responsible for ongoing/recurrent outbreaks of disease in areas where such diseases are not traditionally endemic; West Nile virus (WNV), dengue virus (DENV, perhaps the most important human pathogen of the genus) and Zika virus (ZIKV), for example, are all dramatically expanding their original geographic distribution. Other well-known flaviviruses include the causal agents of yellow fever (YFV), Japanese encephalitis (JEV), St. Louis encephalitis (SLEV), Murray Valley encephalitis (MVEV), and tick-borne encephalitis (TBEV), among the more than 70 flaviviruses that have been identified. Some authors believe there could be over 2,000 left to discover (Pybus et al., 2002).
Most flaviviruses are transmitted to vertebrate hosts by the bite of haematophagous arthropods (thus classifying them within the heterogeneous group of arboviruses). Flaviviruses have traditionally been assigned to one of three clusters according to their arthropod vectors (Kuno et al., 1998; Cook and Holmes, 2006; Cook et al., 2012): mosquito-borne (MBFV), tick-borne (TBFV), and no-known-vector (NKV) flaviviruses (Table 1). These clusters can be further divided into clades and species. The members of the MBFV and TBFV clusters replicate in vertebrates and arthropods, while the NKV flaviviruses can be subdivided into two clades infecting solely bats or rodents, with no arthropod vector involved in the infective cycle. A fourth cluster, which gathers together the insect-specific flaviviruses (ISFV), has recently been defined and characterized (Cook et al., 2012). It is the most divergent group and can be subdivided according to the mosquito host involved (mainly Aedes spp. and Culex spp.). ISFVs do not infect any vertebrate host (Table 1). Finally, Tamana bat virus (TABV), which infects exclusively mammalian cells, shows no serological relationship with any other flavivirus, and has only very distant phylogenetic relationships with them. Its taxonomic position, therefore, is not well defined (de Lamballerie et al., 2002; Roby et al., 2014).
Certainly, flaviviruses pose health problems for humans (and some other vertebrates) that may be associated with enormous social and economic costs. Over the last decade, the number of outbreaks of flavivirus-induced disease has increased all over the world. The main causes include the geographic expansion of their mosquito vectors, and increasing human travel to the areas of highest infection risk. They cause non-specific symptoms in the initial phase of infection in humans, which hinders their control, and as for other RNA viruses, no efficient therapeutic or immunoprophylactic strategies have been developed. The World Health Organization and the Centers for Disease Control therefore both cite flaviviruses as a global health threat.
The functional importance of the highly conserved structural genomic RNA domains in different RNA viruses (Romero-López and Berzal-Herranz, 2013) renders them potential therapeutic targets for new antiviral drugs. This review focuses on the role of the functionally active structural RNA domains identified in the flavivirus genome. Their mechanisms of action in the regulation of essential functions of the viral cycle are discussed, and a short overview is provided of the flavivirus subgenomic RNAs (sfRNAs). Recent advances in the development of novel therapeutic strategies entailing the use of nucleic acid-based agents to target RNA molecules are also described.

Cell Entry and Internalization
The mechanism by which flaviviral particles attach to the cell membrane is only partially understood. It has been reported that host surface glycoproteins interact with the viral envelope proteins to initiate attachment (Chen et al., 1997b; Kroschewski et al., 2003; Davis et al., 2006). Attachment might also be mediated by integrins, cytoskeleton proteins, and cholesterol-dependent lipid raft pathways (Medigeshi et al., 2008; Bogachek et al., 2010). Internalization is then mediated by clathrin-coated vesicles (Chu and Ng, 2004). The subsequent acidification of these vesicles causes the viral envelope to fuse with the vesicle membrane, releasing the viral genome into the cytoplasm (Allison et al., 1995) (Figure 1A). The genome then reaches the surface of the endoplasmic reticulum (ER), where the molecular environment that allows the viral cycle to proceed is created, while preventing interferon response signaling (Hoenen et al., 2007; Welsch et al., 2009).

Translation and Replication
During the early phase of the flavivirus cycle, the viral genome is mainly employed as mRNA in viral protein synthesis. The initiation of translation occurs in a cap-dependent manner. In DENV, the process starts with the binding of the eukaryotic initiation factor 4E (eIF4E) to the 5′ cap, and the further recruitment of eIF4G and eIF4A (Merrick, 2004; Paranjape and Harris, 2010). This ribonucleoprotein complex binds to the 43S particle (40S + eIF1A + eIF3) and the AUG start codon can then be recognized (Khromykh and Westaway, 1997). Finally, the 60S ribosomal subunit is recruited and translation starts. The resulting polyprotein product is cleaved by viral and host proteases into three structural (capsid C, precursor of membrane prM, and envelope E) and seven non-structural (NS1, NS2A, NS2B, NS3, NS4A, NS4B, and NS5) proteins (Nowak et al., 1989) (Figure 1B).
Once viral protein levels are appropriate, the ER membrane undergoes structural rearrangements that promote the formation of replication complexes (Welsch et al., 2009; Gillespie et al., 2010; Kaufusi et al., 2014). In addition to the circular RNA genome conformation, stabilized by long-distance 3′-5′ interactions (see below) (Khromykh et al., 2001; Zhang et al., 2008a), and viral proteins, different host cell factors including AUF1 (Friedrich et al., 2014), eEF1A (p52) (Brinton, 2001), the TIAR (T-cell intracellular antigen-related) and TIA-1 (T-cell intracellular antigen-1) proteins (Li et al., 2002; Emara and Brinton, 2007), La protein (Vashist et al., 2009), PABP (Polacek et al., 2009b) and PTB (polypyrimidine tract binding protein) (Agis-Juarez
It has recently been shown that flaviviruses suppress host protein synthesis in human cells early post infection (host translation shutoff) while viral RNA translation is maintained (Roth et al., 2017). This strategy for ensuring efficient completion of the viral cycle has been widely reported for other arboviruses such as the alphaviruses. In the case of flaviviruses, the precise molecular mechanisms leading to the translation shutoff remain elusive. The shutoff does not respond to the canonical pathways of translation control; several, not mutually exclusive, mechanisms might be involved in the suppression of host translation (Roth et al., 2017). It is worth noting that this process is coupled to a switch from cap-dependent to cap-independent viral protein synthesis (Edgil et al., 2006; Roth et al., 2017). By a non-IRES-mediated mechanism, the flavivirus genome can circumvent the lack of eIF4E to initiate viral translation in a 5′ and 3′ UTR-dependent manner. Under these conditions, both ends of the viral genome are brought together to initiate the direct recruitment of translation initiation factors, thus bypassing the eIF4E requirement. This confers on the viral genome the great advantage of being able to translate viral proteins under conditions that limit protein synthesis, as in highly differentiated cells (Edgil et al., 2006).

Assembly of Structural Proteins for Virion Formation
Newly synthesized viral RNA genomes are assembled with structural proteins to form new, infectious particles. The genome packaging process is guided by mature viral capsid protein (C) at the ER (Schrauf et al., 2009). The resulting nucleocapsid is enveloped by a lipid bilayer belonging to the host cell (Chu and Ng, 2004) with the prM and E proteins embedded in it. These immature virions are transported to the Golgi, where the E and prM proteins are modified to yield the mature virion. The acidic pH of the Golgi causes a conformational rearrangement in which immature viruses lose their spiky prM-E trimer projections and acquire a smooth surface composed of E homodimers (Mukhopadhyay et al., 2005) ( Figure 1A). Finally, the infectious particles are released by exocytosis at 8-10 h post infection (hpi). Peak extracellular virus titres are usually observed at 18-24 hpi (Chu and Ng, 2004).

THE FLAVIVIRUS GENOME
The flaviviral genome consists of a positive-sense, single-stranded RNA molecule approximately 11,000 nt long, varying depending on the species. It bears a type 1 cap structure at its 5′ end (m7GpppAmp) (Ray et al., 2006; Zhou et al., 2007; Saeedi and Geiss, 2013) but it lacks a poly(A) tail at the 3′ end (Wengler, 1981; Brinton et al., 1986). The RNA genome contains a single ORF flanked by untranslated regions (UTRs) (Wengler et al., 1985; Castle and Wengler, 1987). It serves as a messenger for the synthesis of a single polyprotein that is processed by viral and cellular proteases (Brinton, 2002) to yield 10 different products (Rice et al., 1985) (Figure 1B). The flanking UTRs are defined by discrete, functionally active structural RNA elements that play important roles in the viral cycle. These can be divided into essential partners in the infection process (e.g., promoters) and other elements not essential for viral RNA propagation but which help to regulate the processes involved. The functional RNA elements of all flaviviruses appear as highly conserved, complex folding regions, despite the lack of extensive sequence conservation across the Flavivirus genus (Brinton, 2014).

The 5′ End of the Genomic RNA
Various functional RNA elements have been identified in the 100 nt-long 5′ UTR and the 5′ end of the coding sequence of the flavivirus genome (Brinton and Dispoto, 1988; Liu et al., 2013) (Figure 2). The 5′ UTR is relatively short in comparison with that of the IRES-dependent members of the Flaviviridae family. Different isolates of the same flavivirus show strong sequence conservation, and significant identity is observed among members of the same flavivirus group. Less nucleotide conservation is seen among members of different groups (Brinton and Dispoto, 1988), in contrast with the observed conservation in RNA folding (Cahour et al., 1995; Thurner et al., 2004). Preliminary structural studies of this region suggested that the predicted secondary structures, given their similar size and shape in different flavivirus genomes, are functionally important (Brinton and Dispoto, 1988; Gritsun et al., 1997; Leyssen et al., 2002; Gritsun and Gould, 2007b). Further studies support the requirement of these functional structural elements for RNA synthesis both in vitro and in cell culture (Cahour et al., 1995; Filomatori et al., 2006; Lodeiro et al., 2009; Li et al., 2010). The functional role of the 5′ UTR elements in RNA replication and translation has been examined mostly in DENV and extrapolated to other flaviviruses (Cahour et al., 1995; Filomatori et al., 2006; Lodeiro et al., 2009). Here we focus on the 5′ structures of the MBFV genome (Figure 2).
The ∼70 nt-long domain at the extreme 5′ terminus is known as the SLA element, and this is conserved across all flavivirus groups. It folds into a Y-shape and has a main stem-loop structural element plus a smaller side stem-loop (SSL) that emerges from it. The size of the essential SSL stem and the sequence of its loop vary across flaviviruses (Leyssen et al., 2002; Thurner et al., 2004; Filomatori et al., 2006; Gritsun and Gould, 2007b; Dong et al., 2008) (Figure 2). The SLA architecture is recognized by the viral RNA polymerase (NS5), an RNA-dependent RNA polymerase (RdRp) involved in viral replication (Filomatori et al., 2006). It has been reported that residues located at the basal portion of the stem-loop, in the upper stem, and in the internal loop, are critical for NS5 binding and activity (Dong et al., 2008; Li et al., 2010). In addition, the SLA element is involved in directing the addition of the cap structure at the 5′ end of the viral genome during RNA synthesis (Zhou et al., 2007; Zhang et al., 2008c). This is catalyzed by the guanylyl- and methyltransferase (MTase) activities of NS5, and requires the relocation of the 5′ end of the nascent genomic transcript at the MTase active site (Ray et al., 2006). This event seems to be dependent on the local conformation of the RNA. These features make the SLA element an essential partner in viral translation and replication (Ray et al., 2006; Li et al., 2010). This observation is reinforced by the fact that the folding of SLA is preserved across flaviviruses, regardless of any sequence differences in this region (Filomatori et al., 2006; Lodeiro et al., 2009).
In most flaviviruses, including DENV and WNV, a second, smaller stem-loop (SLB) is present downstream of SLA that shows a certain variability in its size and shape (Brinton and Dispoto, 1988). The SLB element bears the AUG translation initiation codon, which is embedded within its stem portion in a poor Kozak initiation context in many MBFVs, but in a strong context in TBFVs (Clyde and Harris, 2006;Clyde et al., 2008) (Figure 2). An oligo(U) tract providing an at-least-10 nt spacer between the two stem-loop structures has been observed in DENV (Lodeiro et al., 2009). In WNV, two sequence stretches -UAR and DAR I -involved in genome cyclization are embedded within this structural domain. DENV, however, contains only UAR (Figure 2, see also genome cyclization section).
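The notion of a "poor" versus "strong" Kozak initiation context can be illustrated with a minimal Python sketch. It assumes the common simplification that the two most influential positions are a purine at -3 and a G at +4 (counting the A of the AUG as +1); the three-tier labels, the function name `kozak_strength`, and the example sequences are illustrative assumptions and are not taken from any flavivirus genome.

```python
def kozak_strength(seq: str, aug_index: int) -> str:
    """Classify the context of the AUG starting at aug_index (0-based).

    Simplified rule (an assumption of this sketch): a purine at -3 and
    a G at +4 give a 'strong' context, one of the two gives 'adequate',
    and neither gives 'weak'.
    """
    seq = seq.upper().replace("T", "U")
    if seq[aug_index:aug_index + 3] != "AUG":
        raise ValueError("no AUG at the given index")
    if aug_index < 3 or aug_index + 3 >= len(seq):
        raise ValueError("need 3 nt upstream and 1 nt downstream of the AUG")
    purine_minus3 = seq[aug_index - 3] in "AG"   # position -3
    g_plus4 = seq[aug_index + 3] == "G"          # position +4
    if purine_minus3 and g_plus4:
        return "strong"
    if purine_minus3 or g_plus4:
        return "adequate"
    return "weak"


# Hypothetical sequences, each with the AUG at index 6.
print(kozak_strength("GCCACCAUGG", 6))
print(kozak_strength("GCCUCCAUGU", 6))
```

A start codon classified as weak under this rule is the kind of site for which a downstream hairpin, such as cHP, is proposed to aid recognition.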
The stable, highly conserved hairpin cHP follows the SLB element at its 3′ end, and extends into the first nucleotides of the capsid coding region of DENV and WNV (Figure 2). It was first identified in DENV, and despite the reduced conservation of its sequence it was predicted to be preserved in the mosquito-borne and tick-borne flaviviruses (Clyde and Harris, 2006). cHP governs the selection of the translation initiation codon by directly positioning the ribosomal complex close to the "functional" AUG in the SLB element. Importantly, translation initiation efficiency at the appropriate codon is related to the thermodynamic stability of the cHP element (Clyde and Harris, 2006). It is reported that the introduction of stable secondary structural elements (e.g., stem-loops) downstream of an AUG codon embedded in a poor Kozak context improves the recognition of the optimal starting triplet by pausing the translation machinery, which must unwind the hairpin (Kozak, 1990). This favors prolonged contact with the correct AUG start codon. Thus, cHP acts as a translation enhancer. In addition, it has been shown to have a role as a cis-replicating element in WNV and DENV (Clyde et al., 2008). cHP thus became the first known functional RNA domain with a dual functional role in the flavivirus life cycle (Clyde et al., 2008), highlighting the efficiency of the information coding system based on structural RNA units. During early infection, translation initiation is promoted. At this stage, the viral genome has not acquired the replication-competent circular conformation, but rather exhibits an extended cHP stem-loop which temporarily makes the ribosomal complex linger at the correct AUG start codon to favor its recognition. The switch to replication might occur through the establishment of long-distance RNA-RNA interactions between the 5′ and 3′ genome ends (see below). These contacts induce the acquisition of a circular conformation, which brings about a slight shortening of the cHP stem, thus allowing for rearrangements in the translation-competent scaffold and the further recruitment of factors required for viral RNA synthesis (Clyde et al., 2008).
Another conserved domain within the capsid coding region, the conserved capsid-coding region 1 (CCR1; Figure 2B), was first described in the DENV genome. It was shown to modulate the DENV life cycle in mammalian and mosquito cells, likely acting during a post-RNA synthesis stage and possibly regulating viral assembly (Groat-Carmona et al., 2012). It was later found in TBEV, in which it was shown to be important for efficient viral translation (Rouha et al., 2011). Despite its high sequence and structure conservation in DENV and TBEV serogroups, it is not well-conserved across the flavivirus genus (Groat-Carmona et al., 2012).

The 3′ End of the Genomic RNA
The 3′ end of the genome terminates in a 700 nt-long untranslated region (3′ UTR) that lacks a poly(A) tail. It ends in a conserved CU-OH dinucleotide (Wengler, 1981; Brinton and Dispoto, 1988) in MBFV and TBFV, except in some strains of TBEV (Mandl et al., 1991). The 3′ UTR of flavivirus genomes is essential for viral replication (Men et al., 1996; Zeng et al., 1998). Its structure and function have mostly been characterized in MBFVs. The 3′ UTR can be subdivided into three autonomously folded regions, domains I-III (Figure 3), which show different degrees of sequence and structure conservation across members of the genus, with the 3′ extreme region, known as the small hairpin-3′ stem-loop (sHP-3′ SL), being the most conserved of all. A defining feature within the 3′ UTR is the presence of duplications of structural cassettes. These are composed of various structural elements in MBFV and TBFV, but not in ISFV or NKV flaviviruses. Compelling experimental evidence indicates that each duplicated cassette plays a different role in viral replication. An association between the duplication of structural elements and the capacity of the genome to replicate in mammalian and arthropod hosts has been established (for review see Villordo et al., 2016).
Domain I is located just downstream of the translation stop codon. In most flaviviruses it appears as a hypervariable sequence followed by two conserved stem-loop domains (SL-I and -II) similar in sequence and structure (Figures 3, 4); in YFV (Wang et al., 1996), the NKV flaviviruses (Leyssen et al., 2002), and ISFVs (for a review see Blitvich and Firth, 2015), however, there is only one stem-loop (SL). In YFV, domain I contains tandem repeats in hairpin structures (RYFs) unique to the ISFV group (Bryant et al., 2005). The SL of the NKV flaviviruses is similar to those of the TBFVs (Villordo et al., 2016), while differences are observed in the structure of this region within the two main subgroups of the ISFVs, the classical ISFVs (cISFV) and the dual-host affiliated ISFVs (dISFV) (Blitvich and Firth, 2015). Although all ISFVs contain multiple sequence repeats (Gritsun et al., 2014), cISFVs are characterized by folding into short hairpins, and the dISFVs into an SL similar to those seen in MBFVs (Villordo et al., 2016). Domain I of the prototypical DENV-2 comprises a duplicated SL preceded by the hypervariable tract. The nucleotides of the apical loop of both SLs are involved in the formation of pseudoknots with the nearby downstream sequence (forming PK1 and PK2) (Thurner et al., 2004). In the JEV group (Table 1), the hypervariable region folds into an AU-rich stem-loop (SL-I) followed by a highly conserved branched element (SL-II) immediately preceded by a short conserved hairpin (RCS3) (Brinton, 2014) (Figure 3). This structural unit (SL-I•SL-II•RCS3) is repeated to yield the SL-III, SL-IV, and CS3 elements (Proutski et al., 1999) (Figure 3). Deletion and sequence mutation analyses of SL-I and II, and of the motifs RCS3 and CS3, have revealed their roles as regulatory replication elements (Lo et al., 2003; Pijlman et al., 2008). Importantly, the apical loop of SL-II is involved in the formation of a pseudoknot structure (PK1) with the single-stranded region immediately downstream (Figures 3, 4). The formation of this PK is critical for infectivity (Lo et al., 2003; Pijlman et al., 2008). A second pseudoknot, PK2, is formed in the repeated structural unit SL-III•SL-IV•CS3. Interestingly, several GNRA-like motifs are found in domain I, suggesting that this region functions as a protein recruiting platform and as a nucleation center for direct RNA-RNA interactions (Sztuba-Solinska et al., 2013). TBFV duplicated stem-loops are Y-shaped (Gritsun and Gould, 2007a), a different type of folding from that seen in MBFV genomes. It is remarkable that their involvement in PK formation with downstream sequences is preserved, emphasizing the functional significance of the PK structural element.
Domain II is moderately conserved and in MBFV and NKV flaviviruses contains a characteristic structure known as a dumbbell (DB); this is involved in the formation of a PK structural element (Figures 3, 4). In the DENV and JEV groups (Table 1), it contains a sequence motif duplicated in tandem (RCS2 and CS2) that forms an essential component of the respective functional dumbbell structures 5′ DB and 3′ DB (Figure 3) (Shurtleff et al., 2001; Silva et al., 2008). YFVs contain a pseudo-dumbbell (ψ-DB) which may be derived from a duplicated DB. In DENV, a pseudoknot structure (PK) has been proposed which involves the highly conserved 5 nt-long motif in the apical loop (top loop, TL1) of the 5′ hairpin in the 5′ DB element plus the corresponding downstream single-stranded complementary sequence (Olsthoorn and Bol, 2001; Sztuba-Solinska et al., 2013). This structural element is likely to be formed in WNV as well (PK3, Figure 3). This architecture is functionally important for viral replication (Men et al., 1996; Lo et al., 2003; Alvarez et al., 2005), translation (Wei et al., 2009; Manzano et al., 2011) and infectivity (Proutski et al., 1999). NKV flaviviruses and dISFVs contain a single copy of the DB structure but there is no evidence of its involvement in a PK structure. In contrast, TBFVs have no DB structure in this region, although they do have two different SLs. The more proximal, which is often duplicated, is known as GC-SL since its loop has a conserved GGC stretch involved in the formation of a PK element. The distal SL, AU-SL, is very stable and its loop contains a conserved AAUU sequence that participates in the formation of a second PK (Villordo et al., 2016).
Domain III is defined by the highly conserved terminal genomic functional elements sHP (short hairpin) and 3′ SL (Figures 3, 4). The presence of both has been confirmed by chemical probing (Brinton et al., 1986; Shi et al., 1996), SHAPE (selective 2′-hydroxyl acylation analyzed by primer extension) (Sztuba-Solinska et al., 2013) and nuclear magnetic resonance (NMR) analysis (Davis et al., 2013). The sHP element of domain III consists of a 5 bp stem and a highly conserved 6 nt apical loop (Brinton et al., 1986; Olsthoorn and Bol, 2001) that resembles the typical GN N RA motif. This suggests that sHP is a potential recruitment region for protein factors or is involved in the establishment of RNA-RNA interactions. Partially overlapping with this sHP element, a highly conserved 24 nt-long sequence (CS1, Figure 3) has been shown to be indispensable for virus replication in DENV (Men et al., 1996). CS1 contains sequences involved in genome cyclization (Wengler and Castle, 1986; Hahn et al., 1987) (see below). The functional requirement of the CS1 nucleotides not involved in cyclization has not been explained. The terminal 3′ SL is an essential structural element with a small number of conserved sequence stretches: the terminal 5′-CU-OH-3′ and surrounding residues (Wengler, 1981; Brinton et al., 1986; Wengler and Castle, 1986) and the apical loop (Elghonemy et al., 2005; Tilgner et al., 2005) (Figure 3). In addition, all flaviviruses show a bulge in the upper portion of the 3′ SL stem (Yu and Markoff, 2005) (Figures 3, 4). This bulge induces a bend in the duplex, which might be required for NS5 protein recognition. The functions of sHP and 3′ SL have been studied in depth; both elements are essential for viral replication (Blackwell and Brinton, 1995; Men et al., 1996; Khromykh and Westaway, 1997; Zeng et al., 1998; Bredenbeek et al., 2003; Villordo et al., 2010; Villordo and Gamarnik, 2013) and the completion of the viral cycle (Brinton et al., 1986; Hahn et al., 1987; Zeng et al., 1998; Khromykh et al., 2001; Alvarez et al., 2005; Tilgner et al., 2005; Yu and Markoff, 2005; Yu et al., 2008). They likely perform their functions by interacting with non-structural viral proteins (Chen et al., 1997a) and cellular factors such as eEF1A (Blackwell and Brinton, 1997; Davis et al., 2013), the La autoantigen (De Nova-Ocampo et al., 2002) and PTB (polypyrimidine tract binding protein) (De Nova-Ocampo et al., 2002). The role of 3′ SL during viral translation initiation has also been widely studied, but the results obtained have been discrepant (Li and Brinton, 2001; Holden and Harris, 2004; Alvarez et al., 2005). Since flaviviruses do not bear a poly(A) tail, it has been proposed that 3′ SL contributes to the recruitment of poly(A) tail binding protein (PABP) (Polacek et al., 2009b), and subsequently to ribosome recruitment and assembly. Finally, orthologous domains in DENV might be related to disease outcome (Mangada and Igarashi, 1997; Leitmeyer et al., 1999), suggesting a role for 3′ structural domains in virulence.
FIGURE 5 | Long-range RNA-RNA contacts in the circular flavivirus genome. The diagram shows the circular conformation of representative flavivirus genomes mediated by long-range RNA-RNA interactions. Colored boxes indicate the interacting sequences involved in genome cyclization. Lines above them represent their length in WNV (solid black lines), DENV-2 (dotted lines), and YFV (dashed lines). The sequence motifs within the 5′ and the 3′ ends are represented below the diagram for these MBFV models. Sequences correspond to the Kunjin virus MRM 61C strain (GenBank accession number L24511.1 for 5′ UTR and L24512.1 for 3′ UTR), the dengue virus serotype 2 (DENV-2) 16681 strain (GenBank accession number NC_001474) and the yellow fever virus (YFV) 17D vaccine strain (GenBank accession number X03700.1), respectively. The diagram also shows where the viral RdRp (yellow) binds at the genome ends.

SUBGENOMIC FLAVIVIRUS RNAs
In addition to the accumulation of genomic RNA during flavivirus infection, subgenomic, non-coding flavivirus RNA molecules (sfRNAs) of 300 to 500 nt in length accumulate in the cytoplasm (Urosevic et al., 1997; Lin et al., 2004; Pijlman et al., 2008). These molecules are the result of incomplete digestion of the viral genome by the host cell 5′-3′ exoribonuclease Xrn1, which degrades the viral RNA but stalls at defined locations in the highly folded 3′ UTR (Pijlman et al., 2008) (Figure 4). This resistance to Xrn1 activity is dependent on specific residues; these have been elucidated for WNV (Pijlman et al., 2008; Funk et al., 2010), YFV (Silva et al., 2010), DENV-2 (Chapman et al., 2014b), and MVEV (Chapman et al., 2014a), and are confirmed to be conserved across flaviviruses (Chapman et al., 2014a). Such residues share a common structural environment defined by a three-way junction and a characteristic and conserved pseudoknot, PK1 (Pijlman et al., 2008; Chapman et al., 2014b) (Figures 3, 4), which is essential for sfRNA generation. In the DENV and JEV groups (Table 1), this structure has been located within the SL-I and SL-II of Domain I of the 3′ UTR, respectively (Pijlman et al., 2008; Funk et al., 2010) (Figures 4A,B), while in YFV the single SL (SL-E) provides the stalling point (Silva et al., 2010) (Figure 4C). In MBFVs, it has been reported that the abrogation of PK1 leads to the production of shorter species of sfRNAs derived from Xrn1 stalling at the downstream pseudoknot structures PK2 [SL-II in DENV-2 (Figure 4B), SL-IV in JEV (Figure 4A) and ψ-DB in YFV (Figure 4C)] and/or PK3 [5′ DB in JEV and DENV and DB in YFV (Figure 4)] (Pijlman et al., 2008; Funk et al., 2010; Chapman et al., 2014a). Consecutive pseudoknots therefore appear to act as security or check points to ensure the production of sfRNAs. The three-way junction organizes the three-dimensional folding by bringing the basal stem and the 3′ apical loop of the structure close together, yielding a ring-like topology with the free 5′ end inside it, as determined by X-ray crystallography (Chapman et al., 2014a). Thus, rather than providing a simple unfolding mechanism, Xrn1 turns the ring inside-out to provide access to the susceptible residues at the 5′ end. This architecture may also be responsible for the selection of directionality during extension by viral polymerase (Chapman et al., 2014a).
From a functional point of view, full-length sfRNAs play important roles in regulating the switch between translation and replication during the infectious cycle (Lin et al., 2004). They promote cytopathic effects and pathogenicity in mice (Pijlman et al., 2008; Funk et al., 2010; Chapman et al., 2014a; Liu et al., 2014) and they disrupt the generation of a proper immune response at different levels, while shortened sfRNA species lead to attenuated viral forms. In particular, full-length sfRNAs inhibit the antiviral activity of IFN-α/β by an unknown mechanism (Schuessler et al., 2012), as well as that of the antiviral RNAi pathway, probably by acting as Dicer decoy substrates (Schnettler et al., 2012). Intrinsic to sfRNA formation, Xrn1 function is inhibited and, thus, endogenous mRNAs are accumulated (Moon et al., 2012). Moreover, DENV-2 sfRNA has been shown to interact with stress granules (Bidet and Garcia-Blanco, 2014).
Detailed information on the roles of sfRNAs is provided in recent reviews (Roby et al., 2014; Clarke et al., 2015; Charley and Wilusz, 2016).

GENOMIC CYCLIZATION IN FLAVIVIRUS
The acquisition of a circular conformation in viral RNA genomes is a successful strategy that provides important advantages in the completion of the infective cycle. First of all, it efficiently ensures the propagation of undamaged, full-length genomes (Hahn et al., 1987). Further, the initiation of protein synthesis and the replication process is governed by the establishment of a closed-loop topology. Transitions between different steps of the viral cycle are directly dependent on the existence of complex networks of RNA-RNA contacts.
The acquisition of the circular topology is mediated by direct, long-distance RNA-RNA interactions between different complementary sequence motifs at the 5′ and 3′ ends of the genome (Figure 4). Such interactions have been probed by psoralen/UV crosslinking assays (You et al., 2001), electrophoretic mobility shift assays (Alvarez et al., 2005; Zhang et al., 2008a), atomic force microscopy (Alvarez et al., 2005), and structure probing (Dong et al., 2008; Polacek et al., 2009a). Though some of the complementary sequence motifs involved in genome cyclization show low conservation rates across the flaviviruses, the circularization mechanism is ubiquitous and required for flaviviral propagation (Khromykh et al., 2001; Song et al., 2008).
In MBFVs, at least three pairs of sequence motifs have been shown to participate in the cyclization process (Figures 4, 5). These include:
(i) A highly conserved motif, the so-called 3′ cyclization sequence (3′ CYC), which is included in the conserved sequence CS1 (Figure 3) just upstream of the sHP domain at the 3′ terminus of the genome. It contains an 8 nt-long stretch conserved across the MBFVs. The 3′ CYC perfectly matches its complementary partner in the cHP domain at the extreme 5′ end (5′ CYC) (Hahn et al., 1987) (Figure 2). The 5′-3′ CYC interaction (Figure 4) must be preserved for efficient virus replication (Khromykh et al., 2001; Corver et al., 2003; Lo et al., 2003; Kofler et al., 2006). Different studies have reported sequence preferences in the 5′-3′ CYC pairs (Suzuki et al., 2008; Basu and Brinton, 2011). Flipping specific base pairs can have different effects on virus replication depending on their position within the interacting domain. Mutations affecting the central positions of the CYC sequence, but with maintained base pairing at the points of 3′-5′ interaction, have little or no effect on replication, whereas base pairs flipped in the terminal positions severely affect viral replication. The terminal and flanking CYC residues seem to be critical for initiating the interaction between the complementary sequences and for preserving the stability of the replication-competent circular form.
(ii) The UAR pair, which involves residues upstream of the AUG start codon at the 5′ end of the viral genome, 5′ UAR, and a complementary sequence located within the basal portion of the stem in the 3′ SL element, 3′ UAR (Alvarez et al., 2005; Zhang et al., 2008c) (Figures 2-4). It has been suggested that switching from the formation of the stem to the long-distance interaction with the 5′ UAR releases the 3′ terminus of the viral genome for recognition by the flaviviral RNA polymerase (NS5) during the initiation of minus-strand RNA synthesis (Zhang et al., 2008c; Polacek et al., 2009a; Filomatori et al., 2011; Davis et al., 2013) (Figure 5).
(iii) The DAR sequence motifs. In the DENV group, a single sequence motif, 5′ DAR, within the linker between the SLB and cHP stems (at the 5′ end of the genome) interacts with the corresponding complementary 3′ DAR sequence (included in the CS1 sequence) within the sHP stem at the genome 3′ terminus. In the JEV group (Table 1), two DAR motifs have been described, 5′ DAR I and 5′ DAR II, within the stem and the base of the SLB domain, which interact, respectively, with 3′ DAR I and 3′ DAR II (Dong et al., 2008; Friebe and Harris, 2010; Friebe et al., 2011) (Figures 2-5).
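The base-pairing logic behind these motif pairs, and the base-pair "flipping" experiments described for the CYC sequences, can be illustrated with a minimal Python sketch. The sequences below are hypothetical 8 nt placeholders, not real flavivirus motifs, the function names are invented for illustration, and only Watson-Crick pairs are considered (G-U wobble pairs are ignored for simplicity).

```python
# Illustrative sketch only: sequences are hypothetical, not real CYC motifs.
# Both strands are written 5'->3' and paired in antiparallel orientation.
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}


def paired_positions(five_motif: str, three_motif: str) -> list[bool]:
    """Return, for each position of the 5' motif, whether it pairs with its partner."""
    if len(five_motif) != len(three_motif):
        raise ValueError("motifs must have the same length")
    return [WATSON_CRICK[a] == b for a, b in zip(five_motif, reversed(three_motif))]


def is_fully_complementary(five_motif: str, three_motif: str) -> bool:
    """True when every position of the duplex forms a Watson-Crick pair."""
    return all(paired_positions(five_motif, three_motif))


def flip_pair(five_motif: str, three_motif: str, i: int) -> tuple[str, str]:
    """Swap the two bases of pair i between strands, so base pairing is kept."""
    j = len(three_motif) - 1 - i  # partner position on the 3' motif
    a, b = five_motif[i], three_motif[j]
    return (five_motif[:i] + b + five_motif[i + 1:],
            three_motif[:j] + a + three_motif[j + 1:])


five_cyc = "GCAUAUUG"   # hypothetical 5' motif
three_cyc = "CAAUAUGC"  # hypothetical 3' motif (its reverse complement)
flipped_five, flipped_three = flip_pair(five_cyc, three_cyc, 3)
```

A flipped pair changes the sequence of both strands while leaving the duplex fully paired, which is what allows sequence preference to be separated experimentally from base-pairing requirements.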
During the initiation of minus-strand RNA synthesis, NS5 first recognizes the SLA element and the 5′ DAR II in the context of a circular RNA, and interacts with 3′ DAR I and II, probably leading to the initiation of viral replication (Dong et al., 2008) (Figure 5). These findings suggest a role for protein recruitment in DAR interactions and the subsequent genome cyclization process.
Data derived from structural, phylogenetic and functional analyses have allowed a theoretical model of the genomic cyclization process to be proposed (Friebe et al., 2011). According to this model, cyclization is initiated via the interaction between the 5′ and the 3′ CYC motifs (Polacek et al., 2009a). The duplex then further extends via the DAR contacts, which "open" the sHP element (Friebe and Harris, 2010; Friebe et al., 2011). Additional UAR-mediated interactions help to unwind the basal portion of the 3′ SL domain to further promote conformational rearrangements within the 3′ end of the viral genome. Recently, a cis-acting element present in the capsid coding sequence of DENV was found to interact with the 5′ DB in the 3′ UTR, forming a PK structural element. This interaction was shown to have a different effect on viral RNA replication in mosquito and mammalian cells (de Borba et al., 2015).
In TBFVs, genomic cyclization occurs through the formation of at least two long-distance interactions, 5′-3′ CSA and 5′-3′ CSB (Mandl et al., 1993; Khromykh et al., 2001). The sequence motifs involved in these interactions are unrelated to those in MBFVs. The 5′-3′ CSA interaction is the equivalent of the 5′-3′ UAR interaction in MBFVs, despite being located at a different position (Mandl et al., 1993), and is also crucial for RNA synthesis (Khromykh et al., 2001). The 5′ CSB and 3′ CSB motifs are located at genomic positions similar to 5′ CYC and 3′ CYC in MBFVs, but their interaction is not essential in TBFV replication (Kofler et al., 2006). Genome circularization in TBFVs is also enhanced by a kissing-loop contact involving two stem-loops, 5′ SL6 and 3′ SL3, located in the capsid coding region at the 5′ and 3′ ends, respectively (Tsetsarkin et al., 2016). The 5′ SL6 domain was previously shown to be required for efficient replication (Tuplin et al., 2011).
In NKV flaviviruses, two interactions have been predicted to be involved in genome cyclization. The first involves a sequence motif located upstream of the AUG start codon and a complementary one within the 3′ SL; the second is established between a motif within the capsid coding region and the corresponding counterpart upstream of the 3′ SL (Leyssen et al., 2002).
In addition to the RNA-RNA interactions, genomic cyclization might be stabilized by viral and host protein factors recruited by different genomic RNA structural domains (Blackwell and Brinton, 1997; Ta and Vrati, 2000; De Nova-Ocampo et al., 2002; Garcia-Montalvo et al., 2004). These factors include the La protein (De Nova-Ocampo et al., 2002; Vashist et al., 2009), polypyrimidine-tract binding protein (PTB) (De Nova-Ocampo et al., 2002; Kim and Jeong, 2006) and translation elongation factor 1α (eEF-1α) (De Nova-Ocampo et al., 2002). Interestingly, such proteins are involved, to different extents, in the progression of the translation process, which points to cyclization as a feasible strategy to control viral protein synthesis. Different RNA helicases, such as FBP1 (far upstream element-binding protein), DDX3, DDX5, and DDX6, have also been shown to bind to both the 5′ and the 3′ UTRs of the flavivirus genome, and to affect replication in opposite ways (Chien et al., 2011; Ward et al., 2011; Li et al., 2014). These findings indicate that the control of the cyclization event is mediated by the recruitment of host factors by the RNA. They also show that flaviviruses can use genome cyclization to regulate transitions between different steps of the infective cycle. Finally, host proteins related to mRNA splicing, such as hnRNPA2 (Katoh et al., 2011) or Lsm1 (Dong et al., 2015), interact with the cyclization sequence motifs and/or with functional RNA domains located in the untranslated regions. The recruitment of these proteins is required for an efficient viral replication process, though their molecular mechanism is still unknown.
A proper balance between linear and circular forms of the genome is required to ensure the initiation of plus-strand RNA synthesis, encapsidation, and even the switch from translation to replication. This is because sequences involved in the cyclization process overlap with essential structural domains that cannot be formed in the circular topology. In this context, the thermodynamic stability of the individual structural domains is critical for efficient transition from one conformation to another. Thus, mutations that stabilize the circular or the linear form spontaneously revert to "less stable" architectures (Clyde et al., 2008; Villordo et al., 2010; Iglesias et al., 2011). Genome cyclization thus operates as a control system regulating the progression of the flavivirus infective cycle.

NUCLEIC ACIDS TARGETING FLAVIVIRUS GENOMES
The information and functions encoded in structural genomic RNA domains render interference with the proper folding of these elements a candidate means of interfering with viral propagation. In this context, the use of nucleic acids as therapeutic agents is of growing interest. The development of any such therapy, however, must overcome a number of challenges, including the maintenance of the stability of the nucleic acid agents and efficient delivery to the target cell. These problems have been largely addressed by combinatorial chemistry, and a range of chemical nucleotide substitutions are now available. The use of chemically modified oligonucleotides has resulted in the improvement of the pharmacodynamic and pharmacokinetic properties of these nucleic acid-based antiviral agents (Haasnoot and Berkhout, 2009).
In recent decades, pioneering work on antisense oligonucleotide-based inhibitors has laid the ground for the design of such antiviral compounds (Haasnoot and Berkhout, 2009). Antisense oligonucleotides are short nucleic acids with sequences complementary to those of their targets. They interfere with the function of essential regions within RNA molecules by different mechanisms. The first attempt to design antisense oligonucleotides against a flavivirus RNA genome used the DENV genome as a model (Raviprakash et al., 1995). A set of propynyl-phosphorothioate-modified antisense oligonucleotides targeting five regions throughout the viral RNA showed that interfering with the sequence motif surrounding the translation initiation codon and the SL-IV domain within the 3′ UTR was an effective antiviral strategy in cell culture.
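Because an antisense oligonucleotide is defined by its complementarity to the target, its base sequence follows directly from the targeted region. The minimal sketch below derives an unmodified DNA oligonucleotide as the reverse complement of a window of an RNA target. The target sequence, window coordinates, and function name are hypothetical, and chemical modifications (e.g., phosphorothioate backbones) are not modeled.

```python
# Illustrative sketch only: the target sequence and coordinates are hypothetical.
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def antisense_oligo(target_rna: str, start: int, length: int) -> str:
    """Return the DNA antisense oligo (5'->3') for target_rna[start:start+length]."""
    if start < 0 or length <= 0 or start + length > len(target_rna):
        raise ValueError("window falls outside the target sequence")
    region = target_rna[start:start + length]
    # Reverse the region and complement each base so the oligo reads 5'->3'.
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(region))


# Hypothetical target around a start codon (AUG begins at index 2).
target = "GGAUGAACAACC"
oligo = antisense_oligo(target, start=2, length=6)
```

In practice, oligonucleotide length, target accessibility within the folded RNA, and backbone chemistry all affect efficacy; none of these are captured by this sequence-level sketch.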
These preliminary but promising results in DENV prompted further efforts to develop antisense oligonucleotides against other flaviviruses, such as WNV. The use of phosphorodiamidate morpholino oligomers (PMOs) as potential anti-WNV drugs has been reported. Two oligonucleotides targeting the extreme 5′ end of the viral genome and the 3′ CYC motif were found to efficiently interfere with viral translation and replication. Further, the conjugation of these PMOs at their 5′ end with an arginine-rich peptide (PPMO) improved uptake by cells, yielding an agent capable of strongly suppressing the viral cycle. It was suggested that the high conservation rate of the targeted regions allowed the design of sets of PPMOs targeting a spectrum of related flaviviruses belonging to the JEV group (Deas et al., 2007). This could lead to important advances in the use of nucleic acid-based compounds, not only as inhibitory molecules but also as biotechnological tools for the detection of different viruses in biological samples. In addition, modified PMOs could help us understand the molecular mechanisms underlying the function of the targeted structural RNA domains.
Cellular RNA interference (RNAi) has also been widely examined in recent years as a means of generating novel antiviral RNA molecules. This strategy is based on the design of short, double-stranded RNA molecules (the so-called small interfering RNAs or siRNAs), which are loaded into the RNA-induced silencing complex (RISC). The antisense (guide) strand of the duplex then guides the complex to the target region, where it base-pairs fully to induce degradation of the target RNA molecule. Like antisense oligonucleotides, siRNAs can be chemically produced or endogenously synthesized from appropriate expression vectors. Numerous authors have reported the use of siRNAs, both in cell culture and in infected mice, against the coding region of the WNV genome (Mccown et al., 2003; Bai et al., 2005; Geiss et al., 2005; Kumar et al., 2006; Ong et al., 2006, 2008; Yang et al., 2008) and the conserved functional domains within the 3′ UTR (Zhang et al., 2008b; Anthony et al., 2009). The results confirm the potential of this strategy in the development of new antiviral compounds.
The use of RNA or DNA aptamers (short oligonucleotides that efficiently and specifically bind to a target molecule) represents another promising strategy for developing antiviral agents against flaviviruses. They also provide an interesting means of developing molecular tools for deciphering the functional role of genomic structural elements, and therefore for identifying potential therapeutic targets. This has already been shown for other, closely related viruses such as HCV (Marton et al., 2011, 2013; Fernández-Sanlés et al., 2015) as well as non-related viruses such as HIV (Sánchez-Luque et al., 2014). Aptamers can be chemically modified quite easily to increase their stability and improve their efficiency.
The successful clinical use of any of the above strategies is conditioned by the appearance of resistant mutants. In fact, WNV particles resistant to PMOs targeting the conserved 3′ UAR sequence motif have already been isolated (Zhang et al., 2008b). They contained a single nucleotide mutation in the target sequence that impaired or weakened the PMO interaction, while the 5′ UAR-3′ UAR base-pairing was restored by the selection of a compensatory mutation. Novel strategies are therefore required, based on combining antiviral compounds with different specificities, including recognition of specific structural features, and even different mechanisms of action. In this context, the use of antisense oligonucleotides, siRNAs and other nucleic acid molecules (e.g., aptamers) in combination with other drugs, such as interferon or neutralizing antibodies, may provide effective and potent antiviral cocktails.
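The escape route described above, a target-site mutation that weakens oligonucleotide binding plus a compensatory change that restores the long-range duplex, can be made concrete with a small Python sketch. All sequences are hypothetical 7 nt placeholders rather than real UAR motifs, the names are invented for illustration, and the oligonucleotide is written in the RNA alphabet as a simplification.

```python
# Illustrative sketch only: sequences are hypothetical, not real UAR motifs.
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}


def mismatches(strand_a: str, strand_b: str) -> int:
    """Count non-Watson-Crick positions when two 5'->3' strands pair antiparallel."""
    if len(strand_a) != len(strand_b):
        raise ValueError("strands must have the same length")
    return sum(WATSON_CRICK[a] != b for a, b in zip(strand_a, reversed(strand_b)))


three_uar = "GAUCUCU"        # hypothetical oligonucleotide target
five_uar = "AGAGAUC"         # hypothetical partner, fully complementary
oligo = "AGAGAUC"            # oligonucleotide complementary to the target

mutant_three_uar = "GACCUCU"   # single U->C change at index 2 of the target
compensated_five_uar = "AGAGGUC"  # A->G at the paired position (index 4)

wild_type = (mismatches(oligo, three_uar), mismatches(five_uar, three_uar))
escape = (mismatches(oligo, mutant_three_uar),
          mismatches(compensated_five_uar, mutant_three_uar))
```

Here each tuple holds the mismatch count for the oligonucleotide-target duplex followed by that for the long-range duplex: in the escape variant the oligonucleotide loses one pair while the viral duplex is again fully paired, which is the pattern that motivates combining agents with different targets.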

CONCLUDING REMARKS
The acquisition of compact genomes was an important evolutionary achievement of RNA viruses; these genomes can store all the information required for the completion of the infectious cycle in reduced packages. This is possible due to the existence of a supracoding system beyond the nucleotide sequence, defined by discrete, folded domains and higher-order structures. These elements operate both alone and in combination to create complex networks of contacts that regulate multiple steps of the viral cycle, and to recruit host and viral factors. Understanding how host-virus interactions shape viral evolution will help to elucidate the factors that govern the emergence of new viruses and the expansion of already known RNA viral pathogens. The lack of techniques or experimental approaches to determine RNA structure and to analyze the kinetics of RNA-RNA interactions in cell culture, together with the lack of experimental strategies to specifically interfere with the folding of the RNA genomic elements, represents an important limitation for understanding their function in the viral cycle. Importantly, the phylogenetic conservation of the genomic RNA structural domains and their interactions across members of the genus Flavivirus provides potential targets for novel antiviral compounds that are alternative and complementary to the viral proteins. Advances made in the field of nucleic acid synthesis have provided excellent candidate molecules for fighting RNA viruses by interfering with the essential functions performed by their genomic functional domains. Different pharmaceutical companies are now investigating the potential of nucleic acid therapeutic strategies, assessing long-term antiviral responses and trying to minimize secondary effects.

AUTHOR CONTRIBUTIONS
All authors participated in writing the manuscript (led by CR-L and AB-H), commented upon, and approved its final version. AF-S and PR-M prepared the figures.

FUNDING
Work in our laboratory is supported by the Spanish Ministerio de Economía y Competitividad (BFU2012-31213 and BFU2015-64359-P) and the Consejería de Economía, Innovación, Ciencia y Empleo, Junta de Andalucía (CVI-7430). It is also partially funded by FEDER funds from the EU.