In-depth Bioinformatic Analyses of Human SARS-CoV-2, SARS-CoV, MERS-CoV, and Other Nidovirales Suggest Important Roles of Noncanonical Nucleic Acid Structures in Their Lifecycles

Noncanonical nucleic acid structures play important roles in the regulation of molecular processes. Considering the importance of the ongoing coronavirus crisis, we decided to evaluate genomes of all coronaviruses sequenced to date (stated more broadly, the order Nidovirales) to determine if they contain noncanonical nucleic acid structures. We discovered much evidence of putative G-quadruplex sites and even much more of inverted repeats (IRs) loci, which in fact are ubiquitous along the whole genomic sequence and indicate a possible mechanism for genomic RNA packaging. The most notable enrichment of IRs was found inside 5′UTR for IRs of size 12+ nucleotides, and the most notable enrichment of putative quadruplex sites (PQSs) was located before 3′UTR, inside 5′UTR, and before mRNA. This indicates crucial regulatory roles for both IRs and PQSs. Moreover, we found multiple G-quadruplex binding motifs in human proteins having potential for binding of SARS-CoV-2 RNA. Noncanonical nucleic acids structures in Nidovirales and in novel SARS-CoV-2 are therefore promising druggable structures that can be targeted and utilized in the future.


Introduction
The order Nidovirales is a monophyletic group of animal RNA viruses. This group can be divided into the six distinct families of Arteriviridae, Coronaviridae, Mesnidovirineae, Mononiviridae, Ronidovirineae and Tobaniviridae. All known Nidovirales have single-stranded, polycistronic RNA genomes of positive polarity (Modrow et al., 2013). Due to the Severe acute respiratory syndrome- China), this viral group is now extensively studied (Hung, 2003;Cowling et al., 2015;Oboho et al., 2015;Li et al., 2020).
The small size of RNA virus genomes is in principle linked to their limited ability for RNA synthesis, which is directly connected to a replication complex containing RNA-dependent RNA polymerase (RdRp) without reparation mechanisms (Gorbalenya et al., 2006). These RNA viruses can generate one mutation per genome per replication round (Drake and Holland, 1999). This combination of features means that RNA viruses are able to adapt to new environmental conditions, but they are limited in expanding their genomes because they must keep their mutation load low so that their survival is possible (Eigen and Schuster, 1977;Lauring et al., 2013;Carrasco-Hernandez et al., 2017).
Nidovirales bind to their host cell receptors on the cell surface, after which fusion of the viral and cellular membranes is mediated by one of the surface glycoproteins. This mechanism, which progresses in the cytoplasm or endosomal membrane, releases the nucleocapsid into the host cell's cytoplasm.
After genome "transportation," translation of two replicase open reading frames (ORFs) is initiated by host ribosomes. This results in large polyprotein precursors that undergo autoproteolysis to eventually produce a membrane-bound replicase/transcriptase complex. This complex initiates synthesis of the genome RNA and controls the synthesis of structural and some other proteins. New virus particles are assembled by association of the new genomes with the cytoplasmic nucleocapsid protein and subsequent envelopment of the nucleocapsid structure. Subsequently, the viral envelope proteins are inserted into intracellular membranes and targeted to the site of virus assembly (most often membranes between the endoplasmic reticulum and Golgi complex) and then they meet up with the nucleocapsid and trigger the budding of virus particles into the lumen of the membrane compartment. The newly formed virions then leave the cell by following the exocytic pathway toward the plasma membrane (Lai and Cavanagh, 1997;Gorbalenya, 2001;Snijder et al., 2003;Ziebuhr, 2004;Gorbalenya et al., 2006).
Today, SARS-CoV-2 is being studied intensively by scientists all over the world due to the ongoing 2020 coronavirus pandemic (Cohen, 2020). The origin of this virus is unknown, but recently a two-hit scenario was proposed wherein SARS-CoV-2 ancestors in bats first acquired genetic characteristics of SARS-CoV by incorporating a SARS-like receptor-binding domain through recombination prior to 2009; subsequently, the lineage that led to SARS-CoV-2 accumulated further, unique changes specifically within this domain (Patino-Galindo et al., 2020). As true of SARS-CoV, cell entry by SARS-CoV-2 depends upon angiotensin-converting enzyme 2 (ACE2) and transmembrane protease, serine 2 (TMPRSS2) for viral spike protein priming (Hoffmann et al., 2020).
The genome of SARS-CoV-2 was sequenced and annotated in early January 2020 . A recent study revealed that the transcriptome of SARS-CoV-2 is highly complex due to numerous canonical and noncanonical recombination events. Moreover, it was found that SARS-CoV-2 produces transcripts encoding unknown ORFs and at least 41 potential RNA modification sites with an AAGAA motif were discovered in its RNA (Kim et al., 2020).
G-quadruplex binding proteins (QBPs) play crucial roles in many signaling pathways, including such biologically highly relevant activities as cell division, dysregulations of which lead to cancer development (Wu et al., 2008;Brázda et al., 2014). QBPs have been found to be involved in various viral infection pathways. An interesting example is HIV-1 nucleocapsid protein NCp7. It has been described how this protein helps to resolve an otherwise very stable G-quadruplex structure in viral RNA that stalls reverse transcription (Butovskaya et al., 2019). Various G-quadruplex-forming aptamers are used as drugs against many different viral proteins, suggesting a prominent role for QBPmediated regulation. Among many other viruses, in particular Hepatitis C virus, HIV-1, and SARS-CoV are targets of these G-quadruplex-forming aptamers (Platella et al., 2017). Quadruplex binding domain has been found in nonstructural protein 3 (Nsp3) (Lei et al., 2018). This so-called SARS-UNIQUE-DOMAIN (SUD), and especially its M subdomain, was observed to be essential for SARS replication in host cells. Deletion or even substitution mutations in key RNA-interacting amino acids were shown to result in viral inability to replicate within host cells (Kusov et al., 2015). Moreover, this subdomain was found also in MERS and several other coronaviruses (Kusov et al., 2015).
G-quadruplexes are secondary nucleic acid structures formed in guanine-rich strands (Burge et al., 2006;Vorlíčková et al., 2012;Kolesnikova and Curtis, 2019). These have been detected in various This is a provisional file, not the final typeset article genomes, but most extensively they have been described in human genomes (Chambers et al., 2015;Bedrat et al., 2016;Hänsel-Hertsch et al., 2016). They are present also in viruses (Lavezzo et al., 2018;Frasson et al., 2019). G-quadruplexes probably play an important role in regulating replication in most viral nucleic acids (Lavezzo et al., 2018), and these structures have been suggested as targets for antiviral therapy (Métifiot et al., 2014;Ruggiero and Richter, 2018). Along with cruciforms and hairpins, which can be formed in nucleic acids by inverted repeats (IRs), G-quadruplexes are significant genome elements playing specific biological roles. They are involved, for example, in the regulation of DNA replication and transcription (Bagga et al., 1990;Limanskaya, 2009;. It has been demonstrated that IRs are important for various processes in viruses, including their genome organization (Li and Li, 2010;Ishimaru et al., 2013;Xie et al., 2017;Bridges et al., 2019). Another interesting RNA motif that has been used as a drug target and was found in SARS-CoV targeted by 1,4-diazepame is the "slippery sequence" followed by a pseudoknot (Plant et al., 2005). This structure, common among all coronaviruses, works based on ribosomal −1 frameshifting that switches on viral fusion proteins (Plant et al., 2005).
In all prokaryotic and eukaryotic cells, as well as viruses, there have been found sequence motifs such as IR sequences forming cruciforms and hairpins or G-rich sequences that form G-quadruplexes Lavezzo et al., 2018;Bartas et al., 2019). In the present research, we conducted a systematic and comprehensive bioinformatic study searching for the occurrence of IRs and putative quadruplex sites (PQSs) within the genomes of all known Nidovirales. The aim was to find one or more potential druggable RNA targets to address the present COVID-19 threat (Figure 1). FIGURE 1 | When outside host cells, the RNA of SARS-CoV-2 is highly spiralized by means of nucleocapsid phosphoproteins (top left). Once RNA is released into the cytoplasm and during replication/transcription processes, the formation of G-quadruplex and/or hairpins may take place (right). Stabilization of these structures by existing drugs may play a crucial role in inhibiting the viral lifecycle (bottom left).

Genome Source
Linear genomes of 109 viruses from the order Nidovirales were downloaded from the genome database of the National Center for Biotechnology Information (NCBI). Full names, phylogenetic groups, exact NCBI accession numbers, and further information are summarized in Supplementary Material 1.

Nidovirales Phylogenetic Tree Construction
Exact taxonomic identifiers of all analyzed Nidovirales species (obtained from Taxonomy Browser via NCBI Taxonomy Database [Drake and Holland, 1999]) were downloaded to phyloT: a tree generator (http://phylot.biobyte.de) and a phylogenetic tree was constructed using the function "Visualize in iTOL" in the Interactive Tree of Life environment (Letunic and Bork, 2019).

Analyses of PQSs Occurrence in Nidovirales Sequences
Nidovirales sequences were analyzed using our G4Hunter Web Tool . The software is capable to read the NCBI identifier of the sequences uploaded in a .csv file. The parameters for G4Hunter were set to 25 as window size and G4Hunter score above 1.2. The results for each analyzed sequence contained information about the size of the genome and number of putative PQSs.
All the results were merged into a single Microsoft Excel file where statistical analysis was then made.
We also downloaded features tables of each virus from the NCBI database. These tables contain information about known features in the genome of each species. We searched the occurrence of Gquadruplex-forming sequences before, inside, and after the specific features of each genome using a predefined feature neighborhood ± 100 nucleotides (nt

Analyses of IRs Occurrence in Nidovirales Sequences
All Nidovirales genomes were analyzed by the core of our Palindrome analyser webserver (Brázda et al., 2016). The software was modified to read NCBI identifiers of sequences from text files. The size of IRs was set to 6-30 nt, size of spacers to 1-10 nt, and maximally one mismatch was allowed. A separate list of IRs in each of the 109 sequences and an overall report were exported. The overall report contained the lengths of the analyzed sequences, total number of IRs found, numbers of IRs grouped by size of IR (6-30 nt), and sum of IRs longer than 8 nt, 10 nt and 12 nt. The software also counted frequency of IRs in each sequence. Frequencies of IRs were normalized per 1,000 nt. Features tables of 109 Nidovirales genomes were downloaded from the NCBI database and grouped by their names as stated in the feature table file. Analyses of IRs occurrence inside and around (before and after) these features was performed. The search for IRs took place in predefined feature neighborhoods ± 100 nt around and inside feature boundaries. We calculated the numbers of all IRs and of those longer than 8, 10, and 12 nt in regions before, inside, and after the features.

RNA Fold Predictions
In order to be able to display higher structures of the coronavirus genome, we used Galaxy's freeonline webserver (Afgan et al., 2018) and its RNA fold tool (Lorenz et al., 2011). This tool allows quick calculation of minimum free energy of secondary structures. We left the default parameters (Temperature 37 °C, Unpaired bases to participate in all dangling ends, Naview layout). We used SARS-CoV-2 genomic RNA sequence (NC_045512.2) in FASTA format as the input format. The output data were then displayed using the RNA plot tool [3]. We again left the default parameters (Naview layout, Output format Postscript .ps). We then downloaded the displayed secondary structures in high-resolution format. The raw data are provided in Supplementary Material 5.

Prediction of Human RNA-Binding Proteins Sites in SARS-CoV-2 RNA
The human SARS-CoV-2 RNA sequence was downloaded from NCBI (accession NC_045512.2) in FASTA format and inserted into the RBPmap (Mapping Binding Sites of RNA-binding proteins) webbased tool (Paz et al., 2014). The database of 114 human experimentally validated motifs was used.
Both the "High stringency" and "Apply conservation filter" options were used. The output was further filtered in Excel to keep only those hits below p-value = 1.10 -6 . The complete results are provided in Supplementary Material 6.

Phylogenetic Relationships in Nidovirales
According to the current state of knowledge, the order Nidovirales can be divided into six distinct families, and there are two unclassified species (Figure 2). Coronaviridae is the largest family and consists of 56 species. Among these are 12 species able to infect humans, including SARS-CoV, MERS-CoV, and SARS-CoV-2. Second largest is the Arteriviridae family with 22 species, third largest is the Mesnidovirineae family with 13 species, fourth is the Tobaniviridae family with 11 species, Local DNA structures in Nidovirales 8 This is a provisional file, not the final typeset article including 1 species able to infect humans. Fifth largest is the Ronidovirineae family with 5 species, and sixth is the Mononiviridae family with only 1 species.

FIGURE 2 | Phylogenetic relationships in
Nidovirales. SARS-CoV-2 is not yet recognized by iToL, but, based upon current evidence, it probably will be placed close to the Bat Hpbetacoronavirus/Zheijang2013 and Severe acute respiratory syndrome-related coronavirus (SARS-nCoV) branches (Lu et al., 2020).

Variation in PQS Frequency in Nidovirales
We analyzed the occurrence of PQSs using G4Hunter in all 109 known genomes of Nidovirales. , and there were no PQSs above the 1.6 G4Hunter score threshold. In general, a higher G4Hunter score means a higher probability of G-quadruplexes forming inside the PQS (Bedrat et al., 2016). Genomic sequence sizes, total PQS counts, and PQS frequencies characteristics are summarized in Table 1. To test whether the PQS frequency per 1,000 nt in SARS-CoV-2 is significantly greater than in a random sequence, we generated 10 random sequences with length and GC content the same as in SARS-CoV-2. In that test, the mean PQS frequency per 1,000 nt was 0.16 with standard deviation of 0.05. We applied Wilcoxon signed-rank test with continuity correction and the resulting p-value was 0.0028. Thus, the PQS frequency per 1,000 nt in SARS-CoV-2 (0.03) is significantly lower than expected.
All PQSs found in ranges of G4Hunter score intervals and precomputed PQS frequencies per 1,000 nt are summarized in Table 2. The relationship between observed PQS frequency per 1,000 nucleotides and GC content in all analyzed Nidovirales sequences is depicted in Figure 3, the Beihai Nido-like virus has both the highest GC content and highest PQS frequency. Within Nidovirales with Humans as host, however, the highest PQS frequency was found in Breda virus, which does not have the highest GC content in the dataset.

Nidovirales Human All
This is a provisional file, not the final typeset article Figure 4 shows the differences in PQS frequency by annotated loci. We downloaded features for every virus genome and analyzed the presence of PQS in each annotated sequence and in its proximity (before and after). The most notable enrichment of PQSs was located before 3′UTR, inside 5′UTR, before mRNA, and after miscellaneous RNA. The lowest PQS frequencies were found after 3′UTR, before 5′UTR, before and inside miscellaneous RNA, and in miscellaneous features.
FIGURE 4 | Differences in PQS frequency by annotated locus. The figure shows the PQS frequencies between annotations from the NCBI database. We analyzed frequencies of all PQSs inside, before, and after the annotations.

Variation in Frequency of Inverted Repeats in Nidovirales
To find short IRs in Nidovirales genomic sequences, we utilized the core of the Palindrome analyser (Brázda et al., 2016). We used values for Palindrome analyser returning inverted repeats capable to form stable hairpins (size of IRs was set to 6-30 nt, size of spacer to 1-10 nt, maximally one mismatch).
Genomic sequence sizes, total IR counts, and IR frequencies characteristics are summarized in  (Saberi et al., 2018). The lowest IR frequency was noticed in Beihai Nido-like virus 1 (19.23). Noteworthy is that this is the species of Nidovirales with the highest GC content and PQS frequency. Differences in IR frequency by annotated locus are depicted in Figure 5. By families, the highest mean IRs frequency was found in Coronaviridae (37.08) and lowest in Ronidovirineae (26.37).
Novel human coronavirus SARS-CoV-2 has relatively high IR frequency per 1,000 nt, in comparison with other Nidovirales. To test if the frequency of IRs per 1,000 nt in SARS-CoV-2 is significantly greater than in random sequence, we generated 10 random sequences with length and GC content the same as in SARS-CoV-2. The mean frequency of IRs per 1,000 nt was 30.23, with standard deviation 0.22. We applied Wilcoxon signed-rank test with continuity correction to find that p-value was equal to 0.0029. Thus, the IR frequency per 1,000 nt in SARS-CoV-2 (40.23) is significantly higher than expected. When we inspected differences of IR frequencies according to hosts, we found the highest mean IR frequency per 1,000 nt to be in Nidovirales infecting Humans (38.13), then in Vertebrates (34.25), and the lowest mean frequency is in Invertebrates (31.77). This is a provisional file, not the final typeset article A summary of all IRs found in ranges of different IR sizes and precomputed IR frequencies per 1,000 nt is shown in Table 4. Although generally the frequency of IR presence decreases with the IR length, there are notable differences between groups and also between viruses with different hosts. The most as well as longest IRs occur in the Coronaviridae group and in viruses having humans as a host. IRs 12 bp long and longer are very rare in the Ronidovirineae group. The relationship between IRs length and stability of resulting secondary structure is not simple. While some authors believe that longer IRs are more stable, others suggest that there is an energy optimum defined by arm and spacer length (Sinden et al., 1991;Brázda et al., 2016;Georgakopoulos-Soares et al., 2018). This is a provisional file, not the final typeset article FIGURE 5 | Differences in IR frequency by annotated locus. The chart compares IR frequencies per 1,000 nt between "gene" annotation and other annotated locations from the NCBI database that were found in genomes more than 10 times. We analyzed frequencies of all IRs (all) and of IRs with lengths 8 nt and longer (8+), 10 nt and longer (10+,) and 12 nt and longer (12+) within annotated locations (inside) as well as before and after annotated locations.

Prediction of Human SARS-CoV-2 RNA Folding
It is very likely that coronavirus RNA in the viral capsid is compactly folded into a spiral-like structure due to nucleocapsid phosphoprotein N, as was proposed earlier for SARS-CoV (Chang et al., 2014).
But what structure does the RNA takes after the capsid content is released into the cytoplasm of the host cell? We made a global prediction of RNA folding for SARS-CoV-2 and in a negative control (random RNA sequences with the same GC content and length). It was clearly visible that there arose a much more compact RNA structure in SARS-CoV-2 (Figure 6). FIGURE 6 | RNA fold (Lorenz et al., 2011) prediction for SARS-CoV-2 RNA molecule (left) and random sequence negative control (right). This figure shows a high level of SARS-CoV-2 genome folding via complementarity of particular RNA regions and forming of hairpins and/or cruciforms. RNA fold prediction was carried out using default parameters via Galaxy webserver (Afgan et al., 2018), which enables queries of lengths longer than 10,000 nucleotides.

Coronaviruses
Consisting of nearly 2,000 amino acids, Nsp3 is the largest multi-domain protein in Nidovirales.
Nevertheless, Nsp3's role is still largely unknown. It is believed that the protein plays various roles in coronavirus infection. Nsp3 interacts with other viral Nsps as well as RNA to form the replication/transcription complex. It also acts on posttranslational modifications of host proteins to antagonize the host innate immune response. On the other hand, Nsp3 is itself modified in host cells by N-glycosylation and can interact with host proteins to support virus survival (Lei et al., 2018). Both SARS-CoVs (2003 and have the PQSs in their genomes and have a retained M region of the SUD domain in protein Nsp3 that is reported to be critical for interacting with G-quadruplexes (Kusov et al., 2015). MERS has no PQS in its RNA sequence and, strikingly, it has a deletion in this region of the SUD domain suggesting parallel evolution simplifications.
This is a provisional file, not the final typeset article FIGURE 7 | Comparison of SUD domain M region in three pathogenic human coronaviruses. Nsp3 proteins of three pathogenic human viruses (SARS-CoV-2, SARS-CoV, and MERS-CoV) were aligned and part of the critical G-quadruplex binding M region of the SUD domain (525-577 aa according to the SARS-CoV reference sequence) was visualized. As predicted according to Kusov et al. (2015), the G-quadruplex-interacting residues K565, K568, and E571 are highlighted in green (upper consensus panel).

Prediction of Human RNA-Binding Proteins Sites in SARS-CoV-2 RNA
Using the RBPmap tool, we predicted human RNA-binding proteins sites in SARS-CoV-2 RNA. While holding to very stringent thresholds, three highly promising candidate human RNA-binding proteins were predicted. SRSF7 was predicted to bind viral RNA in nucleotide position 26,199 (NC_045512.2), which is in fact exactly the binding motif for this protein (ACGACG). Protein HNRNPA1 was predicted to bind viral RNA in nucleotide position 23,090-23 097 (exact binding motif GUAGUAGU). The last protein found was TRA2A in nucleotide position 3,056-3,065 and position 3,074-3,083, in both cases with one mismatch (the motifs found were GAAGAAGAAG and the experimentally validated motif for TRA2A is GAAGAGGAAG). All three of these proteins share multiple RGG-rich novel interesting quadruplex interaction (NIQI) motifs, which are common to most human G-quadruplex-binding proteins (Brázda et al., 2018;Huang et al., 2018;Masuzawa and Oyoshi, 2020). All find individual motif occurrence (FIMO) hits below p-value = 1.10 -4 are enclosed in Supplementary Material 7. The proteins SRSF7, HNRNPA1, and TRA2A are involved in mRNA splicing via the spliceosome pathway (Szklarczyk et al., 2015).

Discussion
Local DNA structures have been shown to play important roles in basic biological processes, including replication and transcription Travers and Muskhelishvili, 2015;Surovtsev and Jacobs-Wagner, 2018). PQS analyses of human viruses have clearly demonstrated that these sequences are conserved also in the viral genomes (Lavezzo et al., 2018) and could be targets for antiviral therapies (Wang et al., 2015;Krafčíková et al., 2017;Ruggiero and Richter, 2018). For example, the conserved PQS sequence present in the L gene of Zaire ebolavirus and related to its replication is inhibited by interaction with G-quadruplex ligand TMPyP4, and this finding has led to suggestions that G-quadruplex RNA stabilization could constitute a new strategy against Ebola virus disease (Wang et al., 2016). Our results found only one PQS in the genomic sequence of SARS-CoV-2, that having an absolute  (Fehr and Perlman, 2015). It would be of great value to know whether TMPyP4 or other G-quadruplex stabilizing compounds can inhibit replication processes of SARS-CoV-2. RNA hairpins, which are formed by IRs, are basic structural elements of RNA and play crucial roles in gene expression and intermolecular recognition. Conserved palindromic RNA structures have been found in many viral genomes, including HIV-1, and play a crucial role in their replication (Liu et al., 2018). We have found an abundance of IRs inside 5′UTR in Nidovirales genomes. In general, 5′UTR is an important locus for regulation of viral replication and gene expression. It has been demonstrated that stem integrity of phylogenetically conserved stem-loop structure located in 5′UTR of the PRRSV virus from the Arteriviridae family is crucial for viral replication and subgenomic mRNA synthesis. Similar secondary structures have been proposed for several viruses from the Arteriviridae and Coronaviridae families (Lu et al., 2011). Our analyses of annotated features in Figure 5 therefore support this report.
The discrete locations of specific IRs in viral genomes could therefore be additional targets for their regulation.
Targeting viral proteins is usually effective only against specific viral strains and fails even for closely related viral species. It appears that targeting host proteins should be able to provide a response toward a wider spectrum of viruses inasmuch as different viruses exploit common cellular pathways. Many nucleocapsid protein NCp7 helps to resolve G-quadruplex formation and therefore enables virus to spread. Stabilization of this quadruplex has been targeted by several experimental compounds. This treatment slowed or inhibited viral growth (Butovskaya et al., 2019).
Virus reproduction is dependent on the cellular transcription machinery, and therefore the interaction of the cellular proteins with viral RNA could be another target for antiviral therapy (Roberts et al., 2009). Our analysis predicted several QBPs to be capable to bind SARS-CoV-2 RNA. One of these, HNRNPA1 protein, is responsible for nuclear-cytoplasm shuttling (Garcia-Moreno et al., 2018). Like some other RNA-binding proteins, HNRNPA1 forms so-called membraneless organelles. These organelles are assemblies of proteins along with RNA or DNA that condense in specific cellular loci.
The organelles can undergo transition from liquid-like droplets to amyloid fibrils, and mutations in socalled low complexity domains of these proteins lead to formation of amyloid aggregations that are found in many neurodegenerative diseases (Gui et al., 2019). HNRNPA1 is involved in many different cellular processes and has been targeted in various diseases. One such example is the varicella zoster virus that causes chicken pox. Moreover, this response to varicella zoster virus has been connected with autoimmune disease that complicates multiple sclerosis (Kattimani and Veerappa, 2018).
HNRNPA1 is known to bind and resolve G-quadruplex formed in TRA2B promoter and to promote its transcription. Dysregulation of this binding leads to progression of colon cancer. This interaction has been targeted by the well-known G-quadruplex stabilizer pyridostatin, which led to decreased transcription from the TRA2B promoter (Nishikawa et al., 2019). Furthermore, HNRNPA1 is coexpressed with bromodomain and extraterminal domain protein BRD4 in human tumor samples. It has been shown experimentally that the well-described, naturally occurring polyphenolic flavonoid quercetin inhibited this protein and thereby led to better susceptibility to treatment in cancer patients (Pham et al., 2019). Both SRSF7 and TRA2A proteins, which are predicted to interact with SARS-CoV-2 RNA, play roles in alternative splicing (Ghosh et al., 2016). SRSF7 is a serine and argininerich splicing factor and is part of the spliceosome. Its expression has been connected to several types of cancer and it has been shown that its knockdown induced p21 expression and thus reduced cancer development (Saijo et al., 2016). Little is known about TRA2A protein. It was first identified in insects together with its paralog TRA2B, which has been studied to a greater extent . It has been found that TRA2A can promote paclitaxel therapy and promote cancer progression in triplenegative breast cancers (Liu et al., 2017). A role of TRA2A protein in regulation of HIV1 virus replication has been described. Both TRA2A and TRA2B bind to a specific HIV1 sequence and regulate its replication within the cell through alternative splicing of viral RNA (Erkelenz et al., 2013).
Scientists around the world are united in their efforts to find an effective therapy against coronavirus disease . Among the most promising candidates are remdesivir and chloroquine.
Remdesivir is an adenosine analogue that incorporates into nascent viral RNA chains and results in premature termination. Preliminary data have shown that remdesivir effectively inhibited virus infection in a human liver cancer cell line . Chloroquine is a potential broadspectrum antiviral drug and it already has been widely used as a low-cost and safe anti-malarial and autoimmune disease drug for more than 70 years. Application of chloroquine causes elevation of endosomal pH and also interferes with terminal glycosylation of the cellular receptor ACE2. This probably has a negative influence on virus-receptor binding and abrogates the infection (Vincent et al., 2005). Drug repurposing seems like a very good strategy for quickly and at low cost finding a new therapy for new human diseases (Oprea and Mestres, 2012). To date, there are few substances with Gquadruplex stabilizing features. One example is topotecan (also known by its brand name Hycamtin ® ), which frequently is used for treating ovary cancer. If everything else fails, it seems to be an option. On the other hand, topotecan cause unpleasant side effects, such as nausea, vomiting, and diarrhea (Topotecan -Chemotherapy Drugs -Chemocare). Many studies have described strong G-quadruplex stabilization effects (Li et al., 2018;Satpathi et al., 2018), which might be one possible mode of action.
Moreover, it has been proven that higher structures of nucleic acids, and G-quadruplex especially, might be stabilized by use of various natural substances. Berbamine is one such substance and is a component of traditional Chinese medicine. It frequently is used for treating chronic myeloid leukemia or melanoma and has strong binding affinity to G-quadruplex structures (Tan et al., 2014). Viral nucleic acids and their loci with G-quadruplex-forming potential are in all cases very specific and are promising molecular targets for treating serious diseases. Evidence of G-quadruplex formation as a potential target for therapy was proven for Hepatitis A, flu virus, and HIV-1 (Métifiot et al., 2014).
Because all the aforementioned cases concern RNA viruses, the viral G-quadruplexes are localized in the cytoplasm of the host cell. One of the main conclusions to be drawn from this study is that sequences able to form G-quadruplex structures in the Nidovirales (infecting human) family are very rare, and this suggests intentional suppression during evolution to simplify viral RNA replication.
Finding the proper stabilizers of viral higher RNA structures might be crucial for inhibiting or stopping This is a provisional file, not the final typeset article viral RNA replication in order to gain time for the immune system to deal successfully with an infection.