The epitranscriptome of Vero cells infected with SARS-CoV-2 assessed by direct RNA sequencing reveals m6A pattern changes and DRACH motif biases in viral and cellular RNAs

The epitranscriptomics of the SARS-CoV-2 infected cell reveals its response to viral replication. Among various types of RNA nucleotide modifications, the m6A is the most common and is involved in several crucial processes of RNA intracellular location, maturation, half-life and translatability. This epitranscriptome contains a mixture of viral RNAs and cellular transcripts. In a previous study we presented the analysis of the SARS-CoV-2 RNA m6A methylation based on direct RNA sequencing and characterized DRACH motif mutations in different viral lineages. Here we present the analysis of the m6A transcript methylation of Vero cells (derived from African Green Monkeys) and Calu-3 cells (human) upon infection by SARS-CoV-2 using direct RNA sequencing data. Analysis of these data by nonparametric statistics and two computational methods (m6anet and EpiNano) show that m6A levels are higher in RNAs of infected cells. Functional enrichment analysis reveals increased m6A methylation of transcripts involved in translation, peptide and amine metabolism. This analysis allowed the identification of differentially methylated transcripts and m6A unique sites in the infected cell transcripts. Results here presented indicate that the cell response to viral infection not only changes the levels of mRNAs, as previously shown, but also its epitranscriptional pattern. Also, transcriptome-wide analysis shows strong nucleotide biases in DRACH motifs of cellular transcripts, both in Vero and Calu-3 cells, which use the signature GGACU whereas in viral RNAs the signature is GAACU. We hypothesize that the differences of DRACH motif biases, might force the convergent evolution of the viral genome resulting in better adaptation to target sequence preferences of writer, reader and eraser enzymes. To our knowledge, this is the first report on m6A epitranscriptome of the SARS-CoV-2 infected Vero cells by direct RNA sequencing, which is the sensu stricto RNA-seq.


Introduction
The SARS-CoV-2 has a single-stranded positive-sense RNA genome that replicates exclusively in the cytoplasm of the infected cell. In this cycle, viral RNAs are never synthesized from DNA, and therefore, no viral RNAs are ever produced by transcription (Ricardo-Lax et al., 2021). In the infected cell, the virus generates four RNA species: genomic RNA copies generated by replication (positive sense single-strand), subgenomic RNAs (positive-sense) and the corresponding negative-sense intermediates of both genomic and subgenomic RNAs. (V'kovski et al., 2021). The non-structural proteins are directly translated from the positive single-strand genomic RNA and post-translationally processed. The infected cell contains, therefore, viral RNAs (genomic and subgenomic) and cellular transcripts (tRNAs, rRNAs, mRNAs, lncRNAs, microRNAs, etc…).
Cellular and viral RNAs are subjected to chemical modifications that increase the nucleotide diversity from the four canonical bases to over 150 different bases and nucleotides that add important biological features to RNAs and are necessary for proper biological function. Among these modifications the m6A is the most common (Brocard et al., 2017). Both viral and cellular RNAs are substrates for enzymes that add the methyl group (writers), recognize the signals (readers) and remove the methyl groups (erasers). The m6A methylation is associated with the consensus motif "DRACH", where the third position is the N 6 -Methyladenosine flanked by D = G or A or U, R = G or A, and H = A or U or C (Bayoumi and Munir, 2021).
Besides the viral RNA synthesis, the infection by viruses alters the gene expression patterns of host cells (Wyler et al., 2021). It is also expected that the viral infection affects the epitranscriptomics landscape of infected cells and the epigenomic pattern of viral RNAs, by modification of RNA nucleotide moieties, such as methylation. Nucleotide residue modifications are essential for proper functionality of viral RNAs and cellular transcripts by regulating RNA stability, cellular localization, translational control, and immune response escape (Gokhale et al., 2016;Li et al., 2021). Until the advent of the Oxford Nanopore technology for direct RNA sequencing (Oxford Nanopore Technologies, Oxford Science Park, Oxford, UK), the sequence of RNAs were obtained by chemical sequencing of very small RNAs (<100 bases) or deduced from cDNA sequencing, improperly called "RNA-seq" (Mardis and McCombie, 2017;Viehweger et al., 2019). Also, base modifications were identified by harsh chemical treatments, low resolution antibody-based techniques or laborious mutated reverse transcriptase synthesis (Kietrys et al., 2017). The Oxford Nanopore direct RNA sequencing, the true RNA-seq, is simple, low-cost technique that readily identifies canonical and modified nucleotide residues, as they are in vivo, by use of a properly trained computational neural network. Because the direct RNA sequencing does not employ PCR or any other synthesis, it is free from PCR bias, polymerase errors, phasing and fluorophore crosstalk (Pfeiffer et al., 2018).
The Vero cells are largely used for basic and pre-clinical research. In the course of SARS-CoV-2 Pandemic, these cells were vastly used in pre-clinical tests of a wide range of antiviral drug candidates (Agostini et al., 2019;Schultz et al., 2022), vaccine candidates (Folegatti et al., 2020;Loṕez et al., 2021), serving as a main screening method for selection of best candidates to progress on clinical trials. Moreover, the current CoronaVac inactivated virus vaccine (Sinovac Biotech) is produced in this type of cell (Bueno et al., 2021). Also, the gold-standard method to evaluate titles of neutralizing antibodies after COVID-19, or vaccination, use Vero cell as reporter (Sadoff et al., 2021;Wheatley et al., 2021), reinforcing the clinical and pharmacological relevance of these cells. It is expected that the Vero cell epitranscriptome might be different from the human Calu-3 cells but not so significantly, regarding the viral response, because of highly conserved genes and regulatory pathways in primates. In other words, humans and other primates share common traits, which supports the wide use of Vero cells in medical and pharmacological techniques of SARS-CoV-2 culturing and testing.
By using the direct RNA sequencing technique, we identified a major type of base methylation, the N 6 -Methyladenosine, or m6A, in viral RNAs from Vero infected cells (Campos et al., 2021). The Vero cell line is derived from the African Green Monkey, or vervet, and is widely used for in vitro Coronavirus propagation (Jasinska et al., 2013;Warren et al., 2015). Our results are consistent with previous m6A analysis using "nondirect RNA sequencing" or other studies that used direct RNAseq although detecting other methylation types, such as 5mC (Kim et al., 2020;Chang et al., 2021;Li et al., 2021;Liu et al., 2021b). Here we extended the viral epigenetic analysis to cellular RNAs of the infected Vero cells and compared to human Calu-3 cells.

Samples
Direct RNA sequencing reads from uninfected and infected Vero cells (Clone E6, ATCC ® CRL-1586 ™ ), and SARS-CoV-2 reads from Vero cell lysates and cell culture supernatants were obtained from (Kim et al., 2020;Taiaroa et al., 2020;Campos et al., 2021). Reads from uninfected, and infected Calu-3 cells, and SARS-CoV-2 reads from Calu-3 cell lysates were obtained from (Chang et al., 2021). The first set of samples used here comes from (Kim et al., 2020), and consists of reads from a sequencing experiment conducted on uninfected Vero cells and another sequencing experiment with Vero cells infected with SARS-CoV-2 (Wuhan-Hu-1 isolate GISAID ID: EPI_ISL_413016). The second set comes from (Taiaroa et al., 2020), and consists of reads from an experiment of infection of Vero cells by SARS-CoV-2 (Wuhan-Hu-1 isolate) and another from viral RNA reads from the supernatant of infected cells. The third sample set sequenced in our laboratory (Campos et al., 2021). Sequencing reads are from Vero cells infected by SARS-CoV-2 (Wuhan-Hu-1 isolate) and reads from viral RNAs of cell culture supernatants. Reads from Calu-3 cells infected by SARS-CoV-2 are from SARS-CoV-2/Australia/VIC01/2020, NCBI: MT007544.1. Sequencing data from uninfected Vero cells ("Vero 24h Control" SRA: SRR13089345) from (Chang et al., 2021) were also used. The reference sequences used were: The African Green Monkey annotated transcripts (Ensembl release 105 -Dec 2021) (Lee et al., 2014;Warren et al., 2015) and the Homo sapiens annotated transcripts (Ensembl Release 106 -Apr 2022), and the SARS-CoV-2 reference sequence (NC045512.2) as detailed below.

Methylation analysis
The "index" and "eventalign" modules implemented in Nanopolish (v-0.13.2) were used for the resquiggling step. Segmented raw signals generated in the previous step, and contained in the eventalign file, were pre-processed with 'm6anet-dataprep' and the predictions of m6A modifications in DRACH motifs were obtained via 'm6anet-run_inference', algorithms implemented in m6anet program (v-1.0.0) (Hendra et al., 2021). The program EpiNano v.1.2 Liu et al., 2021a) was also used for m6A detection, with the Epinano-SVM method (https://github.com/novoalab/EpiNano). The extraction of base-calling error features was performed with the "Epinano_Variants" module, and the m6A predictions were performed on RRACH motifs with the "Epinano_Predict " script included.

Analysis of DRACH motifs
To perform a detailed inspection on methylated DRACH motifs -checking for sequence information biasthe consensus sequences were obtained from the variation calling step. This procedure was performed with Longshot (v.0.4.1), with default settings for long reads, and establishing a minimum threshold of read depth for 30x coverage to accept the SNV occurrence (Edge and Bansal, 2019). The resulting VCF files obtained in this step were used to generate consensus sequences by the "bcftools consensus" -BCFTools (v.1.15) (Li, 2011). It was then possible to extract all methylated DRACH motifs, flanked by 5 nucleotides at each end to align and stack them using the Tidyverse package (v.1.3.1) (Wickham et al., 2019). To minimize the occurrence of m6a sites that may represent false positives in Vero cell samples (presence of methylated m6A sites with a reduced probability of predictions as a function of many transcripts/reads) and that could add noise to the analysis, the inspection of the DRACH motifs was performed using sequences containing m6A sites with ≥ 0.8 probability of modification threshold, and coverage ≥ 30x. To reduce the occurrence of false negatives in SARS-CoV-2 sequences with fewer reads, and a smaller genome size, the methylation probability threshold was set at ≥ 0.5. Because some DRACH motifs (including 5 nucleotides at each end) can be located at the coordinate ends of cellular and viral transcripts, and result in truncated sequences, the removal of these sequences was necessary. The analysis of nucleotide biases in DRACH motifs was performed with ggseqlogo package (v.0.1) (Wagih, 2017).

Functional enrichment analysis
To map transcripts against known functional information sources and detect statistically significant enriched terms, the gProfiler web server was used (https://biit.cs.ut.ee/gprofiler/gost) (Raudvere et al., 2019). Overrepresentation was evaluated in terms of Gene Ontology (GO) or through the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways (Kanehisa et al., 2019). The main GO categories analyzed were Molecular Function (MF), Biological Process (BP), Cellular Component (CC).

Statistical test for differential methylation
To compare the distributions of methylated sites per transcript in reads from uninfected and infected cells we performed the Wilcoxon-Mann-Whitney (WMW) test as implemented in R version 4.2.1 as a part of R base (Mann and Whitney, 1947;Fay and Proschan, 2010). This is a nonparametric test to compare differences between two independent groups when they are not normally distributed. Under the null hypothesis H0, the distributions of both groups are identical. The alternative hypothesis H1 is that the distributions are not identical, by detecting a significant difference in the medians. Violin plots were produced with package ggplot2 version 3.3.6. (https://www.rdocumentation. org/packages/ggplot2/versions/3.3.6/topics/ggplot).

Distribution of m6A sites in cellular transcripts
Direct RNA sequencing data from four experiments (13 datasets) were used for assembly. One sample was obtained by our group (Campos et al., 2021) and other three samples from (Kim et al., 2020;Taiaroa et al., 2020;Chang et al., 2021). The assembly statistics of these samples are shown in Table 1 In the uninfected Vero cell, 4228 m6A sites were found and these sites are distributed in 1871 transcripts and 1717 known genes. Most sites were mapped to transcripts encoded in the VCAN, AMOTL2, and DNAJB1 genes (Supplementary Table S1). The sample infected with SARS-CoV-2 (Kim et al., 2020) has 262 m6A sites mapped to 137 transcripts in 112 known genes, in which AMOTL2, TNFAIP3, PLK2 contain the highest number of modifications (Supplementary Table S2). The second infected sample (Taiaroa et al., 2020) revealed 1023 sites in 544 transcripts and 488 genes. Genes that contain the largest number of sites are CA12, ARHGAP29, and MYC (Supplementary Table S3). The sequencing of infected cells of our group showed a total of 35 m6A sites, in 24 transcripts that map to 14 known genes (Table 2) which is consistent with observed in other datasets.
The comparison of infected Vero cells from the studies by (Kim et al., 2020;Taiaroa et al., 2020), shows that they share 143 m6A sites. The MYC, PLK2, and PRSS23 genes, for example, share the same 7 methylation sites in each gene (Supplementary  Table S4). When we observed samples of infected Vero cells from (Kim et al., 2020), and the sample sequenced by our group, we found 6 m6A sites common to both, distributed over 4 transcripts, one of which mapped to the RACK1 gene (Supplementary Table S5). The samples of infected Vero cells from the study (Taiaroa et al., 2020), and from our sample, share 5 m6A sites, distributed over 4 transcripts (Supplementary Table  S6). All infected samples share only 3 m6A sites, located in 2 transcripts from unidentified genes, namely transcript IDs ENSCSAT00000015767.1 (m6A positions 758, coverage 604, and 966, coverage 787) and ENSCSAT00000011971.1 (m6A position 554, coverage 78) (Supplementary Table S7).
The alignment of reads, obtained from the infected Vero cell lysate from the (Kim et al., 2020), to the SARS-CoV-2 reference sequence, revealed 13 m6A sites, 12 of which map to ORF1ab (Supplementary Table S8). The SARS-CoV-2 epigenome from the sample sequenced by (Taiaroa et al., 2020), revealed 7 m6A sites, 4 of which are also located in ORF1ab (Supplementary Table S9). The SARS-CoV-2 epigenome from the cell lysate in our data showed the presence of 5 m6A sites, two of which located in ORF7b, and 2 located in the N gene (Supplementary  Table S10). SARS-CoV-2 epigenomes from samples from (Kim et al., 2020;Taiaroa et al., 2020), share 4 m6A sites, 3 of which are in ORF1ab (Supplementary Table S11). The epigenomes of the three samples obtained from Vero cell lysates share only one m6A site (Supplementary Table S12).
Because the sample set of (Kim et al., 2020) contained the largest number of mapped reads (Table 1)   to identify differentially methylated transcripts in the Vero cell infected with SARS-CoV-2 (Table 3). This analysis shows 55 differentially methylated sites distributed in 21 known genes. Among identified transcripts, are the Tensin 3 (TNS3), NUAK2, and METTL9 transcripts whose biological functions are relevant. Also, this same dataset allowed the identification of m6A sites unique to transcripts of the infected Vero cells (Table 4). This analysis reveals 47 sites distributed in 37 known genes with attention to the kinase NUAK2, Tensin 3 (TNS3), Ras member B (RHOB) and METTL9.

this dataset was used
The comparison of Vero cells data with Calu-3 reveals very similar pattern of transcripts and m6A methylation sites. Figure 1 outlines a set of comparisons made between Vero or Calu-3 cells, by m6A site, transcripts and known genes. Regarding the latter cell type, the Supplementary Tables S13, S14 show which sites are present in one sample or another (uninfected and infected cells), which sites are common to both (Supplementary Table S15), data from (Chang et al., 2021), or which sites are unique to each sample (Supplementary Tables S16, S17) (sample from Chang et al., 2021). Genes SDC1, PMM2, and SERPINA1 have the greatest number of unique m6A sites in the uninfected cell (Supplementary Table S17). In the infected cell, the unique sites are found mainly in the IFIT2, OASL and IFIT3 genes (40, 29 and 27 sites respectively) (Supplementary Tables S16, S17).
Quantitative analysis of differences in m6A methylation between uninfected and infected Vero cells was performed using the Wilcoxon-Mann-Whiteney test with two independent uninfected datasets and two independent infected datasets: (1) Kim et al., 2020 (Uninfected Vero) and Chang et al., 2021 (Uninfected Vero 24h); (2) Kim et al., 2020 (Infected Vero) and Taiaroa et al., 2020 (Infected Vero). For each dataset, the differentially methylated transcripts were filtered by two criteria: (a) there were more than 30 reads supporting the methylated site, and (b) the probability of modification being true was higher than 0.99. The number of methylated sites per transcript type was computed considering the quantity S=log (methylated sie/transcript), the (base 10) logarithm of the proportion of methylated sites per transcript, for all transcript types in each dataset.
The 4 pairwise comparisons with the WMW test results are shown as violin plots in Figure 2 for m6A sites detected by program m6anet. All comparisons show higher methylation levels in infected cells with p<0.05. Equivalent reults are obtained for the same analysis for m6A sites detected by program EpiNano as shown in Figure 3. These results are summarised in Table 5. As a control for the method, the WMW test was performed for the comparison of the two uninfected datasets, Kim et al., 2020 (Uninfected Vero) and Chang et al., 2021 (Uninfected Vero 24h), to show that they are statistically equivalent (Table 5 and Supplementary Figure S1). All comparisons between uninfected and infected samples have p<0.05 while between uninfected samples p>0.05. The lower bounds of the confidence intervals are always >0 in all uninfected versus infected comparisons while in the uninfected versus uninfected control the lower bound is <0. The results obtained with m6anet and EpiNano are equivalent, showing that irrespective of the m6A detection method the increase in m6A methylation is consistently observed (Table 5 and Figures 2, 3).
The analysis of the epitranscriptome of Vero cells as inferred using the EpiNano program using data from (Kim et al., 2020;Taiaroa et al., 2020;Chang et al., 2021) yield resuts equivalent to obtained with m6anet (Supplementary Tables S18-S22). However, m6anet detects m6A in DRACH motifs whereas   EpiNano detects RRACH, therefore yielding a slightly smaller number of inferred m6A sites due to the reduced degeneracy in the first position of the motif.
The SARS-CoV-2 m6A epigenome The epigenome of SARS-CoV-2 from infected Vero cell culture supernatants of (Taiaroa et al., 2020), shows a total of 30 methylated sites, 9 of which are found in the S gene, 4 in the ORF3a, 1 in the M gene, 2 in ORF6, 2 in ORF7a, 2 in ORF7b, and 7 in the N gene (Supplementary Table S23). The SARS-CoV-2 epigenome of (Campos et al., 2021) revealed the presence of 20 m6A sites, of which 3 are in the S gene, 4 in the ORF3a, one in the M gene, one in the ORF7a, 2 in the ORF7b, and 6 in the N gene (Supplementary Table S24). The samples obtained from the supernatants do not share m6A sites. The SARS-CoV-2 epigenome from infected Calu-3 cells reveals 11 m6A sites in ORF1ab (Supplementary Table S25).

Discussion
In the present study we tested the hypothesis that the infection of Vero cells by SARS-CoV-2 affects the m6A methylation patterns of cellular transcripts. For this, the transcriptome of the infected cell was sequenced using the Nanopore direct RNA sequencing method (Oxford Nanopore Technologies, Oxford Science Park, Oxford, UK). Datasets from four studies were compared. One from our group (Campos et al., 2021) and three others from (Kim et al., 2020;Taiaroa et al., 2020;Chang et al., 2021).
Statistical analysis of data here presented revealed that the m6A methylation of cellular RNAs is significantly higher in infected cells as compared to uninfected cells (Table 5, Figures 2,  3). This finding is supported by two different m6A detection programs (m6anet and EpiNano) Liu et al., 2021a;Hendra et al., 2021). The p-values of unifnfected versus infected sample comprarions are always <0.05 and the lower bound of confidence intervals are always >0 which indicates the  rubustness of the inference (Table 5). The functional enrichment analysis of these datasets revealed an increased methylation of genes involved in translation, peptide and amine metabolism which is consistent with a scenario in which the viral infection reduces the general translational activity of the cell, activation of stress-induced signaling pathways, and employing viral proteins that affect cellular translation and RNA stability to direct the translational machinery towards the synthesis of its own proteins (Nakagawa et al., 2016). The m6A methylation of transcripts involved in the general cellular translation function is consistent with observations that m6A methylation in coding domains slows down translation elongation because m6A leads to ribosome pausing in a codon-specific manner (Choi et al., 2016). However, recent studies have shown multiple roles of m6A in regulating translation and both positive and negative effects of this epitranscriptomic signal on protein synthesis have been reported (Mao et al., 2019). Methylation at different mRNA regions may have distinct functions, therefore it is important to elucidate the local effects of m6A on translation. Here we provided initial data on the general patterns of m6A in cellular transcripts and further studies are necessary to determine the local effects of m6A in individual transcripts. The quantitave analysis, via WMW test, shows that the global m6A methylation is higher in infected Vero cells as Summary of number of m6A sites, number of transcripts and number of known genes in host cell epitranscriptome data here analysed. The Venn diagrams show the number of m6A sites in uninfected and infected Vero cells (derived from African Green Monkeys) (Kim et al., 2020) (A), and between uninfected and infected Calu-3 cells (human derived) (Chang et al., 2021) (B). In each panel, the number of m6A sites, transcripts and known genes are displayed for each host cell type. compared to uninfected cells ( Figure 2). Also, results obtained using two different m6A detection programs, m6anet and EpiNano, yield equivalent results (Figures 2, 3). As discussed above it still remains to future work to determine if this global higher m6A methylation is inhibiting or enhancing the translatability of cellular mRNAs. The qualitative analysis, as discussed below, suggests that transcripts of genes involved in translation, peptide and amine metabolism are differentially m6A methylated upon SARS-CoV-2 infection. The analysis here presented allowed the identification of differentially methylated transcripts and m6A unique sites in the infected cell transcripts (Tables 3, 4) and confirms the general m6A pattern observed with miCLIP and RIP-seq techniques (Liu et al., 2021b). However, it must be noted that this study (Liu FIGURE 3 Violin plots of the distributions of differentially methylated transcripts between Uninfected and Infected Vero cell datasets of using program EpiNano. The areas indicate the data distribution of each sample and the horizontal bars in the middle of the areas indicate the Medians. The Effect size and p-values are as obtained by the Wilcoxon-Mann-Whitney test as described in Material and Methods. The abscissas idicate the samples and the ordinates indicate the quantity S=log(methylated sites/transcript), which is the logarithm with base 10 of the proportion of methylated sites per transcript, for all transcript types in each dataset.  (Kim et al., 2020), Taiaroa (Taiaroa et al., 2020) and Chang (Chang et al., 2021). et al., 2021b) used RIP-seq which do not have a 1 nucleotide resolution and miCLIP, athough claiming a 1 nucleotide resolution depends on antibody crosslink and cDNA sequencing. Among the three datasets here analyzed we decided to use the sample set of (Kim et al., 2020) for these analyses because this dataset contains the largest number of mapped reads (Table 1). This dataset allowed the identification of differentially methylated transcripts in the SARS-CoV-2 infected Vero cells (Table 3). This analysis revealed that at least 55 sites, distributed in 21 known genes, are differentially methylated. The majority of transcripts show a reduced m6A methylation upon infection such as TMED2, while a few show increased methylation, such as the proto-oncogene JUNB, a key transcriptional modulator of macrophage activation (Fontana et al., 2015) and the immediate early response IER3, involved in cellular stress response and inflammation (Arlt and Schäfer, 2011). Other interesting transcripts revealed in this analysis are Tensin 3 (TNS3), NUAK2, and METTL9 (detailed below). Also, this same dataset allowed the identification of m6A sites unique to infected Vero cell transcripts (Table 4). This analysis revealed 47 sites distributed in 37 known genes with attention to the kinase NUAK2, a critical target in liver cancer (Yuan et al., 2018), Tensin 3 (TNS3), a SH2 domain protein that contributes to tumorigenesis and metastasis (Qian et al., 2009), Ras member B homolog (RHOB), a member of the Rho GTP-binding protein family (Wennerberg and Der, 2004) and METTL9, a methyltransferase that mediates pervasive 1-methylhistidine modification in mammalian proteomes (Davydova et al., 2021).
The transcriptome-wide analysis shows very strong nucleotide biases in DRACH motifs of cellular transcripts, which use the signature GGACU, both in Vero cells and Calu-3 cells (Figures 4A-D (from 1 to 5) and 3' (from 11 to 15). The ordinates indicate the score in Bits as it deviates from the null hypothesis, in other words the stronger the bias, the higher the score. A and C have 100% frequency in positions 8 and 9 of all DRACH motifs analyzed whereas positions with zero or near zero indicate that the four canonical bases are at equilibrium frequency f (N) =0.25 in the same sampling space.
is GAACU (Figures 4E-I, L). In Influenza virus it has been shown that the DRACH motif biases are much less significant, and the Influenza virus signature is AAACN with frequencies A=0.50, G-0.25 and U=0.25 in the first position, A=G=0.50 in the second position and A=33.3, C=33.3, U=0.25 and G=0.83 in the fifth position (Bayoumi and Munir, 2021). In positions 3 and 4 respectively, A and C are 100%. This is substantially different from what we found in SARS-CoV-2 ( Figures 4E-I, L). Moreover, the Influenza virus study was based on cDNA sequences while our observations are based on direct RNA sequencing. Our data show that the sequence preference for methylation in the viral genome is different from the cellular transcripts. This is consistent with the fact that SARS-CoV-2 is a recent primate pathogen. We hypothesize that this virus might be undergoing an adaptive process that would result in the adjustment of its m6A methylation pattern to match those of the cellular transcripts because both use host encoded writer, reader and eraser enzymes (Li et al., 2021). It is important to note that the direct RNA sequencing has been validated by orthogonal methods to identify modified bases as revealed by the comparison with liquid chromatography-tandem mass spectrometry and methylated RNA immunoprecipitation sequencing (MeRIP-seq) (Li et al., 2021). Therefore, results obtained with direct RNA sequencing, and the downstream bioinformatic pipelines, readily identify modified bases, particularly methylated modifications confirmed by the above-mentioned techniques.
Functional enrichment analysis is a set of statistical methods to extract biological information from omics data in terms of Functional enrichment analysis of the infected Vero cell m6A epitranscriptome (dataset from Campos et al., 2021). A total of 24 transcripts common to infected cells was used in enrichment analysis (Table 1), with Gene Ontology and KEGG biological pathways as data sources for overrepresentation. The analysis was performed with default gProfiler web server options, with g:SCS algorithm for computing multiple testing correction for p-values. Terms are grouped by data sources (Gene Ontology classifications or KEGG biological pathways). The GO categories are in the left columns, green bars indicate the -log 10 of the p-value, blue and black squares indicate significant positive hits of the transcript IDs (vertical top columns) with GO categories. GO : MF, Molecular Function; GO : BP, Biological Process; GO : CC, Cellular Component and KEGG, Kyoto Encyclopedia of Genes and Genomes. functional categories. These methods are widely used for the analysis of gene and protein lists and regulatory elements (Garcia-Moreno et al., 2022). Taken together our results on functional enrichment analysis ( Figure 5 and Supplementary  Figures S2, S3), differential methylation (Table 3) and unique m6A in infected cells (Table 4) indicate that the cell response to viral infection not only changes the levels of mRNAs, as previously shown (Wyler et al., 2021), but also its epitranscriptional pattern.
Here, the epitranscriptomics of the Vero cell was studied because of its widespread use for Coronavirus isolation and propagation in vitro. This cell line is derived from the African Green Monkey, or vervet (Chlorocebus sabaeus) and therefore is a model, or an approximation, for the human infection pattern (Jasinska et al., 2013;Warren et al., 2015). Although the genomes of great apes, including humans, differ from monkeys by 7% it must be noted that in genomes on the order of 7 billion bases (diploid genomes) 93% identity means a difference of 490 million substitutions (Rhesus Macaque Genome Sequencing and Analysis Consortium et al., 2007). Therefore, results obtained from non-human primates must be taken in perspective and cannot be extrapolated in limine for humans, based only on a superficial notion of similarity (Jasinska et al., 2013;Woolsey et al., 2021). Consequently, future experiments on the epitranscriptome of human cell lines infected with SARS-CoV-2 are essential for a proper understanding of the human cellular response in the context of SARS-CoV-2 infection.

Data availability statement
The data presented in this study are deposited in the National Center for Biotechnology Information (NCBI) as BioProject ID: PRJNA861323, BioSamples: SAMN29911639 and SAMN29911623.