Transcriptomic Signature Differences Between SARS-CoV-2 and Influenza Virus Infected Patients

The reason why most individuals with COVID-19 have relatively limited symptoms while other develop respiratory distress with life-threatening complications remains unknown. Increasing evidence suggests that COVID-19 associated adverse outcomes mainly rely on dysregulated immunity. Here, we compared transcriptomic profiles of blood cells from 103 patients with different severity levels of COVID-19 with that of 27 healthy and 22 influenza-infected individuals. Data provided a complete overview of SARS-CoV-2-induced immune signature, including a dramatic defect in IFN responses, a reduction of toxicity-related molecules in NK cells, an increased degranulation of neutrophils, a dysregulation of T cells, a dramatic increase in B cell function and immunoglobulin production, as well as an important over-expression of genes involved in metabolism and cell cycle in patients infected with SARS-CoV-2 compared to those infected with influenza viruses. These features also differed according to COVID-19 severity. Overall and specific gene expression patterns across groups can be visualized on an interactive website (https://bix.unil.ch/covid/). Collectively, these transcriptomic host responses to SARS-CoV-2 infection are discussed in the context of current studies, thereby improving our understanding of COVID-19 pathogenesis and shaping the severity level of COVID-19.


INTRODUCTION
Coronaviruses (CoV) are enveloped single-stranded positive-sense RNA viruses surrounded by spike glycoproteins shaping the typical "corona-like" appearance (1). To date, seven strains of human CoV have been identified. Four of them (HCoV-229E, HCoV-OC43, HCoV-NL63, HKU1) circulate in the population and cause only mild upper-respiratory tract infections in immunocompetent individuals (2). In the last two decades, three highly pathogenic viruses acquired by zoonotic transmission caused outbreaks of severe pneumonia. Severe acute respiratory virus (SARS)-related CoV 1 infected~8000 individuals in 2002-3, with a fatality rate of 10%; the Middle-East respiratory syndrome CoV (MERS-CoV) infected~8000 individuals since 2012, with a~36% fatality rate; the new coronavirus SARS-CoV-2 emerged in the province of Wuhan (China) in the end of 2019, infecting more than 100,200,000 individuals worldwide and killing over 2'150'000 as of January 2021.
While the immune response to SARS-CoV- 2 has not yet been fully characterized, it is likely to engage immune mechanisms similar to those previously described for SARS-CoV and MERS-CoV, and, more generally, other RNA viruses (3,4). Viral single stranded and double-stranded RNA are recognized by at least 3 families of pattern recognition receptors (PRR), including extracellular and endosomal Toll-like receptors (TLR3/4/7/8), cytoplasmic retinoic acid-inducible gene I-like receptors (RIG-I/ MDA5), and the cytosolic RNA-activated protein kinase (PKR). Signal transduction by PRR subsequently leads to the induction of transcription factors such as interferon regulatory factors (IRFs) and nuclear factor k B (NF-k B), thereby inducing the synthesis and secretion of pro-inflammatory cytokines such as Type I and III interferons (IFNs), as well as the production of chemokines inducing adaptive immunity. In turn, both type I and III IFNs induce the expression of interferon stimulated genes (ISGs), which restrict and limit viral spread and stimulate the adaptive immune responses (5,6), resulting in the generation of viral peptide-specific T cells (7,8) and the production of viralspecific antibodies (9)(10)(11).
Increasing evidences suggest that SARS-CoV-2 induces specific response mechanisms, probably distinct from those triggered by other viruses, that can lead to major immune dysregulation (12). However, only few clinical studies have investigated the comparison of immune activation responses during COVID-19 to other viral infections. To better understand the specific features of the immune response to COVID-19, we compared the transcriptional profiles of patients infected by SARS-CoV-2 across different disease severity to that of patients infected by Influenza A or B viruses. Altogether, our results suggest that severe COVID-19 presentation results from a defect in or escape from innate immunity associated with an unbalanced adaptive immune response.

Patients
This prospective observational study of SARS-CoV2 and Influenza viruses-infected patients was conducted in the emergency department, internal medical ward and in the intensive care unit of Lausanne University Hospital (CHUV), a tertiary care center in Switzerland. Adult patients were included in this study if COVID-19 or Influenza were confirmed by real time polymerase chain reaction from a nasopharyngeal swab. COVID-19 patients were included between February 6 th 2020 and April 3 rd 2020. Patients with Influenza were included between January 21 st 2015 and March 17th 2020. Healthy volunteers had a negative COVID-19 serology and were included in June 2020. Patients and healthy volunteers included in this study signed an informed consent form for genetic and functional testing, according to protocols approved by the Cantonal Ethics Committee of the state of Vaud (CER-VD 479/13, CER-VD 2019-02283, CER-VD 2020-01108). Samples were stored within a dedicated biobank fulfilling quality standards according to the Swiss Biobanking Platform criteria ("Vita label", certificate CHUV_2004_3).
Patients demographics, comorbidities, symptoms, vital signs and laboratory results performed during routine care were recorded using a standardized electronic case report form in REDCap (Research Electronic Data Capture) or Secutrial. Bedside clinical scores to identify patients at risk of poor outcome were calculated at first assessment in the emergency department: (i) Quick Sequential Organ Failure Assessment (qSOFA): one point each for systolic hypotension (≤100 mmHg), tachypnea (≥22/min) or altered mentation [Glasgow coma score ≤14 (13)], CRB-65: one point each for Confusion (Glasgow coma score ≤14), elevated Respiratory rate (≥30/min), low blood pressure (systolic <90 mmHg or diastolic ≤60 mmHg), age ≥65 years (14). Clinical outcomes were assessed by checking the electronic health record and by calling patients. COVID-19 patients were classified into three groups according to disease severity: (i) outpatients and/or inpatients (e.g. admitted for a motif other than COVID-19) without oxygen requirement (OXY0); (ii) inpatient with oxygen requirement without intubation (OXY1) and (iii) intubation and/or respiratoryrelated death (TUBE).

RNA Extraction and Library Preparation
Total RNA from whole blood was isolated using Blood RNA tubes (BD) for blood collection. RNA integrity was assessed with a Bioanalyzer (Agilent Technologies). The TruSeq mRNA stranded kit from Illumina was used for the library preparation with 400 ng of total RNA as input. Library molarity and quality were assessed with the Qubit and Tapestation using a DNA High sensitivity chip (Agilent Technologies). Libraries were pooled at 2 nM and loaded for clustering on several lanes of a Single-read Illumina Flow cell to reach an average of 30 Million of reads per library. Reads of 100 bases were generated using the TruSeq SBS chemistry on an Illumina HiSeq 4000 sequencer.

Differential Expression Analyses
Differential gene expression analyses were performed in R [using DESeq2_1.26.0 built-in functions (17)]. Overall similarity between samples was assessed by first applying a variance stabilizing transformation (VST) to the gene-level count matrices using the vst function taking into account the experimental design (blind=FALSE), and then performing a principal components analysis (PCA) on the regularized matrix using the prcomp function on the 10% most varying genes. To identify differentially expressed genes (DEGs), separate paired analyses between groups were computed using the function DESeq with default parameters. Log2 fold changes were moderated with the lfcShrink function using the apeglm (18) shrinkage estimator and differentially expressed genes were selected based on their multiple testing corrected adjusted P value (<0.01).

KEGG Pathway Enrichment
DOSE version 3.4.0 (19) and clusterProfiler version 3.6.0 (20) were used to identify pathways with a significant enrichment of DEGs. Analyses were performed separately for each comparison using the gseKEGG function with genes ranked by sign (log2FoldChange) * -log10 (pval). Pathways with an adjusted P-value <= 0.05 were considered significantly enriched.

Gene Set Enrichment Analyses of Functional Groups
Gene set enrichment analyses (GSEA) were performed in R using fgsea version 1.12.0 built in functions (21). Normalized enrichment scores and adjusted P-values were computed separately for each comparison using the fgsea function with genes ranked by sign(log2FoldChange) * -log10(pval), the manually curated gene sets reported in Supplementary Tables 2-3 and the number of permutations equal to the number of genes in the ranked list (16,670). Gene sets with an adjusted P-value <= 0.05 were considered significantly enriched.

Characteristics of Patients Infected With SARS-CoV-2 and Influenza Viruses
Blood gene-expression was measured in 103 patients infected with SARS-CoV-2 (collectively named "COVID-19") and compared to that of 22 patients infected with Influenza A or B (INFL) as well as 27 non-infected, healthy individuals (HLTY) COVID-19 patients were stratified according to the level of respiratory failure; 23 did not require oxygen support ("OXY0"), 40 received oxygen but no mechanical ventilation ("OXY1") and 40 required mechanical ventilation ("TUBE"). In the latter group, 15 patients were sampled within the first 7 days in hospital ("TUBE early") while 25 were sampled later ("TUBE late").
Patients and controls characteristics are shown in Supplementary  Table 1. The duration of symptoms before admission was similar in all disease groups. Baseline characteristics and comorbidities were also similar among all COVID-19 patients, except for an over-representation of males among the TUBE late group. As expected, baseline symptoms (fever, cough and dyspnea), signs (temperature, systolic blood pressure and respiratory rate) and severity scores (qSOFA and CURB65) significantly worsened from OXY0 to OXY1 and TUBE patients. INFL patients were significantly older than COVID-19 patients (mean age 73 versus mean ages ranging from 54 in OXY0 to 64 in TUBE early), but had a similar pattern of comorbidities, except for hypertension, dyslipidemia and gastrointestinal diseases, which were significantly less frequent in INFL than OXY1. Their qSOFA score was comparable to that of OXY0 and OXY1 while the CURB65 score and hospitalization rate was comparable to those of TUBE patients. INFL were significantly more likely to have fever, cough, fatigue, leukocytosis and thrombocytopenia at baseline than OXY0. In contrast, they were less likely to have hypotension, dyspnea, tachypnea, leukocytosis and radiological infiltrates at baseline; and death at 30 days than OXY1 and TUBE early.

Overall Transcriptomic Profile of Patients Infected With SARS-CoV-2 and Influenza Viruses
Total RNA libraries produced on average 34.1 million reads (standard deviation, SD=4.7), among which an average of 30.6 million (SD=4.6) were mapped against the 228,048 transcripts of Gencode v34 and aggregated at the gene level using Salmon v1.3.0. After removal of hemoglobin genes, all samples had produced at least 10 millions of usable reads, except samples from subjects TUBE early.01 and TUBE late.05, which had 9.9 and 9.8 million reads, respectively (Supplementary Figure 1) Figure 1). Sex was confirmed by counting the number of reads mapping on the male and female specific genes RPS4Y1 (ENSG00000129824.16) and XIST (ENSG00000229807.12, Supplementary Figure 1). The 16,670 genes with at least 1 cpm in at least 15 samples (e.g. the size of the smallest group) were kept for further analysis.
A total of 2,839, 3,821, 5,252 genes had higher expression (adjusted P<= 0.01), and a total of 2,164, 3,305 and 3,956 genes had lower expression, among OXY0, OXY1 and TUBE early, respectively, compared to HLTY ( Figure 1A). A total of 2,527, 1,832, and 1,135 genes had higher expression, and a total of 2,404, 1,848 and 737 genes had lower expression, among OXY0, OXY1 and TUBE, respectively, compared to INFL ( Figure 1B). The number of upregulated genes compared to both reference groups (HLTY and INFL) are 1,438, 1,001 and 636 in OXY0, OXY1 and TUBE early, respectively ( Figure 1C), whereas the number of down-regulated genes compared to both reference groups are 1,037, 835 and 318 ( Figure 1C). The two first principal components of the 10% most varying gene expression across all samples explained 47.6%, 46.1% and 51.1% of variance, respectively, when comparing OXY0, OXY1 and TUBE early groups with HTLY and INFL ( Figure 1D).
An unsupervised analysis was performed by using gene pathways sets from the Kyoto Encyclopedia of Genes and Genomes (22,23) (KEGG, Supplementary Figure 2  [hsa04110]) and "metabolism" (e.g. oxidative phosphorylation [hsa00190]), sharing a number of core-enriched genes, were upregulated in COVID-19 compared to INFL, with a gradient from less to more severe COVID-19.

Immunological Transcriptomic Features of Patients Infected With SARS-CoV-2 and Influenza Viruses
To translate the impact of differentially expressed genes (DEGs) in terms of biological responses, we systematically analyzed sets of genes matching specific immune processes ( Figure 2). Representative sets available from KEGG (22), Gene Ontology [GO (24,25)], Reactome (26) or Human Genome Organization (HUGO) Gene Nomenclature Committee [HGNC (27)] were selected (Supplementary Table 2) and shown by heatmaps and/ or volcano plots, with specific genes data detailed on box-plots. For the sake of simplicity, expression patterns across the different groups were regarded as "ascending" or "descending", when expression levels progressively increased, or decreased, respectively, from HLTY to OXY0, OXY1, TUBE early, and IFNL (Supplementary Figure 3 first and second rows). Patterns were qualified to have a "hill" or "valley" shape, when highest and lowest expression levels were seen in COVID-19 (OXY1 or TUBE early), and the lowest and highest in the other (HLTY and INFL), respectively (Supplementary Figure 3 third and fourth rows). Because the TUBE late group often represented a convalescent status, their gene expression levels were ignored in this pattern description. When considering innate immune pathways, we first analyzed sets of genes encoding PRR and their signaling effectors, ISGs as well as cytokines/chemokines. Most genes involved in antiviral signaling were over-expressed both in TUBE early and INFL patients compared to HLTY (Supplementary Figure 4; pathways involved in the detection of Influenza A and other RNA viruses), but gene expression was significantly lower among COVID-19 than among INFL ( Figure 3). Most striking was the under-expression of genes involved in viral recognition (e.g. TLR7, UNC93B, RIG-I/DDX58,    (e.g. STAT2, IRF9, SOCS1, MX1, OAS2, TRIM6, Figure 4). Most of these genes had expression levels increasing from less to more severe COVID-19 ("ascending" pattern). Gene expression levels for cytokines and chemokines could not always be assessed (Supplementary Figure 5); when measurable, their expression levels and that of their receptors tended to be lower among COVID-19 compared to INFL, although the pattern was not uniform and depended on structural groups. In particular, genes coding for cytokines from the IL-4-like and IL-1-like subfamilies had lower expression levels among OXY0 and/or OXY1, whereas those belonging to the TGF-B family tended to have higher expression levels among OXY0, compared to INFL.
When considering innate immune cells, we first observed that genes involved in NK cell functions had an overall lower expression levels in TUBE early compared to HLTY (Supplementary Figure 6 right panel) and in each COVID-19 group compared to INFL ( Figure 5 and Supplementary Figure 6 left panel). Some genes important for NK cell maturation were significantly under-expressed, with a "valley" pattern (e.g. Killer-cell inhibitory-receptors such as KIR2DL1 or KIR3DL2, as well as TBX21) or an "ascending" pattern (FCGR3B, Figure 5). Genes important for NK cell cytotoxicity also harbored a "valley" (perforin/PRF1, granulysin/GNLY, KSP37/ FGFBP2) or "ascending" pattern (CD107a/LAMP1). In contrast, NKG2A/KLRC1 and LAG3 were not overexpressed and CD94/ KLRD1 was even under-expressed among TUBE early compared to both HTLY (Supplementary Figure 6 left panel) and INFL ( Figure 5). Genes involved in macrophage and neutrophils were over-expressed in patients infected with viruses (COVID-19 and INFL) compared to HLTY ("ascending" pattern, e.g. CLEC4D and CD55, Figure 6). Genes involved in neutrophil degranulation (e.g. GGH) and/or other neutrophil functions (S100A8/9 and S100A12) had a peak of expression in the TUBE early group ("hill" pattern, Figure 6).
Adaptive immune pathways were notable for genes involved in antigen presentation through MHCI with proteasome, MHCII with endosome, T helper differentiation, proliferation/maturation/ function of T and B cells. Genes involved in antigen presentation through the MHC I had higher expression levels among virusinfected patients (TUBE early and INFL) compared to HLTY, with an "ascending" pattern (e.g. HLA-A, HLA-B, TAP-1, TAP-2, Figure 7 and Supplementary Figure 7). In contrast, genes involved in antigen presentation through MHC II presented a "valley" pattern, with under-expression among TUBE early compared to HLTY and COVID-19 compared to INFL, which strikingly predominated in TUBE early (e.g. HLA-DRA, HLA-DRB1, HLA-DQA1, CD74/Ii/CLIP/SLIP, Figure 7).
When focusing on T cell subclasses, some markers of memory were over-expressed (e.g. SELL/CD62L Figure Figures 10-11). Finally, exhaustion genes such as PDCD1/PD-1 or regulatory genes such as FOXP3 were under-expressed among COVID-19 compared to INFL or HTLY, mainly among TUBE early patients (Supplementary Figure 8).
Most genes involved in the complement cascade were overexpressed in TUBE early and INFL compared to HTLY (Supplementary Figure 14 right panel) but their expression was usually lower among COVID-19 compared to INFL ("ascending" pattern, e.g. C1QC, C2, C5), with the exception of C3, that, was over-expressed in TUBE early compared to INFL patients ( Figure 10 and Supplementary Figure 14 left panel). Genes encoding blood groups, some of which are part of the complement cascade (e.g. C4A, C4B) or are involved in complement activation (e.g. CR1, CD55, ABO) or regulation (e.g.CD59, GYPA, GYPB) were overexpressed solely among TUBE patients (e.g. ABO, GYPA, XK, Figure 10 and Supplementary Figure 15). Overall, there was no differential expression for genes involved in the coagulation cascade, except for a decreased expression among OXY0 compared to INFL.

Transcriptomics Signatures Associated With SARS-CoV-2 Infection and Its Severity
Finally, we investigated whether differential expressed gene signatures may be used to characterize SARS-CoV-2 infected patients (compared to the other) or severe versus non-severe presentation of COVID-19 ( Table 1). We first identified genes significantly differentially expressed in all acute COVID-19 groups (OXY0, OXY1 and TUBE early) compared to INFL, and, to be even more stringent, to HLTY as well (Supplementary Table 4). A total of 209 and 6 genes were significantly over-and under-expressed, respectively, in all comparisons (adjusted P <=0.01). Among the most significantly over-expressed genes, a large number encoded immunoglobulins (N=67, e.g. JCHAIN, adj P=3.5E-09 for TUBE versus INFL) or were related to cell cycle (N=44, e.g. CDC6, adj P=6.3E-07 for the same comparison). The most significantly down-regulated gene was

DISCUSSION
In this study, we describe major differences in the blood transcriptomic profiles of COVID-19 patients compared to subjects affected with influenza and between COVID-19 of different level of severity. One of the most striking features was the low expression of ISGs in COVID-19 compared to INFL patients. ISGs represent a group of genes transcriptionally activated by IFN signaling, which is essential for both innate and adaptive antiviral immunity against a wide range of pathogens (28,29). ISGs restrict viral replication and spread by inhibiting key steps of viral life cycle (30,31). The downregulation of ISGs is consistent with numerous immune escape mechanisms developed by CoVs to enhance their replication and/ or survival in the host (32,33 (45)]. The under-expression of genes involved in viral detection (such as TLRS or RLRs) observed in this study may be due to additional mechanisms, including genetic variation in the host (46) or virus-dependent transcriptional regulation. Furthermore, conditions associated with a severe presentation of SARS-CoV-2, such as advanced age, diabetes and cancer (47) are characterized by an impairment in type I and III interferon responses (48). While ISGs expression levels were lower in COVID-19 compared to INFL patients, expression levels of IFNs themselves, were not detectable in the peripheral blood, similarly to other transcriptomic studies (49). In a cellular model of SARS-CoV-2 infection, the production of type I and III IFNs was relatively low at low multiplicity of infection (MOI), but not at higher MOI, suggesting that IFN induction is initially limited, but can increase once a high level of viral replication is achieved (50). Conversely, in an animal model of SARS-CoV, robust viral replication was associated with delayed IFN type I responses, which subsequently leads to increased inflammation and lung damage (51). In cellular models, MERS-CoV induced a delayed and attenuated IFN type I response compared to those induced by human coronavirus 229E (HCoV-229E) (52). Similarly, SARS-CoV-2 patients requiring invasive mechanical ventilation presented the highest amount of IFNa at days 8 to 10 of symptom onset (53). Recent studies have also suggested that antibodies directed to IFN may be responsible for delayed antiviral immunity (54). Single-cell RNA sequencing revealed a suppression of IFN signaling among COVID-19 patients compared to Influenza patients (55). In addition, the same technique highlighted the key role of type I IFN in exacerbating inflammation in severe COVID-19 patient compared to what observed in healthy donors and in severe influenza patients (56). Altogether, these observations indicate that unfavorable outcomes in COVID-19 may result from the delayed or impaired production of type I and III IFNs, or their subsequent inhibition by antibodies, with compromised virus control and prolonged activation of inflammatory cytokines. This is in agreement with previously reported data from Galani et al. (57).
Natural killer (NK) cells and effector T cells both target infected cells exposing viral peptides through the MHC I. As reported for the SARS (58), the amount of both NK cells and T cells was highly decreased in SARS-CoV-2 patients (59)(60)(61). In addition, the frequencies of NK cells expressing CD16 and/or KIRs were reduced in the blood of patients infected with SARS-CoV-2 (61). Consistent with this observation, we found reduced expression of CD16, KIRs and TBX21 in SARS-CoV-2 infected patients compared to INFL patients. These data suggest that the former patients harbor immature NK cells, which may not be able to migrate towards infected tissues. Furthermore, we observed that SARS-CoV-2 infected patients had lower expression of genes involved in NK cell cytotoxicity such as perforin (PRF1), granulysin B (GNLY) and CD107a (LAMP1), an observation which is consistent with studies analyzing intracellular expression patterns of NK cells from COVID-19 patients (59,62). Immune checkpoints such as NKG2A/KLRC1, a co-receptor of CD94/KLRD1 which interacts with HLA-E, as well as inhibitory receptors such as LAG3 were upregulated in SARS-CoV-2 infected patients compared to healthy individuals (59). In our study, NKG2A/KLRC1 and LAG3 were not overexpressed and CD94/KLRD1 was even under-expressed among TUBE EARLY patients compared to both HTLY and INFL, suggesting that the level of inhibition may depend on the severity level of the disease. Genes involved in neutrophil degranulation and/or other neutrophil functions had increased expression levels in patients with severe manifestations of COVID-19, such as S100A8/9 and S100A12, encoding calprotectin and calgranulin, respectively, as recently reported (63). However, such responses may be a general feature of severe lung infection rather than a COVID-19-specific feature (64,65).
Another relevant feature of transcriptional profiles observed in this study is the low expression of MHC I and MHC II encoding genes among patients infected with SARS-CoV-2 compared to INFL patients. Down-regulation of genes involved in antigen presentation through both the MHC I and MHC II have been reported in lung epithelial cells infected with MERS-CoV, but not with SARS-CoV-1, highlighting important differences in viral escape mechanisms among different CoVs (66). Inhibition of both MHC I [reviewed in (67)] and MHC II [reviewed in (68,69)] represents well-known mechanisms developed by viruses to escape immune response. Specifically, Herpesviridae can prevent MHC I mediated antigen presentation by interfering with the generation of antigenic peptides by the proteasome (EBV), preventing their transport across the endoplasmic reticulum towards the peptide-loading complex (HSV, CMV) and/or by avoiding adequate presentation of this complex on cell surfaces [CMV, KHSV, reviewed in (67)]. Interestingly, our study shows that both transporters associated with antigen processing (TAP) 1 and 2 had lower expression levels in SARS-CoV-2 patients compared to patients with influenza, revealing a potential immune escape mechanism for SARS-CoV-2. In addition, it was recently proposed that the protein encoded by SARS-CoV-2 open reading frame (ORF) 8 can directly interact with MHC I molecules and significantly down-regulate their surface expression on various cell types (70). Since ORF8 is among the viral sequences having the less homology among CoVs, differences in ORF8 may explain differential capability among CoVs to inhibit antigen presentation through MHC I (69). Altogether, our results suggest that SARS-CoV-2 restrains antigen presentation and T cell mediated immune responses.
Activation of CD4+ T cells by interactions with peptides bound to MHC II is a crucial step in clearance of most pathogens. Many viruses have developed ways of blocking antigen presentation, although fewer mechanisms or viral interference have been described for MHC II compared to MHC I (69). Specifically, viruses can target MHC class II transactivator (CIITA), a key molecule in the control of MHC II proteins transcription (EBV, KHSV, HIV), the invariant chain protein (Ii/CD74), which coassembles with MHC II ab heterodimers in the endosome (CMV, EBV), its associated peptide (CLIP; HCV), as well as MHC II proteins themselves, taken either alone (CMV, EBV) or during their interaction with the T cell receptor [EBV; reviewed in (69)]. Accordingly, we observed that genes encoding proteins involved in MHC II activation (including CTSB, Ii/CD74 and CLIP) had low expression levels in patients infected with SARS-CoV-2 compared to INFL patients, revealing another potential immune escape mechanism for SARS-CoV-2.
T cells play a crucial role in antiviral immunity and were reported to be decreased in SARS-CoV infection (59)(60)(61) [reviewed in (71)] but their phenotypical and functional alterations due to COVID-19 are poorly characterized. While CD4 T helper cells contribute to B cell activation and subsequent antibody production, CD8 T cells kill infected cells and reduce viral burden. We observed reduced expression of CD4 in all severity types of COVID-19 patients compared to HLTY and INFL patients; in contrast, CD8 was under-expressed solely among TUBE early patients compared to HTLY, with levels similar to those of INFL patients. Altogether, these data suggest a more specific burden of SARS-CoV-2 on CD4 T cells than over CD8 T cells. Persistence of high expression levels of CD8 among OXY0 and OXY1 patients may result from the increased expression of IRF4 which is known to enhances CD8 expression (72), and that showed an expression pattern contrasting that of the other ISGs. Conversely, reduced expression of CD8 T cells in TUBE early patients may result from CD8 T cell infiltration of the lungs, as reported by others (73).
When analyzing the differential expression of memory markers in our dataset, we were not able to characterize specific T cell subsets, a finding that may be a consequence of the experimental design of the study (as a whole blood vs. single cells were considered) or of the lack of a well-characterized T cell memory pattern in the patients themselves. When considering the differential expression of T cell markers of activation, we noted a significantly enhanced expression of CD38, Ki-67/MK67 and CD44 [which is consistent with the observation by Braun et al. (7)], but not of HLA-DR (7,74), CD69 (75) and CD25 (76). Furthermore, we noted an over-expression of CSF1/GM-CSF correlated with COVID-19 severity, a finding that may lead to increased inflammation in severe patients, as suggested by Zhou et al. (77). Interestingly, GM-CSF+ CD4 T cells have been associated with inflammation in autoimmune diseases (78). We also observed a decreased expression of IL2RB/AKA CD122, which correlated with severity and may result in T cell deregulation (79). In contrast to previous studies (80), the exhaustion marker PD-1 was not over-expressed, but rather under-expressed, among TUBE early patients, which may reflect a defect in TCR signaling and/or in effector T cells activation. Altogether, the expression of T cell markers of memory and activation was contrasted, suggesting an overall dysregulation of T cell activity (81).
Humoral immune responses also play an important role in the clearance of SARS-CoV-2 and the establishment of an immunological memory (82). SARS-CoV-2 elicits a strong B cell response with previously described kinetics (83,84). Our data reveal a dramatic increase in the expression of immunoglobulin encoding genes of most classes (IGHM, IGHA, IGHG) in COVID-19 compared to HLTY and INFL subjects, with a peak among OXY1 and/or TUBE-early patients. There was also an increased expression of J-CHAIN, which contribute to polymer formation of secreted IgM and IgA, thereby enhancing antigen avidity and viral binding (85). These findings are consistent with the over-expression of genes involved in B cell differentiation and antibody class switching (CD40L, UNG, ICOS, BAFFR, CD19 and TACI) in OXY0 and OXY1 compared to INFL patients, with the exception of the Bruton's tyrosine kinase (BTK). Paradoxically, antibodies can play a deleterious role in SARS-CoV-2 pathogenesis. Opsonization of anti-spike antibodies allow SARS-CoV entering non-ACE2-expressing cells, which harbor Fc-g-RIIA, a receptor that happened to be over-expressed in TUBE early patients (86,87). Excessive antibody production together with presentation of host proteins resulting from prolonged tissue destruction may enhance auto-reactive responses contributing to disease severity (88,89). In addition, antibodies targeting SARS-CoV epitopes were shown to cross-react with cytokines, such as IL-11 (90,91); recent studies also revealed presence of antibodies targeting type-I IFN in COVID-19 patients (54). Single cells transcriptome of peripheral blood mononuclear cells in both COVID-19 and Influenza-infected patients revealed an enrichment of plasma cells in COVID-19 patients which has been correlated with the production of multiple protective neutralizing antibodies (92). Altogether, these data suggest an important activation of B cells with increased antibody production in COVID-19 compared to INFL patients. Further studies are needed to determine whether this strong response results from specific features of the virus, from the absence of previous exposure to its antigens, and to which extent they contribute to an adverse outcome.
The complement and coagulation systems are increasingly reported to play a relevant role in the pathogenesis of COVID-19 (93). Several studies have revealed that SARS-CoV-2 induced the activation of several complement pathways (94,95). Complement system is based on the activation and cleavage of proteins that cannot be addressed with transcriptomic data. Interestingly, we observed an increased expression of complement component-encoding genes in both COVID-19 and INFL compared to HLTY, with patterns not strikingly different in patients infected with both viruses. Some genes were slightly under-expressed among OXY0 compared to INFL (C1qrs, C2, C5); in contrast, the complement component C3 was over-expressed in TUBE early patients compared to INFL patients. Activated C3 can exacerbate SARS-CoV-associated acute respiratory distress syndrome (ARDS) (95). This suggests that, while the activation of the complement exists in both COVID-19 and in INFL, its persistence over a longer period of time in COVID-19 may contribute to inflammation, tissue damage, coagulation and neutrophil activation, leading to organ injury. This is consistent with the significant increase in the expression of neutrophil degranulation-associated genes observed among TUBE early compared to INFL patients. There was no definite profile in the expression of genes involved in the coagulation cascade among COVID-19 patients compared to INFL. In contrast, we observed a strong overexpression of blood group encoding genes in TUBE early/late patients compared to the other groups. Blood groups have been shown to change the immune response to infections (96). Some of these genes encode proteins involved in the pathogenesis of viral infections, such as GYPA, which codes for glycophorin A, a protein considered as the major receptor for different viruses on the red blood cell surface (97)(98)(99). Others, such as those of the Knops blood group system, play a role in the activation and/or modulation of the complement cascade (100,101). However, it remains unclear whether the overexpression of blood-group genes is related to the pathogenesis of COVID-19 or reflects a general activation of inflammation and/or hematopoiesis.
Increasing evidence reveals important interactions between metabolic and immune functions (102). For instance, aerobic glycolysis is induced during the activation of numerous immune cells, such as M1 macrophages (103), dendritic cells (104), T cells (105), B cell (106,107) and NK cells (108). In addition, oxidative phosphorylation plays an important role in the activation of M2 macrophages (109,110) and in the expression of transcription factors that are essential for T and B cells. This is consistent with the over-expression of genes involved in carbohydrate metabolic pathways in OXY1 compared to INFL patients. Differential expression of genes involved in the metabolism may also contribute to disease conditions (111).
Viruses influence host cells to create a favorable environment for their replication (112,113). SARS-CoV non-structural proteins 3b and 7a were shown to induce a G0/G1 phase arrest in infected cells (114,115). This phenomenon is probably limited to the site of infection, i.e. respiratory epithelial cells, which are infected in COVID-19. In contrast, the analysis of whole blood transcriptome reveals significant overexpression of genes involved in cell cycle, which probably reflects the activation of immune cells distant from the site of infection. This overexpression was present in all types of COVID-19 patients (OXY0, OXY1 and TUBE) compared to both HTLY and INFL. Further studies are needed to understand whether cell-cycle activation is a specific feature of SARS-CoV-2 infection or a component of the resulting inflammatory response.
To date, biomarkers of COVID-19 and/or COVID-19 severity mostly included clinical and inflammatory characteristics (116). To our knowledge, no study allowed for a large scale comparison of gene expression between COVID-19 and another viral infection. By using blood from patients with different severity levels of COVID-19 as well as INFL and HLTY, we identified and/or confirmed a number of very significant associations. In particular, genes encoding immunoglobulins and those related to cell cycle appear as a hallmark of COVID-19, as they were significantly independently upregulated in each COVID-19 severity group (OXY0, OXY1 and TUBE early) compared to both INFL and HLTY. Furthermore, a number of genes were significantly over (e.g. genes encoding PRR, ISG, or related to macrophages functions, degranulation of neutrophils, blood groups, complement, metabolism and/or oxidative phosphorylation) or under-expressed (e.g. genes related to MHC II, T-cell function or encoding immunoglobulins) among TUBE early patients compared to the other severity group (OXY0 and OXY1), thereby providing a potential basis to predict COVID-19 outcome. However, new studies including samples taken at different time-points of COVID-19 as well as patients with infections due to other pathogens will be needed to confirm these findings and/or establish reliable diagnostic tools to predict COVID-19 outcome.
Like other transcriptomic studies, this work has several limitations. The number of patients included in the different groups was limited, a factor that may have restricted the number of DEG reported. Samples were taken from whole blood and do not necessarily reflect gene expression patterns in clinically affected organs and/or individual cells. The sequencing depth may have restricted differential detection of less abundantly expressed genes. Finally, the samples were issued from a single cohort of patients, and thus validation from other cohorts would be useful. This study provides a comprehensive overview of the immune response among patients with different severity levels of COVID-19. These include a dramatic decrease in IFN responses, a reduced cytotoxicity activation in NK cells, an increased degranulation of neutrophils, a dysregulation of T cells, a dramatic increase in B cell function and immunoglobulin production, as well as an important over-expression of genes involved in metabolism and cell cycle. This study opens the way to further investigations aimed at elucidating the molecular mechanisms that underlay these observations. This study also suggests that it may be possible to identify a signature which could be useful to identify early patients at risk of adverse outcome.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by Cantonal Ethics Committee of the state of Vaud (Swissethics 2020-01108). The patients/participants provided their written informed consent to participate in this study.  P-YB is supported by the Swiss National Science Foundation (31CA30_196036, 33IC30_179636 and 314730_192616), the Leenaards Foundation, the Santos-Suarez Foundation as well as grants allocated by Carigest. NB is supported by the Leenaards Foundation. CR is supported by the Swiss National Science Foundation (31CA30_196036 and 31003A_176097).

AUTHOR CONTRIBUTIONS
Supplementary Figure 3 | Interpretation of boxplots ("ascending", "descending", "hill" and "valley" pattern). Expression patterns across the different groups were regarded as "ascending" or "descending" when median expression levels progressively increased, or decreased, respectively, from HLTY to OXY0, OXY1, TUBE-early, and INFL. Conversely, patterns were qualified to have a "hill" or "valley" pattern, when peak and lowest median expression levels were seen in COVID-19 (OXY1 or TUBE-early), and the lowest and higher in both HLTY and INFL, respectively. Because TUBE-late often represented a convalescent status, their gene expression levels were ignored in this pattern qualification.