Enhanced Expression of Autoantigens During SARS-CoV-2 Viral Infection

Immune homeostasis is disturbed during severe viral infections, which can lead to loss of tolerance to self-peptides and result in short- or long-term autoimmunity. Using publicly available transcriptomic datasets, we conducted an in-silico analysis to evaluate the expression levels of 52 autoantigens, known to be associated with 24 autoimmune diseases, during SARS-CoV-2 infection. Seven autoantigens (MPO, PRTN3, PADI4, IFIH1, TRIM21, PTPRN2, and TSHR) were upregulated in whole blood samples. MPO and TSHR were overexpressed in both lung autopsies and whole blood and were associated with more severe COVID-19. Neutrophil activation-derived autoantigens (MPO, PRTN3, and PADI4) were prominently increased in the blood in both SARS-CoV-1 and SARS-CoV-2 infections, while the TSHR and PTPRN2 autoantigens were specifically increased in SARS-CoV-2. Using a single-cell dataset from peripheral blood mononuclear cells (PBMCs), we observed an upregulation of the MPO, PRTN3, and PADI4 autoantigens within the low-density neutrophil subset. To validate our in-silico analysis, we measured plasma protein levels of two autoantigens, MPO and PRTN3, in severe and asymptomatic COVID-19. The protein levels of these two autoantigens were significantly higher in more severe COVID-19 infections. In conclusion, the immunopathology and severity of COVID-19 could result in transient autoimmune activation. Longitudinal follow-up studies of confirmed cases of COVID-19 could determine the enduring effects of viral infection, including development of autoimmune disease.

Multiple factors are involved in the development of autoimmunity, including genetics, age, and environment (11). Among the environmental triggers, viral infections, particularly those resulting in low interferon production, as is the case with SARS-CoV-2 infection, have long been associated with induction of autoimmunity (11). Similar to many severe viral infections, SARS-CoV-2 could trigger an autoimmune reaction through multiple mechanisms, including molecular mimicry, epitope spreading, bystander activation, and persistence of latent virus (12-15). These mechanisms could be understood through examination of homology between various antigens of SARS-CoV-2 and self-antigens (16). Development of cross-reactive epitopes is then dependent on the viral strain as well as host genetic susceptibility, including human leukocyte antigen (HLA) polymorphism.
Although the concept of autoimmunity has been explored in previous viral infections (28), its relevance to COVID-19 respiratory infection deserves more attention, especially since immune derangement during SARS-CoV-2 infection could potentially trigger relapse and induce many new cases. Therefore, the aim of the current study was to use publicly available transcriptomic COVID-19 data to evaluate autoimmune activation during COVID-19 infection by measuring the gene expression levels of 52 autoantigens known to be associated with 24 different autoimmune diseases.

MATERIALS AND METHODS
For the purpose of this study, we used a list of 52 autoantigens established by Burbelo et al. (29) for diagnosis of 24 different autoimmune diseases, including Hashimoto's thyroiditis, ANCA-associated vasculitis, rheumatoid arthritis (RA), and systemic lupus erythematosus (SLE) (Table 1). The expression of these autoantigens in the lungs and whole blood of COVID-19 patients was then determined in-silico using publicly available datasets. Autoantigens with an mRNA expression change of 1.5 LogFC or more were validated in leukocytes isolated from COVID-19 patients. These datasets were publicly available at the National Center for Biotechnology Information Gene Expression Omnibus (NCBI GEO, http://www.ncbi.nlm.nih.gov/geo) and the European Bioinformatics Institute (EMBL-EBI, https://www.ebi.ac.uk). Moreover, single-cell transcriptomic datasets of sorted neutrophils were used. In addition, the expression of autoantigens following COVID-19 infection was compared to that following infection with three other respiratory viruses: SARS-CoV-1, influenza A virus (IAV), and respiratory syncytial virus (RSV).
RNA-sequencing platforms were used for COVID-19 studies, while microarray platforms were used for the older datasets of SARS-CoV-1, IAV, and RSV (Table 2). For the COVID-19 lung autopsy dataset (PRJNA646224) (32), the authors extracted RNA from formalin-fixed paraffin-embedded (FFPE) tissues from 9 fatal COVID-19 cases and 10 SARS-CoV-2-uninfected individuals who underwent biopsy as part of routine clinical care for lung cancer. For this lung autopsy dataset, we used the processed sequencing data provided by Wu Meng et al. (32). The authors used DESeq2 to identify differentially expressed genes between cases and controls. Benjamini-Hochberg correction was used for multiple testing (37). For the COVID-19 whole blood transcriptomic dataset, we used processed sequencing data deposited under project number EGAS00001004503 (33). In this study, Aschenbrenner et al. extracted RNA from whole blood of 39 COVID-19 patients and 10 healthy controls and sequenced it on a NovaSeq 6000 (33). The authors used DESeq2 to identify differentially expressed genes between cases and controls. Independent hypothesis weighting was used for multiple testing correction (33). A transcriptomic dataset of leukocytes isolated from 10 controls, 51 moderate COVID-19 patients, and 49 severe COVID-19 patients (12 non-critical and 37 critical) was used to validate the results for autoantigens with mRNA expression changes of 1.5 log fold change or greater (GSE157103) (34). For the leukocyte study, we processed the raw data using the Bioconductor package limma-voom (38) and presented the results as log2 counts per million (LogCPM). An independent Student's t-test (39) was used to compare the independent groups. In addition, for the SLE whole blood transcriptomic study, we used processed data provided by Panousis et al. (35). In this study, the authors used DESeq2 to identify differentially expressed genes between 79 active SLE patients and 58 controls. Benjamini-Hochberg correction was used for multiple testing (35).
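As an illustration of the Benjamini-Hochberg adjustment named in this section, the following is a minimal sketch in Python. It is not the code used by DESeq2 or by the cited studies; the function name `benjamini_hochberg` and the example p-values are assumptions introduced here only to show how each p-value is scaled by its rank and then made monotone.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order.

    Illustrative helper (hypothetical name), not taken from any package.
    """
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, scaling by m / rank and
    # keeping a running minimum so adjusted values stay monotone
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Toy p-values, invented for illustration only
example_adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

A gene would then be called significant when its adjusted value falls below the chosen threshold, which in this study is 0.05.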
Transcriptomic datasets of whole blood isolated from RSV- and IAV-infected patients (GSE17156) (31) and PBMCs isolated from SARS-CoV-1-infected patients (GSE1739) (30) were analyzed. In both studies, blood was obtained during the peak of the patients' symptoms and processed by the authors for RNA extraction and hybridization following the Affymetrix protocol. After quality checks, we normalized and log-transformed the raw Affymetrix data. Microarray data (CEL files) were pre-processed in our study with the Robust Multi-Array Average (RMA) technique using R software (40). The probe set with the largest interquartile range (IQR) of expression values was selected to represent each gene. For the RNA-seq study, we processed the data using the Bioconductor package limma-voom (38) and presented the results as logCPM. Log-transformed normalized intensities were also used in Linear Models for MicroArray data (LIMMA) analyses to identify differentially expressed genes between diseased and control groups. We used the default Benjamini-Hochberg correction for multiple testing. Raw data from different studies were never mixed or combined. For each study, the LogFC was obtained separately by analyzing the data of diseased and control groups. Statistical analyses were performed using R software (v 3.0.2) and Prism (v8; GraphPad Software). For all analyses, p-values <0.05 were considered significant.
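The probe-selection rule described above (keep, for each gene, the probe set with the largest IQR) can be sketched as follows. This is an illustrative Python version, not the R code used in the study; the probe identifiers, gene labels, and expression values are hypothetical, and the `inclusive` quantile method is an assumption chosen here because it corresponds to the default quantile definition in R.

```python
from statistics import quantiles


def iqr(values):
    """Interquartile range using the inclusive (linear interpolation) method."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1


def pick_probe_per_gene(probe_to_gene, expression):
    """Return {gene: probe}, keeping the probe with the largest IQR per gene."""
    best = {}
    for probe, gene in probe_to_gene.items():
        spread = iqr(expression[probe])
        if gene not in best or spread > best[gene][1]:
            best[gene] = (probe, spread)
    return {gene: probe for gene, (probe, _) in best.items()}


# Hypothetical probes and log-scale intensities, for illustration only
probe_to_gene = {"probe_a": "geneX", "probe_b": "geneX", "probe_c": "geneY"}
expression = {
    "probe_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "probe_b": [2.0, 4.0, 6.0, 8.0, 10.0],
    "probe_c": [3.0, 3.0, 3.0, 3.0, 3.0],
}
selected = pick_probe_per_gene(probe_to_gene, expression)
```

Selecting by IQR favors the probe set that varies most across samples, on the reasoning that a nearly constant probe carries little information for differential expression.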
For the single-cell dataset, transcriptomic datasets of sorted neutrophils were used. Wilk et al. (36) performed single-cell sequencing on PBMCs from seven COVID-19 patients and six healthy controls. The details of sample isolation, sequencing, and data processing are available at NCBI GEO and in the study protocols (36). Briefly, PBMCs (GSE150728) were isolated from blood via standard Ficoll-Paque density gradient centrifugation. The authors performed the single-cell RNA-seq library preparation using a Nextera XT DNA library preparation kit (Illumina FC-131-1096) with 1 ng of pooled library and dual-index primers. Sequencing was performed on a NovaSeq S2 instrument (Illumina; Chan Zuckerberg Biohub) (36). Differentially expressed genes (DEGs) were calculated by comparing gene expression of individual COVID-19 samples with gene expression of all healthy controls using Seurat's implementation of the Wilcoxon rank-sum test. Only DEGs with a two-sided p-value <0.05, adjusted for multiple comparisons by Bonferroni's correction, were selected. The investigators grouped neutrophils into two clusters, low-density neutrophils and canonical neutrophils. The novel cell population of low-density neutrophils was significantly increased only in patients with acute respiratory distress syndrome (ARDS).
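The Bonferroni filter applied to the single-cell DEGs can be illustrated with a short Python sketch. It is a simplified stand-in rather than Seurat's implementation: the number of tests is taken here to be the number of p-values supplied, which is an assumption, since the text does not state which gene count was used as the multiplier. Gene labels and p-values are hypothetical.

```python
def bonferroni(pvalues):
    """Multiply each p-value by the number of tests, capping at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def significant_genes(gene_pvalues, alpha=0.05):
    """Return genes whose Bonferroni-adjusted p-value is below alpha."""
    genes = list(gene_pvalues)
    adjusted = bonferroni([gene_pvalues[g] for g in genes])
    return [g for g, p in zip(genes, adjusted) if p < alpha]


# Hypothetical raw p-values, for illustration only
toy_pvalues = {"gene1": 0.001, "gene2": 0.02, "gene3": 0.5}
kept = significant_genes(toy_pvalues)
```

In this toy case, gene2 has a raw p-value below 0.05 but is dropped after adjustment, which shows how much more conservative Bonferroni is than an unadjusted threshold.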

Gene Ontology
Gene ontology enrichment analysis was performed using the DisGeNET (41) and Gene Ontology biological process databases. Metascape.org (42) was used to identify the enrichment in DisGeNET (41). Terms with a p-value <0.01, a minimum count of 3, and an enrichment factor >1.5 were collected and grouped into clusters based on their membership similarities. The top few enriched clusters (one term per cluster) were presented. The Gene Ontology biological process database was accessed through Enrichr, an open-source gene set enrichment analysis web server (43,44). GO biological processes were ranked according to the combined score. This score was computed in Enrichr by taking the log of the p-value from the Fisher exact test and multiplying it by the z-score of the deviation from the expected rank (43,44).
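The combined score described above can be written as c = log(p) × z. The sketch below illustrates that arithmetic in Python under stated assumptions: the Fisher exact p-value is computed as a one-sided hypergeometric upper tail, the logarithm is taken as the natural log, and the z-score is passed in as a given number because its computation (deviation from an expected rank) is internal to Enrichr and not reproduced here. Function names and the toy counts are hypothetical.

```python
from math import comb, log


def fisher_enrichment_p(overlap, query_size, set_size, universe_size):
    """One-sided Fisher exact p-value: P(overlap or more genes in common)."""
    total = comb(universe_size, query_size)
    upper = min(query_size, set_size)
    tail = sum(
        comb(set_size, k) * comb(universe_size - set_size, query_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


def combined_score(p_value, z_score):
    """Combined score as log(p) times z (natural log assumed)."""
    return log(p_value) * z_score


# Toy counts: 3 of 4 query genes fall in a 5-gene term, universe of 20 genes
p_toy = fisher_enrichment_p(overlap=3, query_size=4, set_size=5, universe_size=20)
# The z-score value here is invented purely to show the arithmetic
score_toy = combined_score(p_toy, z_score=-2.0)
```

Because log(p) is negative for p < 1, a negative z-score yields a positive combined score, so terms with both a small p-value and a strongly negative rank deviation rank highest.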

ELISA
The plasma levels of MPO and PRTN3 (PR3) for seven non-infected controls, eight severe COVID-19 patients, and eight asymptomatic COVID-19 patients were determined using commercially available human ELISA kits (MPO, Cat # ab119605 and PR3, Cat # ab226902, Abcam, Cambridge, MA, USA). The plasma used in our study was obtained from COVID-19 patients recruited from Rashid Hospital. Plasma was isolated from blood via standard Ficoll-Paque density gradient centrifugation (Sigma, Histopaque-10771). Assays were performed strictly following the manufacturer's instructions. Each sample was assayed in duplicate, and values were expressed as the mean of the two measurements per sample. One-way analysis of variance (ANOVA) and post hoc Tukey multiple comparison analyses were applied.
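To make the ELISA analysis steps concrete, the following Python sketch averages duplicate wells per sample and computes the one-way ANOVA F statistic across groups. It is an illustration only: the study used Prism, the numbers below are invented, and the post hoc Tukey comparison and the conversion of F to a p-value are deliberately omitted.

```python
from statistics import mean


def duplicate_means(duplicates):
    """Collapse each (well1, well2) duplicate pair to its mean."""
    return [mean(pair) for pair in duplicates]


def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of groups."""
    all_values = [v for g in groups for v in g]
    grand = mean(all_values)
    # Variation of group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # Variation of observations around their own group mean
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Invented values, not study data
toy_groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
toy_f = one_way_anova_f(toy_groups)
```

In practice the three groups would be the per-sample means for controls, asymptomatic patients, and severe patients, and a significant F would be followed by pairwise Tukey comparisons.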

MPO and TSHR Autoantigens Are Associated With More Severe COVID-19 in Lung Autopsy and Whole Blood
Using publicly available transcriptomic datasets, we determined the expression levels of 52 autoantigens known to be associated with 24 different autoimmune diseases. The list of these genes and their associated autoimmune diseases is presented in Table 1. The datasets used in this study are presented in Table 2. Expression levels of autoantigens were determined in lung autopsies and whole blood (Figure 1). For lung, RNA-sequencing data were obtained from 9 deceased COVID-19 patients and 10 negative controls (PRJNA646224) (Figure 1A).
We next used gene ontology databases to determine general diseases associated with these seven genes (Figure 3A). The DisGeNET database was queried using the publicly available metascape.org tool, showing that besides autoimmune disease, these genes were associated with other inflammatory and fibrotic conditions affecting different organs, including lung (bronchiectasis; MPO, PRTN3, and TRIM21) and kidney (glomerulonephritis; MPO, PRTN3, TRIM21, and PADI4). In addition, the GO biological process database revealed the association of these seven genes with interferon alpha signaling, neutrophil activation, and regulation of cytokine production (Figure 3B).

Upregulation of Autoantigens in Low-Density Neutrophils During COVID-19 Infection
After establishing an overall upregulation of autoantigens in lung autopsies and whole blood of COVID-19 patients, we next determined whether the observed increase in autoantigens is reflected in the main inflammatory cells regulating COVID-19 severity. We extracted the neutrophil and lymphocyte counts from the PRJNA646224 study and compared the neutrophil-to-lymphocyte ratio between 16 mild and 15 severe COVID-19 patients. As presented in Figure 4A, severe COVID-19 patients had a significantly higher neutrophil-to-lymphocyte ratio. Next, a single-cell dataset of immune cells isolated from peripheral blood mononuclear cells (PBMCs) (GSE150728) of severe COVID-19 patients was used (36). Among the different immune cells, three autoantigens, MPO, PRTN3, and PADI4, were significantly enriched within the low-density neutrophil subset (Figure 4B). Moreover, the counts of low-density and canonical neutrophils were increased during severe COVID-19 infection (Figure 4C). To validate our in-silico analysis, we next measured plasma protein levels of two autoantigens, MPO and PRTN3, in severe and asymptomatic COVID-19. The protein levels were estimated using human ELISA kits. The results revealed an increase of MPO (mean 36,787 ± 1,961 vs mean 29,007 ± 1,860 pg/ml; p-value = 0.038) and PRTN3 (mean 151.5 ± 38 vs mean 14.77 ± 1.4 ng/ml; p-value = 0.001) in severe compared to asymptomatic COVID-19 infection (Figure 5). Levels of these proteins did not differ between asymptomatic patients and non-infected controls (Figure 5).
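The size of the protein-level differences can be read directly from the group means reported above. The short Python sketch below works through that arithmetic as a fold change (severe mean divided by asymptomatic mean) and its log2. It uses only the means quoted in the text; the helper name `fold_change` is introduced here for illustration, and these ratios of plasma protein means should not be confused with the transcriptomic LogFC values used elsewhere in the study.

```python
from math import log2


def fold_change(mean_case, mean_reference):
    """Ratio of the case-group mean to the reference-group mean."""
    return mean_case / mean_reference


# Group means as reported in the text (severe vs asymptomatic)
mpo_fc = fold_change(36787, 29007)    # MPO, pg/ml
prtn3_fc = fold_change(151.5, 14.77)  # PRTN3, ng/ml

# Same ratios on the log2 scale
mpo_log2fc = log2(mpo_fc)
prtn3_log2fc = log2(prtn3_fc)
```

On these reported means, PRTN3 shows a much larger relative increase in severe disease than MPO, roughly tenfold versus about 1.3-fold.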

Prominent Autoantigens Upregulation in Coronavirus Infections Relative to Other Viral Infections
We next compared the profile of autoantigen upregulation observed during SARS-CoV-2 infection to that detected during other respiratory viral infections. To do that, we used transcriptomic microarray and RNA-sequencing data from the blood of patients infected with SARS-CoV-1, influenza A virus (IAV), or respiratory syncytial virus (RSV) at the peak of disease. For each condition, differential expression and LogFC were obtained by comparing the normalized gene expression of the infected group versus healthy donors (Figure 6A). None of the autoantigens were upregulated by more than one LogFC in IAV and RSV, while five autoantigens in SARS-CoV-1 and seven autoantigens in SARS-CoV-2 were upregulated by more than one LogFC (Figure 6A). MPO, PRTN3 (PR3), and PADI4 were the top shared autoantigens appearing in both coronavirus respiratory infections, with an increase in expression of more than 1.5 LogFC. We then intersected the differentially expressed genes in all four respiratory infections to obtain the shared signatures (Figure 6B). Interestingly, the TSHR and PTPRN2 autoantigens were specifically increased in SARS-CoV-2 (Figure 6B). TSHR was also overexpressed in COVID-19 lung autopsies. Two genes (IFIH1 and TRIM21) were upregulated in three viral infections; however, their expression was higher in SARS-CoV-2 compared to IAV and RSV (Figure 6B).
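The thresholding and intersection steps described in this comparison can be sketched with Python sets. This is a minimal illustration under stated assumptions, not the analysis code used for Figure 6: the condition names, gene labels, and LogFC values are placeholders invented for the example and are not results from the study.

```python
def upregulated(logfc_by_gene, threshold=1.0):
    """Return the set of genes with LogFC strictly above the threshold."""
    return {gene for gene, lfc in logfc_by_gene.items() if lfc > threshold}


def shared_and_specific(sets_by_condition, target):
    """Return (genes shared by all conditions, genes unique to target)."""
    shared = set.intersection(*sets_by_condition.values())
    others = set().union(
        *(s for name, s in sets_by_condition.items() if name != target)
    )
    specific = sets_by_condition[target] - others
    return shared, specific


# Placeholder LogFC tables for two hypothetical infections
toy_logfc = {
    "virus_a": {"geneA": 2.0, "geneB": 1.2, "geneC": 0.3},
    "virus_b": {"geneA": 1.8, "geneB": 0.4, "geneC": 0.2},
}
up_sets = {name: upregulated(table) for name, table in toy_logfc.items()}
shared, specific_to_a = shared_and_specific(up_sets, target="virus_a")
```

Each LogFC table is computed within its own study before the sets are compared, which matches the statement that raw data from different studies were never mixed or combined.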
of three autoantigens, MPO, PRTN3, and PADI4, were higher in the blood of severe compared to mild COVID-19. Autoimmune disease can be triggered by genetic and environmental factors; viral infections have long been known as a major environmental cause of transient autoimmunity that could potentially lead to relapse or induction of de novo autoimmune disorders (11). These autoimmune disorders emerge weeks after viral infection; hence, sensitive serological tests are needed to determine the cause-effect relationship between SARS-CoV-2 infection and autoimmune disease diagnosis (45-47).
Interestingly, MPO and TSHR were increased in both lung autopsies and whole blood of severe COVID-19 patients. Comparison of COVID-19 blood transcriptomic data with those of IAV and RSV revealed that MPO, PRTN3, and PADI4 were selectively upregulated in the coronavirus infections, SARS-CoV-1 and SARS-CoV-2, while the TSHR and PTPRN2 autoantigens were distinctive to SARS-CoV-2 infection. These two genes (TSHR and PTPRN2) did not increase in mild COVID-19, and hence they were associated with more severe infection.
Following the results obtained through gene ontology enrichment analyses, single-cell transcriptomics of PBMCs revealed a significant increase in MPO, PRTN3, and PADI4 mRNA levels within low-density neutrophils. In addition, analysis of the cell counts provided by the Wilk et al. study showed a significant increase in both low-density and canonical neutrophils during severe COVID-19 infection compared to controls (Figure 4C) (36).
The top seven autoantigens upregulated in the blood of severe COVID-19 patients were associated with a wide range of vascular and inflammatory autoimmune disorders (Figure 3A). Vasculitis, a condition shared between fatal COVID-19 and vascular autoimmune diseases such as anti-neutrophil cytoplasmic autoantibody (ANCA) vasculitis, is characterized by elevation of MPO and PRTN3 levels (25,48). The Wilk et al. single-cell data identified a distinct group of low-density neutrophils; these immune cells were only detected in severe COVID-19 complicated by acute respiratory distress syndrome (ARDS) (36). These cells had significantly high levels of MPO, PRTN3, and PADI4, indicating that they could be a major source of the observed increase in blood levels of these autoantigens during severe COVID-19. Following SARS-CoV-2 infection, the formation of neutrophil extracellular traps (NETs), known as NETosis, may therefore lead to a burst of autoantigens including PADI4, MPO, and PRTN3 in the context of immunostimulatory molecules.
Confirming previous findings (49,50), increased MPO levels reflected COVID-19 severity. Neutrophil activation leading to NET formation, or NETosis, is observed in both viral infections and autoimmune diseases (47,51,52). NETosis could then be considered a common pathological modulator of viral infection and autoimmune disease (47,52,53). The NETosis autoantigens MPO, PRTN3, and PADI4 were markedly increased in SARS-CoV-1 and SARS-CoV-2, but they did not appear in IAV and RSV. Supporting our findings, a recent   (55). MPO is a peroxidase enzyme that catalyzes the intracellular reaction between hydrogen peroxide and chloride to form hypochlorous acid (56,57). NETosis leads to an extracellular burst of chromatin, histones, and neutrophil granules containing MPO, PRTN3, and PADI4. An exaggerated increase in the level of these antigens during COVID-19 infection could lead to a break in immune tolerance and the recognition of autoantigens by immune sentinel cells. NETosis induces inflammation and DNA deployment that could trigger autoimmunity. While NETs are produced by neutrophils, their clearance is achieved by macrophage efferocytosis. During ARDS, neutrophil count and lifespan are significantly increased, and the ability of macrophages to engulf NETs and apoptotic cells is significantly decreased, prolonging the lung injury induced by neutrophil blast. Pharmacologic treatment could be used to enhance NET clearance. Macrophage efferocytosis of NETs could be restored by an AMP-activated protein kinase (AMPK) activator such as metformin or by application of a neutralizing antibody against HMGB1 (58). Therefore, such medications could be considered to reduce blood levels of these autoantigens and hence lower the chance of triggering autoimmunity.
TSHR is expressed by thyroid epithelial cells and by various extra-thyroidal tissues, including adipose tissue, peripheral blood cells, and fibrocytes (59). Derived from monocytes, human fibrocytes express both thyroglobulin and thyrotropin receptor (60). They are increased during lung injury and have both the inflammatory characteristics of macrophages and the tissue remodeling features of fibroblasts (61). Chronic inflammatory conditions such as autoimmunity, cardiovascular disease, and asthma promote differentiation of immune cells into circulating fibrocytes and their accumulation at the site of injury (61,62).
TSHR is targeted by autoantibodies during Graves' disease (63). SARS-CoV-2 infection has been connected with the initiation and relapse phases of Graves' disease (45,46). Of note, this disease has emerged in some patients during the COVID-19 recovery period. The patients diagnosed were negative on the nasopharyngeal swab PCR test but were positive for both IgM and IgG SARS-CoV-2 antibodies. This suggests that the observed increase of autoantigens during SARS-CoV-2 infection may trigger autoimmunity. This could lead to initiation or relapse of autoimmune disorders as a long-term COVID-19 outcome. In fact, evidence of transient autoimmunity has been reported among long COVID-19 outcomes by several studies (64-66). Follow-up data from survivors of viral infections have shown the appearance of autoimmune disorders within weeks to months after recovery (67). In addition, TSHR elevation was only detected in severe COVID-19, which could suggest a higher chance of autoimmune disorders appearing after severe COVID-19 infection.
The majority of COVID-19 patients are expected to show one or more residual symptoms months after recovering from the infection (66, 68-70). In a post-acute COVID-19 follow-up study of 179 confirmed cases in Italy, fatigue and dyspnea persisted in around half of recovered individuals, while joint pain and chest pain lingered in 20-30% of recovered patients (68). Similarly, a 6-month follow-up study of 1,733 hospitalized COVID-19 patients from China reported persistent fatigue and muscle weakness in more than half of patients, while patients who were more severely ill during their hospital stay had more persistent long-term symptoms (71).
Emerging case reports have shown that SARS-CoV-2 induces long-term immune-inflammatory abnormalities (66,72). Schenker et al. reported a 65-year-old female patient with de novo reactive arthritis and cutaneous vasculitis 10 days after recovery from all COVID-19-related symptoms (72). In a more severe case, a 31-year-old female presented with fatal multisystem inflammatory syndrome (MIS) 2 weeks after recovery from COVID-19 (65). This increase in the incidence of autoimmunity was also reported among children, where the SARS-CoV-2 epidemic was associated with a 30-fold increase in Kawasaki-like disease (71).
Persisting long-COVID-19 symptoms involve immune-mediated inflammatory disease and neurological abnormalities, which could suggest the possibility of triggering pre-existing or de novo autoimmune reactions weeks or months after COVID-19 recovery (37,73). Previous studies have shown that autoantigen gene upregulation is often followed by an increase in the respective autoantibody level (29,48,74). Although the increase in autoantigen expression observed in this study could trigger only short-lasting autoimmunity, longitudinal follow-up studies are needed to establish the long-term effects of SARS-CoV-2 infection on the development of autoimmune diseases.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/supplementary material.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by Dubai Health Authority. The patients/participants provided their written informed consent to participate in this study.