Molecular mimicry of SARS-COV-2 antigens as a possible natural anti-cancer preventive immunization

Background In the present study we investigated whether peptides derived from the entire SARS-CoV-2 proteome share homology with tumor-associated antigens (TAAs) and whether cross-reactive CD8+ T cells can be elicited by the BNT162b2 preventive vaccine or by natural SARS-CoV-2 infection. Methods and results Viral epitopes with high affinity (<100 nM) to the HLA-A*02:01 allele were predicted. Shared and variant-specific epitopes were identified. Significant homologies in amino acid sequence were found between SARS-CoV-2 peptides and multiple TAAs, mainly associated with breast, liver, melanoma and colon cancers. The molecular mimicry between the viral epitopes and the TAAs was found in all viral proteins, mostly in the Orf 1ab and the Spike, the latter being included in the BNT162b2 vaccine. Predicted structural similarities confirmed the sequence homology, and comparable patterns of contact with both the HLA and the TCR α and β chains were observed. CD8+ T cell clones cross-reactive with the paired peptides were found by MHC class I-dextramer staining. Conclusions Our results show for the first time that several SARS-CoV-2 antigens are highly homologous to TAAs and that cross-reactive T cells are identified in infected and BNT162b2-vaccinated individuals. The implication would be that the SARS-CoV-2 pandemic could represent a natural preventive immunization for breast, liver, melanoma and colon cancers. In the coming years, real-world evidence will provide the final proof for such immunological experimental evidence. Moreover, such SARS-CoV-2 epitopes can be used to develop “multi-cancer” off-the-shelf preventive/therapeutic vaccine formulations, with higher antigenicity and immunogenicity than over-expressed tumor self-antigens, for the potential benefit of thousands of cancer patients around the world.


Introduction
The Coronavirus disease 2019 (COVID-19) pandemic has resulted in a dramatic global public health crisis (1) and the entire scientific community has focused its attention on strategies to combat such an emerging infection. In this quest, successful preventive vaccines have been developed and administered to millions of individuals worldwide, inducing an effective protective immunity (2)(3)(4).
The Pfizer-BioNTech BNT162b2 vaccine is based on the entire Spike glycoprotein of the virus and elicits neutralizing antibodies that block the interaction between the virus and the ACE2 cellular receptor. The SARS-CoV-2 Spike protein is made of an N-terminal S1 subunit, responsible for virus-receptor binding, and a C-terminal S2 subunit, responsible for virus-cell membrane fusion (5). The S1 subunit includes an N-terminal domain (NTD) and a receptor-binding domain (RBD). The latter directly binds to the peptidase domain (PD) of angiotensin-converting enzyme 2 (ACE2) (6,7).
The BNT162b2 vaccine has been administered to more than 700 million individuals and its efficacy against COVID-19 ranges between 86 and 100% across countries and populations, with a protection from severe disease >96% (8)(9)(10). The response to the anti-SARS-CoV-2 preventive vaccine shows high interpersonal variability in the short and medium term (11), which is independent of the individual HLA allelic variants (12).
In addition to the neutralizing antibody (NAb) response, T cell reactivity against SARS-CoV-2 epitopes has been reported in vaccinated subjects (13). Similar patterns of B and T cell immune responses have been reported in SARS-CoV-2 infected individuals and specific target epitopes have been identified (14)(15)(16)(17)(18)(19).
We have recently identified 59 immunogenic epitopes linked to the most prevalent HLA alleles in the sequence of the Spike protein included in the BNT162b2 mRNA vaccine. An established long-term CD8+ T cell memory response specific to 23/59 epitopes was found, with a strong immunodominance for the NYNYLYRLF (HLA-A*24:02) and YLQPRTFLL (HLA-A*02:01) epitopes (20).
In the present study we wanted to verify whether molecular mimicry between epitopes derived from SARS-CoV-2 and publicly available TAAs can be identified, along with cross-reacting CD8+ T cells. This would suggest that the SARS-CoV-2 pandemic has represented a natural "anti-cancer vaccination", eliciting a spectrum of anti-viral T cell clones able to cross-react with tumor antigens. Such memory CD8+ T cells may be promptly recalled during the lifetime by cancer cells expressing TAAs similar or identical to the SARS-CoV-2 antigens and protect from cancer progression. This is possible because the degeneracy of the TCR in antigen recognition allows each single receptor to cross-react against similar antigens, recognizing at least 10^6 different MHC-bound peptides (21,22). Sequence and structure homologies between microorganism-derived antigens (MoAs) from viruses and bacteria and tumor-associated antigens (TAAs) have been previously described (23). In particular, the homology should be localized in the amino acid residues at the central positions of the peptides bound to the MHC-I groove, which directly interact with the T-cell receptor (TCR). Therefore, MoAs may elicit a T-cell response cross-reactive with TAAs, and several lines of experimental evidence have been reported so far (24)(25)(26)(27)(28)(29)(30)(31)(32)(33). In particular, we have shown that several HLA-A*02:01-restricted cancer epitopes share sequence homology with peptides derived from human chronic viruses (i.e. Herpesviruses, papilloma, hepatitis B). Similarly, novel HLA-A*02:01-associated TAAs specific for hepatocellular carcinoma (HCC) (i.e. ISG15) with a high sequence homology to viral-derived antigens (i.e. Calicivirus) have been predicted. Bioinformatics modelling showed that the epitope pairs shared a very similar 3D conformation, with an almost identical pattern of contact with HLA and TCR. Finally, ex vivo immunization experiments demonstrated that PBMCs cross-react to the paired peptides (29)(30)(31). Moreover, we have recently described peptides derived from HIV-1 sharing high sequence and conformational homology to TAAs derived from non-AIDS-defining cancers (e.g. colon and breast cancers). Cross-reactive T cells were identified only in HIV-positive patients, suggesting that a natural infection may turn out to be a natural "preventive anti-cancer vaccine" (33).
More recently, we have mined homology between TAAs and antigens derived from the Firmicutes and Bacteroidetes phyla, which account for 90% of the gut microbiota (32). Sequence and structural homology emerged between several TAAs and HLA-A*02:01-restricted peptides derived from microbiota, which was particularly striking for the MAGE-A1 KVLEYVIKV peptide. Importantly, we have been the first to report individual reactivity as well as cross-reactivity against paired MoAs and TAAs in both healthy subjects (HS) and cancer patients (CP), which is the founding evidence of the potential protective role of cross-reacting T cells on the onset and progression of cancer (32)(33)(34)(35). Based on such findings, here we searched for molecular mimicry between SARS-CoV-2 antigens and TAAs expressed in several tumors. Significant sequence and conformational homologies were found between SARS-CoV-2 peptides and TAAs expressed in different cancers. The viral epitopes were mostly derived from the Orf 1ab and the Spike proteins, the latter being included in the BNT162b2 vaccine. Furthermore, significant percentages of CD8+ T cell clones cross-reactive with the paired peptides were found in infected and BNT162b2-vaccinated individuals.

BLAST homology search
The TAA and SARS-CoV-2 derived peptides selected as SB according to the NetMHCpan v4.1 prediction tool were submitted to the Basic Local Alignment Search Tool, BLAST (https://blast.ncbi.nlm.nih.gov/Blast.cgi), in order to identify sequence similarity between the predicted TAA and viral epitopes. Sequences with a homology of at least 4/9 identical residues along the sequence were considered significant.
The PDB format of the complex between HLA-A*02:01 (1AO7), a viral peptide (TAX), and the human T-cell receptor was downloaded from the RCSB Protein Data Bank (PDB) website (https://www.rcsb.org/structure/1AO7); the PyMol software was used to modify the TAX peptide sequence into the peptides analyzed in the present study. The Molsoft Mol Browser was used to generate the epitope modeling and molecular docking.
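The 4/9 identity criterion can be illustrated with a minimal Python sketch. It is not the BLAST procedure itself: it assumes the two nonamers are already aligned position by position without gaps, and the function names are hypothetical. The peptide pairs are the Spike/TAA pairs quoted in the Results.

```python
def identical_positions(a: str, b: str) -> int:
    """Count positions carrying the same residue in two equal-length peptides."""
    if len(a) != len(b):
        raise ValueError("peptides must have the same length")
    return sum(x == y for x, y in zip(a, b))


def is_significant(a: str, b: str, min_identical: int = 4) -> bool:
    """Apply the 'at least 4/9 identical residues' threshold from the text."""
    return identical_positions(a, b) >= min_identical


# Spike/TAA pairs quoted in the Results (viral peptide, TAA peptide).
pairs = {
    "SP2.2/GnTV": ("VLNDIFSRL", "VLPDVFIRV"),
    "SP2.3/CLPP": ("RLDKVEAEV", "ILDKVLVHL"),
    "SP2.4/Tyrosinase": ("HLMSFPQSA", "LLWSFQTSA"),
}

for name, (viral, taa) in pairs.items():
    n = identical_positions(viral, taa)
    print(f"{name}: {n}/9 identical, significant={is_significant(viral, taa)}")
```

An ungapped, position-wise count is the simplest reading of the threshold; a local alignment such as the one BLAST performs may also report matches at shifted positions.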

Sample collection and PBMCs isolation
Peripheral blood was obtained by venipuncture from a cohort of 56 healthcare workers enrolled at the National Cancer Institute "Pascale" in Naples, Italy, upon signing an informed consent (Supplementary Table 1). All of them underwent the prescribed schedule of the Pfizer-BioNTech BNT162b2 vaccination (prime at day 0; boost at day 21). Fresh human PBMCs were harvested 2 weeks after the boosting dose, isolated by Ficoll-Hypaque density gradient centrifugation and cultured in RPMI 1640 medium (Life Technologies, Carlsbad, CA) supplemented with 2 mM L-glutamine (Sigma), 10% fetal bovine serum (Life Technologies) and 2% penicillin/streptomycin (5000 I.U./5 mg/ml, MP Biomedicals). Healthy donor samples were genotyped for HLA-A loci (Lab of Histocompatibility, Section of Cryopreservation and BaSCO, AORN Santobono-Pausilipon, Naples, Italy). Samples from HLA-A*02:01-positive individuals (n=20) were selected for downstream analyses.

Peptide binding affinity
Peptide binding affinity to the HLA-A*02:01 molecule and BFA decay assays were performed for each candidate peptide. The human TAP-deficient T2 cell line (174xCEM.T2; ATCC CRL-1992™) was purchased from the American Type Culture Collection (ATCC; https://www.atcc.org/) and cultured in Iscove's modified Dulbecco's medium (IMDM; Gibco Life Technologies) containing 25 mM HEPES and 2 mM L-glutamine, supplemented with 20% fetal bovine serum (FBS; Capricorn Scientific GmbH), 100 IU/ml penicillin and 100 µg/ml streptomycin (Gibco Life Technologies). Cells were maintained at 37°C in a humidified incubator with 5% CO2. Briefly, T2 cells were seeded at 3.5 × 10^5 cells per well in 24-well plates and incubated for 16 hours at 27°C with peptides (final concentrations: 5, 10, 20, 50 and 100 µM) in IMDM serum-free medium. The next day, cells were incubated for an additional 2 hours at 37°C. Following incubation, cells were harvested and centrifuged at 200 x g for 5 min. Subsequently, cells were washed twice with phosphate buffered saline (1X PBS; Gibco Life Technologies), stained with R-PE conjugated anti-human HLA-A2 monoclonal antibody (cat. 343306; BioLegend) for 30 min at 4°C, and analyzed with the Attune™ NxT flow cytometer (Thermo Fisher Scientific). The fold increase (FI) was calculated using the following formula: FI = mean fluorescence intensity (MFI) sample / MFI background, where MFI background represents the value without peptide. All the experiments were performed in triplicate.
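As an illustration of the fold-increase formula, the following Python sketch computes FI from triplicate readings. Averaging the triplicates before taking the ratio is an assumption (the text only states that experiments were performed in triplicate), and the MFI values are hypothetical.

```python
from statistics import mean


def fold_increase(mfi_sample, mfi_background):
    """FI = MFI sample / MFI background (background = cells without peptide).

    Assumption: replicate readings are averaged before taking the ratio.
    """
    background = mean(mfi_background)
    if background <= 0:
        raise ValueError("background MFI must be positive")
    return mean(mfi_sample) / background


# Hypothetical triplicate MFI readings for one peptide concentration.
sample = [1500, 1600, 1700]
background = [780, 800, 820]
print(f"FI = {fold_increase(sample, background):.2f}")
```

An FI above 1 indicates that peptide loading increased the HLA-A2 signal on the T2 cell surface relative to cells without peptide.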

IFN-γ ELISpot assay
PBMCs from HLA-A*02:01-positive individuals were harvested 2 weeks after the boosting dose for the IFN-γ ELISpot assay. The Spike predicted peptides with the best affinity for MHC class I HLA-A*02:01, KIADYNYKL (RBD) and YLQPRTFLL (NTD), were added at a final concentration of 10 µg/mL to 3×10^5 PBMCs per well in 100 µL RPMI 1640 medium (Capricorn Scientific GmbH). PBMCs were cultured at 37°C in a humidified incubator with 5% CO2 for 20 hours. Stimulation with 10 µg/mL phytohemagglutinin (PHA-K; Capricorn Scientific GmbH) was used as the positive control; stimulation with 10 µg/mL LTDEMIAQY peptide was used as the negative control; RPMI 1640 medium (Capricorn Scientific GmbH) was used as the background control. The plates were read with an AID EliSpot Reader System (AID GmbH, Strassberg, Germany). Determinations from triplicate tests were averaged. Data were analyzed by subtracting the mean number of spots in the wells with cells and medium only from the mean number of spots in wells with cells and antigen. Spot forming units (SFU) were calculated as the frequency per 10^6 PBMCs.
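The background subtraction and the normalization to 10^6 PBMCs can be sketched in Python as follows. The spot counts are hypothetical, and the function name is an illustrative choice; the default of 3×10^5 cells per well is the value given in the text.

```python
from statistics import mean


def sfu_per_million(antigen_spots, medium_spots, cells_per_well=3e5):
    """Background-subtracted spot count scaled to 10^6 PBMCs.

    antigen_spots: replicate counts from wells with cells and antigen.
    medium_spots:  replicate counts from wells with cells and medium only.
    """
    net_spots = mean(antigen_spots) - mean(medium_spots)
    return net_spots * 1_000_000 / cells_per_well


# Hypothetical triplicate spot counts for one donor and one peptide.
print(sfu_per_million([30, 33, 27], [3, 2, 4]))
```

With 3×10^5 cells per well, each net spot corresponds to roughly 3.3 SFU per 10^6 PBMCs.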
Gating for CD3+/CD8+ T cells was performed on live cells, and binding to pMHCs was assessed by measuring specific fluorescence associated with each individual pMHC.Samples were acquired on a flow cytometer (Attune NxT, Thermo Fisher).

MHC I Dextramer preparation and T cell staining
MHC I Dextramer complexes (Immudex) were generated by combining purified disulfide-stabilized HLA-A*02:01 monomer (3 µM) with 100 mM peptide at 18°C for 48 h; subsequently, MHC I-peptide monomers (1 µM) were loaded onto U-Load Dextramer and incubated for 30 min in the dark. In particular, all the MHC I-SARS-CoV-2 derived peptides were loaded onto PE-conjugated U-Load Dextramer, whereas the MHC I-TAA peptides were loaded onto FITC-conjugated U-Load Dextramer. To stain for T-cell reactivity, 3 x 10^6 PBMCs from vaccinated donors were stimulated for 3 days with viral peptides. After stimulation, PBMCs were harvested and incubated with a pool of MHC I Dextramer complexes for 10 min at 37°C in the dark; D-biotin (Sigma-Aldrich, St. Louis, MO, USA) was added at a final concentration of 25 mM to block any free binding sites. Cells were then stained with anti-CD3 Super Bright 436 (1:100, Invitrogen, 62-0037-42) and anti-CD8 PE-Cy7 (1:100, Biolegend, 300914) antibodies and LIVE/DEAD Fixable Aqua (1:1000, Invitrogen) for 30 min on ice and washed twice in FACS buffer (PBS + 2% FCS). Gating for CD3+/CD8+ T cells was performed on live cells, and binding to MHC I Dextramer was assessed by measuring the specific fluorescence associated with each individual MHC I Dextramer. Samples were acquired on a flow cytometer (Attune NxT, Thermo Fisher).

Data processing and statistical analysis
T cell recognition data, determined by DNA-barcoded pMHC multimer analysis and Barracoda software, were plotted using RStudio version 4.1.0. For statistical analysis, data were assumed to have a non-Gaussian distribution and non-parametric tests were therefore used. The Wilcoxon signed rank test was used for single paired comparisons and the Mann-Whitney test was used for unpaired comparisons. The p-values are indicated in the figure legends. Plots were generated using GraphPad Prism version 9.1.2 (GraphPad Software Inc., USA).

Sequence alignment of VOCs' SARS-CoV-2 spike proteins
The entire proteome of the reference SARS-CoV-2 Wuhan isolate as well as of the Alpha, Beta, Gamma, Delta and Omicron variants of concern (VOC) was downloaded from the ViralZone site (https://viralzone.expasy.org). The protein sequences covering the entire Spike protein (1,273 aa) were selected and aligned to identify identities and differences between the 6 VOCs. Most of the mutations are located in the first 650 positions, corresponding to the S1 region, which includes the most antigenic regions of the spike protein. In particular, indels are found only in the first 250 positions at the NH2 terminus of the Alpha, Beta, Delta and Omicron proteins (Supplementary Table 2). The identity matrix shows that the Omicron sequence is the most divergent from all other VOCs. Indeed, while the VOCs' sequence homology is, on average, 98.74% (98.52 - 98.94), the Omicron sequence has an average homology with all other VOCs of 97.3% (97.08 - 97.4) (Supplementary Table 3).
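A pairwise identity matrix of the kind summarized above can be sketched in Python as follows. The sequences are short hypothetical strings, not real spike sequences, and the gap convention (columns gapped in both sequences are ignored, a gap against a residue counts as a mismatch) is an assumption, since the text does not state how indels were scored.

```python
from itertools import combinations


def percent_identity(a: str, b: str) -> float:
    """Percent identity between two rows of the same alignment.

    Assumption: columns where both rows carry a gap ('-') are ignored;
    a gap against a residue counts as a mismatch.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(x, y) for x, y in zip(a, b) if not (x == "-" and y == "-")]
    if not columns:
        raise ValueError("no comparable columns")
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)


def identity_matrix(aligned: dict) -> dict:
    """Symmetric matrix of pairwise percent identities, keyed by name pairs."""
    matrix = {(name, name): 100.0 for name in aligned}
    for n1, n2 in combinations(aligned, 2):
        pid = percent_identity(aligned[n1], aligned[n2])
        matrix[(n1, n2)] = matrix[(n2, n1)] = pid
    return matrix


# Hypothetical toy alignment (not real spike sequences).
toy = {"ref": "ACDEFGHIKL", "var1": "ACDEFGHIKV", "var2": "ACD-FGHIKV"}
for (n1, n2), pid in sorted(identity_matrix(toy).items()):
    print(f"{n1} vs {n2}: {pid:.1f}%")
```

Averaging each row of such a matrix (excluding the diagonal) gives a per-variant mean identity comparable to the figures reported for Omicron and the other VOCs.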

Peptide prediction in the SARS-CoV-2 spike protein
In order to verify whether the observed mutations were reflected in the predicted HLA-A*02:01-associated epitopes, the individual spike proteins were analyzed with the NetMHCpan software. The analysis returned 18 predicted strong binding peptides (SB), all shared among the VOCs with the exception of one identified only in the spike of the Omicron variant (199 - KIYSKHTPV - 207) and one in the spike of the Delta variant (941 - SALGKLQNV - 949) (Supplementary Table 4; Figure 1A). Some of the shared predicted strong binding peptides are characterized by a different linear aa sequence, which has an impact on the predicted affinity to the HLA-A*02:01 allele. In particular, at position 417, while the peptides derived from the Wuhan, Alpha and Delta VOCs have an affinity of 23.08 nM, the one derived from the Gamma VOC has an affinity of 69.17 nM and the ones derived from the Beta and Omicron VOCs have an affinity of 110.85 nM. Similarly, at position 976, while the peptides derived from all other VOCs have an affinity of 22.77 nM, the one derived from the Alpha VOC has an affinity of 12.95 nM (Figure 1B). Regarding the predicted weak binding peptides (WB), the analysis returned 39 peptides, of which 25 are shared among all the VOCs and 14 are identified in single or multiple spikes of the variants (Supplementary Table 5; Figure 1C). Overall, 7 new predicted WBs have been identified in VOCs other than Wuhan. Of these, 1 is shared at position 492, 3 are unique to the Omicron, 2 unique to the Alpha and 1 unique to the Beta variants (Supplementary Table 6). Interestingly, the Delta variant does not show any new predicted WB. All VOCs lose the predicted WB at position 612 of the Wuhan variant (Figure 1C). The predicted shared WBs, even though they are characterized by a different linear aa sequence, show identical predicted affinity to the HLA-A*02:01 allele. The only exception is represented by the peptide at position 610, which shows a significant improvement in affinity in all VOCs compared to the Wuhan variant (733.09 vs. 1342.97 nM) (Figure 1D). For all the subsequent analyses, only SBs (n = 12) and WBs (n = 4) with affinity <100 nM have been further considered (Supplementary Tables 4 and 5).
In order to validate the 16 selected SB+WB epitopes, the same prediction analysis was performed using three additional algorithms (ANN 4.0, SMM and MHCflurry 2.0). The results showed that 2 out of 16 epitopes were predicted only by NetMHCpan 4.1 and 12 of them were predicted by all four algorithms (75%) (Figure 2; Supplementary Table 7).

T cell reactivity to predicted spike epitopes
We have recently assessed T cell reactivity against 59 epitopes predicted in the Spike protein associated with different HLA alleles, covering both the S1 and S2 subunits. In particular, 11 were associated with HLA-A*02:01 and only three were located in the S1 subunit, namely VTWFHAIHV at position 62 and YLQPRTFLL at position 269 in the NTD region, and VLSFELLHA at position 512 in the RBD region. The results show that a long-term CD8+ T cell memory response is observed for 23 peptides and that the strongest immunodominance is found for the NYNYLYRLF (HLA-A*24:02) and YLQPRTFLL (HLA-A*02:01) epitopes (20).
In the present analysis, we have found that the latter has the highest affinity to the HLA-A*02:01 allele (4.3 nM), supporting the previous finding on the immunodominance.In order to assess the T cell response to an additional SB epitope associated to the HLA-A*02:01 allele within the S1 subunit, we selected the KIADYNYKL SB peptide at position p417 in the RBD region (23.08 nM).
An IFN-γ ELISpot assay was performed on PBMCs from all vaccinated healthcare workers, comparing the T cell reactivity to the KIADYNYKL peptide and to the immunodominant YLQPRTFLL peptide, using the LTDEMIAQY peptide as the negative control.
All subjects had variable levels of circulating CD8+ T cells reacting mainly to the YLQPRTFLL peptide (0-356 SFU, with an average of 87.9), in agreement with our previous finding. On the contrary, the reactivity to the additional KIADYNYKL RBD peptide did not differ from the one to the negative control (mean 9.8 vs. 10.8 SFU) (Figures 3A, B; Supplementary Figure 1). Such results were confirmed by an MHC class I tetramer staining, which showed a percentage of circulating CD8+ T cells specific for the YLQPRTFLL peptide significantly higher compared to KIADYNYKL (average 1.23% and 0.18%, respectively) (Figure 3C).

Identification of spike peptides homologous TAA epitopes
In order to verify whether such strong T cell reactivity elicited by the YLQPRTFLL peptide could provide a preventive protection against cancer development, we searched for sequence homology with all published tumor associated antigens (TAAs) at the Cancer Antigenic Peptide database.The BLAST search did not return significant homology with any known TAA.We subsequently searched for all other 56 SB and WB peptides predicted in the Spike protein and 10 partial homologies were returned.Of these, five were in the SP1 subunit along the sequence (two in the NTD, two in the RBD and one in the SD1/SD2 regions), and five in the SP2 subunit.Only the latter were classified as SBs, corresponding to CSNK1A1, GnTV, CLPP, Tyrosinase, STEAP1.Among these 10 peptides, 6 are common to all the VOCs, 1 is only in the Beta, 3 in the Omicron and 1 in all but the Wuhan (Table 1).
Given that the core region of the nonamer bound to the HLA molecule is the one prevalently involved in the interaction with the α and β chains of the TCR, we analyzed this region (p3 - p7) for each of the paired peptides. In particular, we focused on the predicted SBs. The SP2.1/CSNK1A1 pair shows identity only in p3, similar chemical properties in p6 and p7, and different properties in p4 and p5. The SP2.2/GnTV pair shows identity in p4 and p6, similar chemical properties in p5, and different properties in p3 and p7. The SP2.3/CLPP pair shows identity in p3 - p5, similar chemical properties in p7, and different properties in p6. The SP2.4/Tyrosinase pair shows identity only in p4 and p5, similar chemical properties in p3 and p7, and different properties in p6. The SP2.5/STEAP1 pair shows identity only in p3 and similar chemical properties in p4 - p7 (Table 1). Binding to HLA-A*02:01 was experimentally confirmed using the TAP-deficient T2 cells (Supplementary Figure 2).
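The per-position comparison of the p3 - p7 core can be illustrated with a short Python sketch. The grouping of amino acids into nonpolar, polar, acidic and basic classes is an assumption made here for illustration (the text does not state which chemical classification was used), and the function names are hypothetical. The sequences are those quoted in the conformational analysis for three of the pairs.

```python
# Assumed coarse physico-chemical classes (illustrative, not from the paper).
GROUPS = {
    "nonpolar": set("AVLIMFWPG"),
    "polar": set("STNQYC"),
    "acidic": set("DE"),
    "basic": set("KRH"),
}


def group_of(residue: str) -> str:
    """Return the assumed chemical class of a one-letter amino acid code."""
    for name, members in GROUPS.items():
        if residue in members:
            return name
    raise ValueError(f"unknown residue: {residue}")


def compare_core(viral: str, taa: str, start: int = 3, end: int = 7) -> dict:
    """Label each core position (1-based, p3-p7 by default) of a peptide pair."""
    labels = {}
    for pos in range(start, end + 1):
        x, y = viral[pos - 1], taa[pos - 1]
        if x == y:
            labels[f"p{pos}"] = "identical"
        elif group_of(x) == group_of(y):
            labels[f"p{pos}"] = "similar"
        else:
            labels[f"p{pos}"] = "different"
    return labels


print(compare_core("VLNDIFSRL", "VLPDVFIRV"))  # SP2.2 / GnTV
print(compare_core("RLDKVEAEV", "ILDKVLVHL"))  # SP2.3 / CLPP
print(compare_core("HLMSFPQSA", "LLWSFQTSA"))  # SP2.4 / Tyrosinase
```

With this grouping the labels agree with the description given in the text for the three pairs shown; a different classification scheme could shift individual positions between "similar" and "different".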

Conformational and interaction analysis of the paired epitopes
In order to further evaluate the conformational similarity of the paired peptides, bioinformatics analyses were performed by molecular docking of both peptides together with the HLA and the TCR molecules. The results showed that three of the peptides from the spike's SP2 region share a very high conformational similarity to TAAs. In particular, the SP2.2 - VLNDIFSRL 973-OM is highly homologous to the GnTV - VLPDVFIRV; the SP2.3 - RLDKVEAEV 983-ALL is highly homologous to the CLPP - ILDKVLVHL; the SP2.4 - HLMSFPQSA 1048-ALL is highly homologous to the Tyrosinase - LLWSFQTSA. Indeed, the comparison analysis of the spike peptides with the homologous TAAs showed highly similar conformation and contact areas to the HLA molecule as well as to the α and β chains of the TCR (Figures 4A-C). On the contrary, all other peptides from the SP1 and SP2 regions showed poor conformational similarity to TAAs (Figure 4D; Supplementary Figures 3-4).
Such observations were confirmed by the analysis of the paired peptides positioned in the HLA peptide-binding groove (Figure 5).Indeed, while the SP2.2, 2.3 and 2.4 peptides and the paired TAAs show very high similarity (Figures 5A-D, E, G), the SP1.2 is confirmed to be significantly different from the paired IL-13 TAA in the interaction with the HLA peptide-binding groove.This is particularly obvious looking at the footprint of the two peptides on the HLA groove (Figures 5F, H; Supplementary Figures 5-8).
The analyses of the interaction between the paired peptides and the α and β chains of the TCR provided a further perspective. As reasonably predictable, none of the paired peptides show identical footprints on the TCR chains (Figure 6). However, the SP2.2 and SP2.4 peptides and the paired TAAs show minor differences only in the contact areas of the p7 residue (S vs. I for the SP2.2/GnTV pair; Q vs. T for the SP2.4/Tyrosinase pair) with the G100 and G101 residues of the β chain (Figures 6A, G). The SP2.3/CLPP pair shows more pronounced differences in the contact areas of the p7 and p8 residues (AE vs. VH) with the L98, G100 and G101 residues of the β chain (Figure 6D). No differences in the contact areas with the α chain of the TCR were observed for the three pairs. On the contrary, the less homologous SP1.2/IL13 pair, in addition to the differences observed for the other peptides, shows a major difference in the contact area of the p1 residue (G vs. W) with the G28, S29 and Q30 residues of the α chain of the TCR (Figure 6H; Supplementary Figures 9-12).

T cell cross-reactivity to paired peptides
In order to evaluate whether the paired Spike and TAA peptides with high sequence and conformational homologies were recognized by cross-reacting CD8+ T cells, an MHC I Dextramer complex staining was performed. PBMCs from 7 vaccinated donors were stimulated ex vivo for 3 days with the Spike peptide and then incubated with MHC I Dextramer complexes loaded with either the same viral peptide or the homologous TAA. The results showed a substantial percentage of circulating reacting CD8+ T cells specific to the SP2.2, 2.3 and 2.4 spike peptides, with average values not statistically different (1.04%, 0.8% and 0.8% on average, respectively) (Figure 7A). A lower but significant percentage of circulating reacting CD8+ T cells specific to the homologous TAA peptides was observed. Also in this case, the average values were not statistically different (0.4%, 0.48% and 0.38% on average, respectively) (Figure 7B). Interestingly, CD8+ T cell cross-reactivity was observed for the three paired peptides, with average values of 0.11 - 0.14% (Figure 7C). The T cell reactivity against the individual peptides as well as the cross-reactivity showed a relevant inter-subject variability. In general, no cross-reactivity was observed in samples showing a percentage of CD8+ T cells reacting against either the Spike or the TAA lower than 0.2%. Reactivity against the negative control (namely a scrambled peptide) was between 0.0% and 0.05% in all tested subjects (0.027% on average; ± 0.0023 standard deviation). For all subjects the response to each SARS and TAA peptide was statistically significant compared to the natural background (p <0.05).

TCR α and β chains of reacting T cells
The VDJdb database was interrogated to search for TCR α and β chains (CDR3 regions) already identified to bind the spike and the homologous TAA peptides. The search returned results only for two spike peptides, namely the SP2.3 and 2.4, and one of the TAA peptides, namely the TYR homologous to SP2.4. While a single α and β CDR3 region has been identified to bind the SP2.3 peptide, at least 4 have been found to bind the SP2.4 peptide. Likewise, the homologous TYR TAA has been described to be bound by 6 independent α and β CDR3 regions (https://vdjdb.cdr3.net/search) (Table 2). As expected, none of the combinations binding the individual peptides is identical. However, it is of interest that the same J regions of the β CDR3 (TRBJ2-1 and 2-3) are identified in the TCRs binding both the SP2.4 and the homologous TYR TAA. The sequences of the CDR3 are not identical, but the consensus shows high conservation at the NH2- and COOH-termini of the sequence (Figures 8A, B). On the contrary, the J regions of the α chain (TRAJ) do not show any conservation among the TCRs binding the paired epitopes, except for the canonical residues C1-A2 and F17 at the NH2- and COOH-termini of the sequence, respectively (Figure 8C) (37).

Peptide prediction in the SARS-CoV-2 proteins outside the spike
Considering that most of the individuals, even if vaccinated with the spike protein, have been and are possibly exposed to the infectious virus, a T cell response against all possible epitopes in the other viral proteins can be elicited.Therefore, all other protein sequences of the SARS-CoV-2 VOCs were analyzed with the NetMHCpan software, to predict potential HLA-A*02:01-associated epitopes.
The analysis returned a variable number of predicted strong and weak binding peptides (SB and WB), according to the length of each protein sequence. Considering the two extremes, the ORF1ab (7096 aa) and the ORF10 (38 aa), the first showed the highest number of both SB (101) and WB (61) and the latter the lowest number of both SB (1) and WB (0). Overall, 98% of the predicted binding peptides are shared among the VOCs, with the exception of two SBs identified in the ORF1ab (FLARGVVFM in Omicron, TIIQTIVEV in Alpha) and one WB also identified in the ORF1ab (TIWFLLLSV in Alpha) (Table 3, Supplementary Tables 8, 9). More than 35% of the SBs are predicted to have an extremely high affinity (<10 nM) and 77% of these are predicted in the ORF1ab; 48% are predicted to have a very high affinity (10-50 nM) and 81% of these are predicted in the ORF1ab; 17% are predicted to have a high affinity (50-100 nM) and 90% of these are predicted in the ORF1ab (Figure 9). In order to validate all the selected SB+WB epitopes in non-spike proteins, the same prediction analysis was performed using the three additional algorithms. The results showed that between 80 and 100% of the epitopes predicted by NetMHCpan 4.1 were predicted also by all four algorithms (Figure 10; Supplementary Table 10).
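The three affinity classes used above can be expressed as a small Python sketch. The handling of values falling exactly on a boundary (10, 50 or 100 nM) is an assumption, as the text does not specify it; the example values are the predicted spike affinities quoted earlier in the Results, used here only to exercise the function.

```python
from collections import Counter


def affinity_class(ic50_nm: float) -> str:
    """Bin a predicted affinity (nM) into the classes used in the text.

    Assumption: each lower bound is inclusive and each upper bound exclusive.
    """
    if ic50_nm < 0:
        raise ValueError("affinity must be non-negative")
    if ic50_nm < 10:
        return "extremely high"
    if ic50_nm < 50:
        return "very high"
    if ic50_nm < 100:
        return "high"
    return "above cutoff"


def class_percentages(affinities) -> dict:
    """Percentage of peptides falling in each affinity class."""
    if not affinities:
        raise ValueError("no affinities given")
    counts = Counter(affinity_class(v) for v in affinities)
    return {label: 100.0 * n / len(affinities) for label, n in counts.items()}


# Predicted spike affinities (nM) quoted earlier in the Results.
examples = [4.3, 12.95, 22.77, 23.08, 69.17, 110.85]
for value in examples:
    print(f"{value} nM -> {affinity_class(value)}")
print(class_percentages(examples))
```

Grouping the same labels by source protein would give the per-protein breakdown reported for the ORF1ab.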

Homology of SARS peptides to TAA epitopes
The SB and WB peptides predicted in the SARS-CoV-2 proteins, outside of the spike protein, were aligned to all published TAAs for sequence homology. The BLAST search returned significant homology with several known TAAs, mostly in the ORF 1ab, with not less than 4/ [...] 4). Binding to HLA-A*02:01 was experimentally confirmed using the TAP-deficient T2 cells (Supplementary Figure 13). The comparison analysis of the non-spike peptides with the homologous TAAs showed highly similar conformation and contact areas to the HLA molecule as well as to the α and β chains of the TCR (Figures 11A-D). On the contrary, all other peptides showed poor conformational similarity to TAAs (Supplementary Figures 14-19).
Such observations were confirmed by the analysis of the paired peptides positioned in the HLA peptide-binding groove, in particular looking at the footprint of the paired peptides on the HLA groove (Figure 12; Supplementary Figures 20-31).
The analyses of the interaction between the paired peptides and the α and β chains of the TCR provided a further perspective. As reasonably predictable, none of the paired peptides show identical footprints on the TCR chains (Figure 13). However, the ORF1ab and the paired Glypican and MDK TAAs show minor differences only in the contact areas with the G100 and G101 residues of the β chain (Figures 13, C, H). On the contrary, the ORF1ab and the Memb Glyc show more pronounced differences with the paired PRDX5 and Her-2 TAAs in the contact areas with the L98, G100 and G101 residues of the β chain (Figures 13D, G). No relevant differences in the contact areas with the α chain of the TCR were observed for the pairs (Figure 11H). Overall, more pronounced differences in the contact areas with both α and β chains of the TCR were observed in other epitope pairs (Supplementary Figures 32-43).

Figure caption (fragment): The CDR3 sequences of the α and β chains binding the SP2. [...]
Figure caption (fragment): Cumulative response in all subjects to individual epitopes and cross-reactivity is shown (D). n.s. means not statistically significant.
Table caption: Epitopes were predicted from the entire proteome of the SARS-CoV-2 VOCs (no spike). The numbers of total SBs and WBs are reported, indicating the number of those common to all VOCs and of those VOC-specific.
Figure caption: Compared prediction of SB+WB by all algorithms. The prediction of SB and WB non-spike epitopes predicted by NetMHCpan 4.1 (<100 nM) was assessed also with the indicated algorithms. Unique and shared predicted epitopes in the first ranking positions in all methods are indicated for all proteins.
Figure caption: Predicted epitopes in the SARS-CoV-2 proteome. The SBs predicted from the full SARS-CoV-2 proteome, except for the spike protein, are summed up. The percentage of SBs according to the affinity value (nM) is reported (A). The percentage of SBs in each subgroup is shown according to each of the viral proteins from which they are derived: <10 nM (B); >10<50 nM (C); >50 nM (D).
Table caption: The epitopes have been aligned to homologous TAAs, as identified via BLAST search. Aligned amino acid sequences are shown for each epitope pair. Green color indicates identity between the residues in the paired epitopes. The affinity for each peptide is reported with the tumor expressing the TAA.

T cell cross-reactivity to non-spike epitopes and TAAs
In order to evaluate whether the paired non-Spike and TAA peptides with high sequence and conformational homologies were recognized by cross-reacting CD8+ T cells, an MHC I Dextramer complex staining was performed in the same experimental setting described for the spike-derived peptides. Four paired epitopes were selected according to the conformational homology and the range of predicted affinity (9.8 - 108.16 nM). The results showed a substantial percentage of circulating reacting CD8+ T cells specific to the SARS-CoV-2 peptides, with average values not statistically different (0.99 - 1.34%). However, significant variability was observed among the tested subjects and, in particular, one showed a high percentage of reacting CD8+ T cells against all the SARS-CoV-2-derived epitopes (2.4 - 5.7%) (Figure 14).
A lower but significant percentage of circulating reacting CD8+ T cells specific to the homologous TAA epitopes was observed, with an average between 0.36% (Glypican) and 1.34% (MDK), but the differences between the groups do not reach statistical significance (Figure 9). Finally, a low level of CD8+ T cell cross-reactivity was observed for the four paired peptides, with average values of 0.06 - 0.16%, showing a direct correlation with the levels of reactivity against the individual peptides (Figure 14). Reactivity against the negative control (namely a scrambled peptide) was between 0.0% and 0.065% in all tested subjects (0.034% on average; ± 0.004 standard deviation). For all subjects the response to each SARS and TAA peptide was statistically significant compared to the natural background (p <0.05).

Total T cell cross-reactivity to SARS-CoV-2 epitopes and TAAs
To evaluate the cumulative effect induced by the epitopes derived from the entire SARS-CoV-2 proteome, we analyzed the total CD8+ T cell responses against the viral epitopes, their paired TAAs and the cross-reactivities.
The overall analysis confirmed that the percentage of CD8+ T cells reacting to each individual epitope (SARS-CoV-2 and TAA) varies widely from subject to subject, ranging from <1% to 6%. Some of the subjects show a consistent reactivity to all SARS-CoV-2 or TAA epitopes, others a more focused one. Interestingly, selective high reactivity to TAAs, with MDK the highest, is observed (p<0.001). Overall, considering the sum of the reactivities of all subjects, the highest percentage is observed for the VLLSMQGAV ORF1ab/ALLALTSAV MDK and KLLEEWNLV MEMBR/RLLQETELV HER-2 epitope pairs, reaching >9% and ~6% of the total circulating T cells, respectively (Figure 15; Supplementary Figure 44).
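As a rough illustration of the sequence homology underlying the two top pairs, the position-wise identity of the paired 9-mers can be counted directly. This is a minimal sketch, not the BLAST-based analysis used in the study: the function name `positional_identity` is hypothetical, and only the peptide sequences are taken from the text.

```python
def positional_identity(a: str, b: str) -> tuple[int, str]:
    """Count identical residues at matching positions of two equal-length peptides.

    Returns the number of identities and a mask ('|' = identical, '.' = different).
    """
    if len(a) != len(b):
        raise ValueError("peptides must have the same length")
    mask = "".join("|" if x == y else "." for x, y in zip(a, b))
    return mask.count("|"), mask


# Epitope pairs as named in the text (viral peptide, paired TAA peptide).
pairs = {
    "ORF1ab / MDK": ("VLLSMQGAV", "ALLALTSAV"),
    "MEMBR / HER-2": ("KLLEEWNLV", "RLLQETELV"),
}

for label, (viral, taa) in pairs.items():
    n_identical, mask = positional_identity(viral, taa)
    print(label)
    print(f"  {viral}")
    print(f"  {mask}")
    print(f"  {taa}  ({n_identical}/{len(viral)} identical positions)")
```

In both pairs the identical residues include positions 2 and 9, which are generally regarded as the primary anchor positions for HLA-A*02:01 binding. A simple identity count of this kind ignores conservative substitutions and structural context, which is why the study also relies on the docking analysis.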

Discussion
In the present study we screened all the possible MHC-I T cell epitopes in the entire SARS-CoV-2 proteome predicted to bind the HLA-A*02:01 allele and their homology to tumor-associated antigens (TAAs). The ultimate goal is to assess whether the T cell response elicited by vaccination and/or productive infection with SARS-CoV-2 may represent a natural priming to cancer antigens, to be promptly recalled and boosted in case of tumor growth.
The analysis was first focused on epitopes derived from the Spike protein, to assess the role of the vaccination. Most of them are shared among all the VOCs, with only a few identified in variants different from the Wuhan isolate, suggesting that all the vaccines implemented around the world (based on the Wuhan isolate) have potentially elicited the broadest spectrum of T cell responses. Regardless of all the predicted strong and weak binding epitopes (18 SB and 39 WB) in the spike protein, we confirmed, as previously reported by us and others, that the most immunogenic epitope in the context of the HLA-A*02:01 allele is YLQPRTFLL. The BLAST search did not return any homology between this peptide and known TAAs, implying that the strong anti-SARS-CoV-2 T cell response elicited by the vaccination is unlikely to represent an immunological priming potentially protective against tumor growth. Considering the remaining 56 predicted SB and WB peptides in the spike protein, 10 partial homologies with TAAs were found, equally distributed in the SP1 and SP2 subunits. Four of these homologies were associated with the Beta (one) and Omicron (three) VOCs only, suggesting that infection with such VOCs may elicit a broader priming against tumor antigens. Obviously, this broadening effect does not apply to vaccinated individuals, who have been immunized with the Wuhan variant only. Such homology between the paired epitopes has been further corroborated by molecular docking analysis, which showed a very high similarity in the conformation of the peptides and in the pattern of contact with the HLA molecule as well as the α and β chains of the TCR. The reactivity of T cells from healthy subjects to TAAs and, even more, the cross-reactivity of T cells to both epitopes provided the ultimate confirmation of the molecular mimicry between the viral and tumor antigens. The CDR3 regions of the TCR α and β chains binding the HLMSFPQSA SP2.4 and the paired LLWSFQTSA TYR peptides have been identified and are publicly available. There is no shared sequence among the CDR3s binding the two epitopes, although the consensus shows high conservation at the NH2- and COOH-termini. Nevertheless, considering the widely reported broad variability in the CDR3 sequences binding the very same epitope, the findings of the present study cannot exclude that some of these CDR3s are indeed able to bind both epitopes.
The same pipeline of analysis has been applied to all other SARS-CoV-2 proteins. Indeed, vaccinated subjects may also be infected by the replicating virus and therefore primed by any peptide derived from the viral proteome. The total number of predicted peptides is 205 (124 SB and 81 WB); 81.5% of the SBs and 75.3% of the WBs are predicted in the ORF1ab, which, indeed, represents 73% of the entire viral proteome. Homology with TAAs was identified for 26 SBs and 5 WBs among the SARS-CoV-2 epitopes, and 25/31 of them (80.6%) were from the ORF1ab. Two of the ORF1ab-derived (LLLDDFVEI, VLLAPLLSA) and one of the Nucleocapsid-derived (LLLDRLNQL) epitopes show sequence homology to two different TAAs. Likewise, some of the TAAs show homology to more than one SARS-CoV-2 epitope, namely the LLLDDLLVS PRDX5 (5 homologies) and the ALLALTSAV MDK (4 homologies). For these paired epitopes too, the sequence homology has been further corroborated by molecular docking analysis, which showed a very high similarity in the conformation of the peptides and in the pattern of contact with the HLA molecule as well as the α and β chains of the TCR. As expected, all the paired SARS-CoV-2 peptides (spike and non-spike) and TAAs show high E-values, confirming that they do not have any evolutionary relationship. Indeed, they are the result of a random event with a likely relevance in protection from cancer development, progression and final clinical outcome. Moreover, the RMSD values are all <1, which is considered proof of highly similar structures. The top paired peptides were experimentally confirmed to bind HLA-A*02:01 using TAP-deficient T2 cells.
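The RMSD criterion mentioned above can be made concrete with a short sketch. It assumes that the two peptide structures have already been superimposed and that matched atoms are listed in the same order; the coordinates are invented placeholders, not values from the docking analysis, and `rmsd` is a hypothetical helper rather than the tool used in the study.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two pre-aligned, equally ordered atom lists.

    Each argument is a sequence of (x, y, z) tuples; the result is in the same
    units as the input coordinates.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Placeholder coordinates (NOT from the study): peptide B is peptide A
# shifted by 0.5 along the z axis.
peptide_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
peptide_b = [(x, y, z + 0.5) for x, y, z in peptide_a]

value = rmsd(peptide_a, peptide_b)
print(f"RMSD = {value:.2f}; below the <1 cutoff: {value < 1.0}")
```

In practice the superposition step (for example a least-squares fit) is what makes RMSD values comparable between pairs, so a function like this would only be the last step of such a comparison.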
Moreover, the reactivity of T cells from healthy subjects to SARS-CoV-2 epitopes and TAAs and, even more, the cross-reactivity of T cells to both epitopes provided the ultimate confirmation of the molecular mimicry between these viral and tumor antigens. As expected, the reactivity of cells from each subject to each epitope is significantly variable with completely different patterns, indicating a highly individual responsiveness. Nevertheless, the cumulative analysis shows a consistent reactivity to all SARS-CoV-2 and TAA epitopes, with an average of 6.77% and 3.81% of reacting CD8+ T cells, respectively. Furthermore, the percentage of cross-reactive T cells is, on average, 0.81%. In particular, the highest percentage is observed for the VLLSMQGAV ORF1ab/ALLALTSAV MDK and KLLEEWNLV MEMBR/RLLQETELV HER-2 epitope pairs, reaching >9% and ~6% of the total circulating T cells, respectively. For these pairs, the cross-reactive CD8+ T cells reach 1% of the total circulating T cells, confirming that the higher the reactivity for the individual peptides, the higher the cross-reactivity.
Overall, in accordance with the proposed mechanism of molecular mimicry in cancer patients infected by SARS-CoV-2 (38), the findings of the present study describe for the first time that significant homology is found between several epitopes derived from the entire SARS-CoV-2 proteome and the whole list of known TAAs. The present study is limited to the HLA-A*02:01 haplotype, which is present in about 40-50% of the Italian and Western general population and, therefore, has a great impact. A much larger observation group will allow this result to be generalized to many other haplotypes.
The level of sequence and conformational homology, together with the percentages of CD8+ T cells reacting with the individual epitopes and/or cross-reacting with both of them, has potentially important implications. Indeed, this suggests that the T cell response elicited during the natural infection by one or multiple SARS-CoV-2-derived epitopes may prime the immune system against several TAAs. The predicted epitopes showing molecular mimicry with TAAs are mostly (>95%) common to all VOCs of SARS-CoV-2, implying that the individual variants provide a very limited increase in protective effect.
Therefore, SARS-CoV-2 infection may represent a "natural anti-cancer vaccination" eliciting a memory T cell compartment able to cross-react with cancer cells and providing protection from cancer development and progression. Such a protective effect will be evaluated in the coming years by epidemiological studies assessing the incidence of several tumor types, including breast, liver, melanoma and colon cancers. Consequently, the viral antigens homologous to TAAs may be used for developing "multi-cancer" off-the-shelf preventive/therapeutic vaccine formulations, with higher antigenicity and immunogenicity than over-expressed tumor self-antigens, for the potential valuable benefit of thousands of cancer patients around the world.

Funding
The author(s) declare that financial support was received for the research, authorship, and/or publication of this article. The study was funded by the Italian Ministry of Health through Institutional "Ricerca Corrente" (L2/3 LB and L2/13 MT); POR FESR 2014/2020 "Campania OncoTerapie" (LB). BCe and SM are funded by the Italian Ministry of Health through the PNRR-POC-2022-12375769 project (LB).

FIGURE 1 Predicted epitopes in the SARS-CoV-2 Spike protein. (A) Position of the epitopes predicted to be Strong Binders (SBs) and indication of the VOC in which they have been found. (B) Affinity value is indicated for each of the predicted SBs. (C) Position of the epitopes predicted to be Weak Binders (WBs) and indication of the VOC in which they have been found. (D) Affinity value is indicated for each of the predicted WBs. The star indicates the epitopes for which T cell reactivity has been previously published.

FIGURE 2 Compared prediction of SB+WB by all algorithms. The prediction of SB and WB spike epitopes by NetMHCpan-4.1 (<100 nM) was also assessed with the indicated algorithms. Unique and shared predicted epitopes in the first 16 positions in all methods are indicated.

FIGURE 3 Epitope-specific CD8+ T cell clones for epitopes derived from the SARS-CoV-2 Spike protein. (A) ELISpot assay. PBMCs from 20 HLA-A*02:01-positive vaccinated donors were stimulated overnight with the specific HLA-A*02:01 predicted Spike epitopes. SFU = IFN-γ spot forming units. (B) ELISpot assay for a single subject. (C) Box plot showing, in a single individual, the higher presence of CD8+ T cells specific for YLQPRTFLL than for KIADYNYKL after pMHC staining.

FIGURE 5 Molecular mimicry of spike epitopes and TAAs: interaction with the HLA molecule. The paired epitopes are shown bound to the HLA-A*02:01 molecule. The TCR-facing residues are presented to the TCR αβ chains with the same conformation (A, B, E, F). The footprint of the paired epitopes on the HLA molecule is shown and the contact points are highlighted in yellow (C, D, G, H).

FIGURE 6 Molecular mimicry of spike epitopes and TAAs: interaction with the TCR molecule. The paired peptides are shown in contact with the TCR αβ chains (A, B, E, F). The footprint of the paired epitopes on the TCR αβ chains is shown and the contact points are highlighted in yellow (α chain) and in pink (β chain). For each pair, differences in the contact areas are circled (C, D, G, H).

FIGURE 7 T cell reactivity to spike epitopes and paired TAAs. PBMCs from HLA-A*02:01-positive vaccinated donors were analyzed by tetramer staining with the indicated epitopes. Results of reactivity to individual epitopes are shown (A, B, E, F). Results of cross-reactivity to both epitopes are shown (C, G). Cumulative response in all subjects to individual epitopes and cross-reactivity is shown (D). n.s., not statistically significant.

FIGURE 8 Consensus CDR3 sequences of the αβ chains reacting to paired epitopes. The TRAV/TRAJ and the TRBV/TRBJ sequences described in literature to bind the SP2.4 or the TYR epitopes have been piled up to generate the SeqLogo. SeqLogos from the β CDR3 TRBJ2-1 (A) and 2-3 (B) binding both epitopes are indicated. SeqLogo from the α CDR3 TRAJ (C) binding both epitopes is indicated. The height of the letters indicates the frequency of the specific amino acid residue in that specific position of the epitope.

Predicted 3D conformation of non-spike peptides and paired TAAs. The conformation of the SARS-CoV-2-derived epitopes (ORF and MEMB) and paired TAAs bound to the indicated HLA-A*02:01 molecule is shown. The paired epitopes are indicated in each box (A-D). The prediction was performed using as template the publicly available crystallized structure (PDB https://www.rcsb.org/structure/1AO7). Green areas = contact points with the HLA molecule; light blue areas = contact points with the TCR α chain; violet areas = contact points with the TCR β chain.

FIGURE 12 Molecular mimicry of non-spike epitopes and TAAs: interaction with the HLA molecule. The paired epitopes are shown bound to the HLA-A*02:01 molecule. The TCR-facing residues are presented to the TCR αβ chains with the same conformation (A, B, E, F). The footprint of the paired epitopes on the HLA molecule is shown and the contact points are highlighted in yellow (C, D, G, H).

FIGURE 13 Molecular mimicry of non-spike epitopes and TAAs: interaction with the TCR molecule. The paired peptides are shown in contact with the TCR αβ chains (A, B, E, F). The footprint of the paired epitopes on the TCR αβ chains is shown and the contact points are highlighted in yellow (α chain) and in pink (β chain). For each pair, differences in the contact areas are circled (C, D, G, H).

TABLE 1
Sequence homology between spike epitopes and TAAs.

TABLE 2
CDR3 sequences of the α and β chains binding spike and TAA epitopes.

TABLE 3
Strong (SB) and weak (WB) binders identified in the entire proteome (no spike).