ORIGINAL RESEARCH article
Sec. Virus and Host
Volume 12 - 2022 | https://doi.org/10.3389/fcimb.2022.868205
How concerning is a SARS-CoV-2 variant of concern? Computational predictions and the variants labeling system
- 1Department of Life Sciences, Health Biotechnology Program - King Fahad Chair for Health Biotechnology, College of Graduate Studies, Arabian Gulf University, Manama, Bahrain
- 2Department of Family and Community Medicine, College of Medicine and Medical Sciences, Arabian Gulf University, Manama, Bahrain
In this study, we evaluated the use of a predictive computational approach for SARS-CoV-2 genetic variations analysis in improving the current variant labeling system. First, we reviewed the basis of the system developed by the World Health Organization (WHO) for the labeling of SARS-CoV-2 genetic variants and the derivative adapted by the United States Centers for Disease Control and Prevention (CDC). Both labeling systems are based on the virus’ major attributes. However, we found that the labeling criteria of the SARS-CoV-2 variants derived from these attributes are not accurately defined and are used differently by the two agencies. Consequently, discrepancies exist between the labels given by WHO and the CDC to the same variants. Our observations suggest that giving the variant of concern (VOC) label to a new variant is premature and might not be appropriate. Therefore, we used a comparative computational approach to predict the effects of the mutations on the virus structure and functions of five VOCs. By linking these data to the criteria used by WHO/CDC for variant labeling, we ascertained that a predictive computational comparative approach of the genetic variations is a good way for rapid and more accurate labeling of SARS-CoV-2 variants. We propose to label all emergent variants, variant under monitoring or variant being monitored (VUM/VBM), and to carry out computational predictive studies with thorough comparison to existing variants, upon which more appropriate and informative labels can be attributed. Furthermore, harmonization of the variant labeling system would be globally beneficial to communicate about and fight the COVID-19 pandemic.
1 Introduction
The new coronavirus SARS-CoV-2 that emerged in late 2019 is still causing a pandemic of severe acute respiratory syndrome, or COVID-19. The pandemic has affected over 500 million people and claimed more than six million lives (Johns-Hopkins-University, 2022), thus posing a difficult challenge to the scientific and healthcare communities throughout the world (Malik et al., 2020; Tutelyan et al., 2020; World-Economic-Forum, 2020). Indeed, SARS-CoV-2 is a positive-sense RNA virus that is constantly evolving through the accumulation of various types of mutations (Zhao et al., 2020; Banoun, 2021; Majumdar and Niyogi, 2021; Singh et al., 2021). Even though the majority of these mutations do not affect the infectious properties of the virus and have no real impact on the progress of the pandemic (Ashoor et al., 2021), some may enhance specific viral attributes that give the virus a selective advantage (Gobeil et al., 2021). Any new variant endowed with selective advantage(s) would favor the persistence of the virus and sustain the pandemic. Therefore, monitoring and predicting how the pandemic evolves and communicating this to the public are of paramount importance. The global surveillance of the pandemic is based on multidisciplinary approaches including epidemiological, genetic, structural, and clinical data (Agency, U.H.S, 2021; Campbell et al., 2021; England, P.H, 2021; France, S.P, 2021). This involves the use of a set of relevant criteria to categorize the variants. Toward this end, all international and national health and sanitary authorities have set various strategies to control the evolution of the SARS-CoV-2 pandemic. On 31 May 2021, the World Health Organization (WHO) announced a labeling system to categorize the variants into different levels of priority to better organize the global monitoring and research, and ultimately organize the “infodemic” and communicate more effectively with the public about the adequate response to the emergence of new variants of SARS-CoV-2 (https://www.who.int/).
WHO first developed a system to facilitate the naming of SARS-CoV-2 variants, in addition to the existing nomenclature systems for naming and tracking SARS-CoV-2 genetic lineages established by GISAID (https://www.gisaid.org/), Nextstrain (https://nextstrain.org/), and Pango (https://cov-lineages.org/). These nomenclatures are mostly used by the scientific research community. For practical reasons (particularly to ease communication), WHO has chosen to name the emerging SARS-CoV-2 variants using the letters of the Greek alphabet in sequence (α, β, γ, δ…). Since SARS-CoV-2 shows high genetic variability (Toyoshima et al., 2020; Yazdani et al., 2021; Dubey et al., 2022), WHO has established a labeling system that sorts the variants into variant of concern (VOC), variant of interest (VOI), and variant under monitoring (VUM). This labeling system is based on definitions related to variant phenotypic attributes such as transmissibility, disease presentation, effect on current diagnostic tests, and response to available vaccines. This system was set to prompt and harmonize the actions needed to control the spread of a given variant. While the variants that emerged sequentially were named α, β, γ, and δ, the most recent one was named omicron. Of interest is that the United States Centers for Disease Control and Prevention (CDC) has also adopted this system of variant classification and labeling but added more labels and labeling criteria and a different policy for changing labels (https://www.cdc.gov/). The CDC keeps the labels VOC, VOI, and VUM (calling the latter “VBM” for Variant Being Monitored) and, in addition, uses an extra label, “Variant Of High Consequence” (VOHC). As a result, some of the currently known variants are given different labels by each agency. For example, according to the CDC, there is no SARS-CoV-2 variant labeled VOI as of December 1, 2021.
Furthermore, variants α, β, and γ that are currently labeled VOC by WHO have been deescalated to the VBM label by the CDC as of September 2021 (https://www.cdc.gov/). These discrepancies reflect different views on the labeling of SARS-CoV-2 variants and consequently the use of the labels to set public health actions. On the other hand, in both systems, variant labels can change with more data accumulating for a particular variant.
In this work, we have undertaken an evaluation of the system developed by WHO and adapted by the CDC to label the SARS-CoV-2 variants. We carried out a review of the classification criteria and analyzed how WHO and the CDC use these criteria to label the SARS-CoV-2 variants. Then, we used a comparative computational predictive approach to study the S protein mutations that characterize the five SARS-CoV-2 VOCs. We concluded that computational predictions provide a good ground of evidence for a rapid and more accurate labeling system.
2 Materials and methods
2.1 Data mining and information sources
We retrieved the genetic, epidemiological, and clinical data on the variants available as of 15 December 2021 from primary and secondary sources including the GISAID (https://www.gisaid.org/) and CoVariants (http://covariants.org) data banks. We collected the information on variant naming and labeling from the following sources: WHO: https://www.who.int/en/activities/tracking-SARS-CoV-2-variants/, CDC: https://www.cdc.gov/coronavirus/2019-ncov/variants/variant-info.html#anchor_1632154493691, and PANGO lineages.
2.2 Variant sequence retrieval and solvent exposure analysis
The amino acid sequence of the SARS-CoV-2 spike protein extracellular domain was obtained from the National Center for Biotechnology Information (NCBI: https://www.ncbi.nlm.nih.gov) under protein ID YP_009724390.1. Variant-specific mutations were introduced into this reference sequence based on the list published at https://covariants.org/. The sequences corresponding to the different variants (alpha, beta, gamma, delta, and omicron) were analyzed for solvent exposure and possible epitope residues using the Sequential B-Cell Epitope Predictor server (BepiPred-2.0 server: https://services.healthtech.dtu.dk/service.php?BepiPred-2.0). BepiPred-2.0 is based on a random forest algorithm trained on epitopes annotated from antibody–antigen protein structures (Jespersen et al., 2017).
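The step of introducing variant-specific mutations into a reference sequence can be sketched as follows. This is a minimal illustration rather than the procedure actually used in the study: the function name `apply_substitutions` and the toy sequence are hypothetical, and only simple substitutions written like N501Y are handled (deletions and insertions would need separate treatment).

```python
import re

# Simple substitution such as "N501Y": reference residue, 1-based position, new residue.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitutions(reference: str, mutations: list[str]) -> str:
    """Return a copy of `reference` with each amino acid substitution applied.

    Hypothetical helper for illustration. Positions are 1-based. A ValueError
    is raised when the mutation string is not a plain substitution or when the
    residue found in the reference does not match the one named in the mutation.
    """
    residues = list(reference)
    for mutation in mutations:
        match = _SUBSTITUTION.match(mutation)
        if match is None:
            raise ValueError(f"unsupported mutation format: {mutation}")
        ref_aa, position, new_aa = match.group(1), int(match.group(2)), match.group(3)
        if position < 1 or position > len(residues):
            raise ValueError(f"position out of range: {mutation}")
        if residues[position - 1] != ref_aa:
            raise ValueError(f"reference residue mismatch: {mutation}")
        residues[position - 1] = new_aa
    return "".join(residues)


# Toy sequence used only to exercise the function (assumption, not study data).
toy_reference = "MFVFLVLLPL"
toy_variant = apply_substitutions(toy_reference, ["F2Y", "L10S"])
```

Checking the reference residue before substituting guards against off-by-one numbering errors, which are easy to make when a mutation list is numbered against the full-length protein but the sequence in hand is a fragment.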
2.3 SARS-CoV-2 spike protein furin cut site loop modeling
For the loop modeling, the Phyre2 web server (http://www.sbg.bio.ic.ac.uk/phyre2/html/page.cgi?id=index) (Kelley et al., 2015) was used to generate 3D models of the extracellular spike monomers of the alpha, beta, gamma, delta, and omicron variants, and the results were saved and visualized in PyMOL (DeLano, 2002).
2.4 Model quality assessment
To assess the quality of the generated models, the crystallized model of the spike protein (PDB ID 6VXX, 2.80 Å, version 2.4) was downloaded from the RCSB database (https://www.rcsb.org). The structure was cleaned of water and heteroatoms, the complex was split, and a PDB file for a monomer chain (chain B) was created and saved using PyMOL software. This monomer chain B was used as the reference model for the sequence-generated models. To define the common contact map between the crystal structure and the generated models, CMView (Vehlow et al., 2011) was used with the following parameters: contact type, Cα; distance cutoff, 8.0 Å; and Needleman-Wunsch alignment. Contact maps were established between the crystal structure and each of the models, and the common contact percentage was calculated. A higher percentage of common contacts indicates greater structural similarity and hence that the models are suitable for further analysis. In addition, TM-align (https://zhanggroup.org/TM-align/) (Zhang and Skolnick, 2005) was used to calculate the TM-score for each model. The superimposition root mean square deviation (RMSD) was calculated using PyMOL. A low RMSD and a TM-score between 0.5 and 1.0 indicate that the two compared structures (the crystal and the model) have about the same fold.
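The idea behind the common contact percentage can be sketched in a few lines. This is an illustration under stated assumptions, not a reimplementation of CMView: the two structures are assumed to be already aligned residue by residue, the coordinates are toy values, and the percentage is normalized by the union of contacts, which may differ from the normalization CMView applies.

```python
import math

Coordinate = tuple[float, float, float]


def contact_map(ca_coords: list[Coordinate], cutoff: float = 8.0) -> set[tuple[int, int]]:
    """Return residue index pairs (i < j) whose C-alpha atoms lie within `cutoff` angstroms."""
    contacts = set()
    for i in range(len(ca_coords)):
        for j in range(i + 1, len(ca_coords)):
            if math.dist(ca_coords[i], ca_coords[j]) <= cutoff:
                contacts.add((i, j))
    return contacts


def common_contact_percentage(map_a: set, map_b: set) -> float:
    """Percentage of contacts present in both maps, relative to the union of contacts.

    The choice of the union as denominator is an assumption made for this sketch.
    """
    union = map_a | map_b
    if not union:
        return 0.0
    return 100.0 * len(map_a & map_b) / len(union)


# Toy C-alpha traces (assumption): four residues on a line, last one displaced in the model.
toy_crystal = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
toy_model = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (12.0, 0.0, 0.0)]

overlap = common_contact_percentage(contact_map(toy_crystal), contact_map(toy_model))
```

In the toy example the displaced residue breaks one of the five contacts, so four of five contacts are shared.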
2.5 Mutational analysis: the effect of mutation on the interaction with ACE2 receptor
To analyze the effect of the RBD mutations on the interaction of the different SARS-CoV-2 variants with the ACE2 receptor, the PDB ID 6LZG structure was used as a model. The effects of single and accumulated mutations were evaluated by calculating the change in binding affinity (ΔΔG) upon single or multiple mutations using the MutaBind2 server (https://lilab.jysw.suda.edu.cn/research/mutabind2/) (Zhang et al., 2020). In addition, the server provided a structural model that was used to analyze polar interactions with the LigPlot+ software (Laskowski and Swindells, 2011).
3 Results
3.1 Study of SARS-CoV-2 variant labeling system
Expert groups at WHO and the CDC use similar labeling systems to classify newly emerging SARS-CoV-2 variants. We retrieved a total of 24 different criteria from the working definitions elaborated by WHO and the CDC to give a particular label to a new variant. These criteria are derived from six viral attributes (Figure 1). Each attribute corresponds to a set of criteria that are formulated differently by each agency, and each agency considers different criteria in its working definition of each label. The two agencies use common (VOC and VOI) and different (VUM, VBM, and VOHC) labels. Each agency generally uses different combinations of criteria to give a variant a specific label, except in the five instances shown in Figure 2 where they use overlapping criteria for the same label (VOC and VOI). Consequently, there are discrepancies in the labels of currently active variants. While WHO is currently labeling variants α, β, γ, δ, and omicron as VOC, the CDC labels only the omicron variant as VOC, having declassified variants α, β, γ, and δ to VBM. In addition, there are no variants currently labeled VOI by the CDC. Figure 2 illustrates the differential use by WHO and the CDC of the viral attribute-derived criteria to label the SARS-CoV-2 genetic variants.
Figure 1 SARS-CoV-2 variant attributes and derived criteria used by the World Health Organization (WHO) and the Centers for Disease Control and Prevention (CDC) for the labeling of emergent variants.
Figure 2 Matrix showing the commonalities and discrepancies of the criteria used by WHO and the CDC to label SARS-CoV-2 variants. The green and yellow triangles indicate the criteria used by WHO and the CDC, respectively. The red squares indicate the criteria used by both agencies for the same label. The numbers correspond to the labeling criteria displayed in Table 1.
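The label discrepancies described in this section can be listed mechanically once each agency's labels are written down as a mapping. The sketch below encodes only the snapshot reported in the text above (WHO labeling all five variants VOC, the CDC labeling omicron VOC and the other four VBM); the function name `label_discrepancies` is a hypothetical helper introduced for illustration.

```python
# Labels as described in the text of this section (a snapshot, not a live data source).
who_labels = {"alpha": "VOC", "beta": "VOC", "gamma": "VOC", "delta": "VOC", "omicron": "VOC"}
cdc_labels = {"alpha": "VBM", "beta": "VBM", "gamma": "VBM", "delta": "VBM", "omicron": "VOC"}


def label_discrepancies(labels_a: dict[str, str], labels_b: dict[str, str]) -> dict[str, tuple[str, str]]:
    """Return the variants labeled by both agencies whose labels differ.

    Each value is the pair (label from the first mapping, label from the second mapping).
    """
    shared_variants = sorted(labels_a.keys() & labels_b.keys())
    return {
        variant: (labels_a[variant], labels_b[variant])
        for variant in shared_variants
        if labels_a[variant] != labels_b[variant]
    }


discrepancies = label_discrepancies(who_labels, cdc_labels)
```

With these inputs, four of the five variants carry different labels at the two agencies, and only omicron is labeled consistently.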
3.2 Attributes of the SARS-CoV-2 major variants of concern
3.2.1 Epidemiological impact and transmissibility
While epidemiological data are available for variants α, β, γ, and δ, data on the omicron variant are limited and incomplete. This is mainly because, when the variant was given the VOC label, not enough time had elapsed since its emergence to allow sufficient data accumulation and meaningful analysis. Table 1 summarizes the initial data available on the omicron variant as of 25 December 2021. For this variant, the early estimate of the increase in transmissibility is in the range of three to six. However, the earlier data on the transmissibility of the delta variant, which became dominant worldwide, together with the currently observed high spread and incidence of the omicron variant, make plausible the prediction that omicron is 100-fold more transmissible than the delta variant (Rao and Singh, 2021).
3.2.2 Disease presentation and impact on therapy and diagnostic
COVID-19 has three clinical presentation forms: mild, moderate, and severe. WHO and the CDC use four labeling criteria related to clinical presentation and four others pertaining to the impact of a given variant on therapy and diagnostics. The two agencies use different formulations for all of these criteria (Figure 1) and use different combinations of these criteria to label SARS-CoV-2 variants (see Figure 2). Neither WHO nor the CDC explicitly mentions the disease forms in the variant labeling criteria related to disease presentation.
3.2.3 Impact on immunity, vaccination, and public health measures
WHO and the CDC also use different formulations for the criteria related to these attributes and use them differently to attribute a specific label (Figures 1, 2). For instance, for WHO, immune escape is a criterion used to label a variant VOI and VOC. However, the CDC uses it for labeling a variant VOI, VBM, and VOHC. Omicron has been shown to have extensively but incompletely evaded the Pfizer BNT162b2 vaccine (Cele et al., 2022), thus fulfilling the criterion of decreased vaccine effectiveness. For the criterion related to the evaluation of an imminent risk to public health a variant can pose, only the CDC uses it for the VOHC label. Table 2 shows how the five major SARS-CoV-2 variants fulfill the criteria used by WHO and the CDC for variant labeling regardless of the actual labels given by each agency.
Table 2 Actual application of the criteria formulated by WHO and the CDC to the five major SARS-CoV-2 variants.
3.3 Computational approach for the prediction of SARS-CoV-2 viral attributes
The computational approach we used for the prediction of SARS-CoV-2 emerging variants viral attributes is described in Figure 3.
Figure 3 Outline of the computational approach used to predict the viral attributes of SARS-CoV-2 emerging variants. This two-step approach includes a number of computed tasks from which various predictions are retrieved. The predictions are linked to the viral attributes and the criteria that apply are used to label a given variant.
3.3.1 Mutation profile analysis
Figure 4 shows a comparative mapping of the mutation profile of the omicron variant with those of the alpha, beta, gamma, and delta variants. Most of the mutations affect the SARS-CoV-2 S protein. Omicron displays 63 different mutations as compared to the Wuhan strain. Thirty-six of these mutations occurred in the S protein, and 15 are clustered in the RBD region. The representation of the mutations in the S protein 3D models of the five SARS-CoV-2 VOCs shows that most of the mutations map to the solvent-exposed regions (Figure 5 and Supplementary Table 1).
Figure 4 Mapping of the nonsynonymous mutations characteristic of the five SARS-CoV-2 variants. A typical genomic organization of SARS-CoV-2 contains the following: the 5’ end UTR; the open reading frames ORF 1a and ORF 1b; the structural genes coding for the Spike (S) protein, the Envelope (E), the Membrane (M), and the Nucleocapsid; and accessory genes such as 3a, 3b, 6, 7a, 7b, 8, 10, and 14, which are distributed among the structural genes. The 3’ end UTR is followed by the poly(A) tail. The green, yellow, blue, purple, and red colors show, respectively, the nonsynonymous mutations characteristic of the alpha (α) variant (B.1.1.7), beta (β) variant (B.1.351), omicron (o) variant (B.1.1.529), gamma (γ) variant (P.1), and delta (δ) variant (B.1.617.2). The (-) represents a deletion, (°) represents an insertion, (*) represents a stop codon, magenta triangles indicate variations in the receptor-binding domain (RBD), and cyan triangles denote variations in the receptor-binding motif (RBM). The NCBI reference sequence for the surface glycoprotein of SARS-CoV-2 is YP_009724390.1.
Figure 5 Representation of the surface models of the S protein (monomer) of the SARS-CoV-2 variants. Side views (upper and middle rows) and top view (bottom row). Color coding is as follows: mutations (red), S1 subunit (cyan), S2 subunit (green), furin cleavage site (yellow), fusion peptides FP1 and FP2 (black), and TMPRSS2 cleavage site (orange, indicated by the arrows).
3.3.2 Effect of omicron S protein mutations on the immunogenicity
Non-synonymous mutations of the different SARS-CoV-2 variants caused changes in epitope probability and antibody exposure, and hence in immunogenicity. Most of the epitope changes are noticeable in the S1 domain of the spike protein. Figure 6 shows the percentage of exposed epitope residues in the different variants. The detailed probability and exposure state of each residue as predicted by BepiPred are listed in Supplementary Table 1.
Figure 6 Percentage of solvent-exposed residues in the S protein of the five SARS-CoV-2 VOCs. Solvent-exposed residues at the surface of the S protein are potential epitopes. Variation in the percentage of solvent-exposed residues provides a useful hint for the prediction of immunogenicity changes between variants.
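The percentage of solvent-exposed residues is a simple ratio once a per-residue exposure state is available. The sketch below assumes, purely for illustration, that the exposure states have been exported as one character per residue, with "E" for exposed and "B" for buried; this encoding, the function name `percent_exposed`, and the toy strings are assumptions and do not reproduce the server output or the values in Figure 6.

```python
def percent_exposed(exposure_states: str, exposed_symbol: str = "E") -> float:
    """Return the percentage of residues flagged as solvent exposed.

    `exposure_states` holds one character per residue (assumed encoding:
    "E" = exposed, "B" = buried).
    """
    if not exposure_states:
        raise ValueError("exposure_states must contain at least one residue")
    return 100.0 * exposure_states.count(exposed_symbol) / len(exposure_states)


# Toy exposure strings (assumption), standing in for a reference and a variant.
toy_reference_states = "EEBBEBEE"
toy_variant_states = "EEBBEBEB"

reference_pct = percent_exposed(toy_reference_states)
variant_pct = percent_exposed(toy_variant_states)
change_in_exposure = variant_pct - reference_pct
```

Comparing the percentage between a variant and a reference, rather than reading either value alone, mirrors the comparative use of these predictions described in the text.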
3.3.3 Effect of omicron S protein mutations on the 3D structure of the molecule: Loop modeling and quality assessment
Computational methods allow the building of accurate protein models of the SARS-CoV-2 S protein based on data input and alignment with multiple experimentally solved template molecules. We used the Phyre2 web server to generate 3D models of the extracellular domain of the monomer spike protein of the α, β, γ, δ, and omicron variants (Figure 7). The structures were generated with 100% confidence and 84% coverage for alpha and beta, and 83% coverage for gamma, delta, and omicron, by the single highest scoring template. The quality assessment of these models by superimposition with the crystallized structure of the SARS-CoV-2 spike glycoprotein (closed state) 6VXX chain B showed an RMSD of 1.086 Å for alpha, 1.085 Å for beta, 1.079 Å for gamma, 1.075 Å for delta, and 1.082 Å for omicron (Figure 5). The TM-scores were all below one, and both the RMSD and TM-score values are acceptable, indicating high similarity between the crystallized model and the generated ones, which share the same fold. In addition, the CMView common contact map gave a high similarity score ranging between 80% and 81.5%, keeping in mind that the crystallized structure is missing a few residues (gaps), including the furin cleavage loop. Importantly, the generated models include the furin cleavage site, which is lacking in all the crystallized models deposited in the Protein Data Bank (PDB). Moreover, comparison of the contact map percentage and TM-align scoring gave an indication of the similarity between the different models (Table 3). Given the highly reliable TM-score and RMSD values, the computationally generated models are suitable for further analysis, especially for evaluating the effect of mutations on and around the furin cleavage site.
Figure 7 3D models generated for the α, β, γ, δ, and omicron variants, their structural analysis, and the quality assessment by superimposition with chain B of the crystallized model PDB 6VXX. (A) Superimposition RMSD values. (B) Calculated common contact percentages and TM-scores.
Table 3 Comparison of contact map percentage (red) and TM-align (blue) scoring between the different SARS-CoV-2 VOCs. The arrowheads denote the variant used as a reference when calculating the data.
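For readers unfamiliar with the two similarity measures used above, the sketch below evaluates RMSD and the TM-score formula of Zhang and Skolnick for one given superposition, starting from the distances between aligned Cα pairs. It is not a replacement for PyMOL or TM-align: those tools also search for the optimal alignment and superposition, which this sketch takes as given. The function names are hypothetical, and the lower bound of 0.5 on the distance scale d0 is an assumption made here to keep the formula well behaved for short chains.

```python
import math


def rmsd(distances: list[float]) -> float:
    """Root mean square deviation over the distances between aligned atom pairs."""
    if not distances:
        raise ValueError("at least one aligned pair is required")
    return math.sqrt(sum(d * d for d in distances) / len(distances))


def tm_d0(target_length: int) -> float:
    """Length-dependent distance scale d0 = 1.24 * (L - 15)^(1/3) - 1.8.

    The floor of 0.5 is an assumption of this sketch for short chains.
    """
    if target_length <= 15:
        raise ValueError("target_length must exceed 15 residues")
    return max(0.5, 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8)


def tm_score(distances: list[float], target_length: int) -> float:
    """TM-score for a fixed superposition, normalized by the target length.

    Unaligned target residues contribute zero, so partial coverage lowers the score.
    """
    d0 = tm_d0(target_length)
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length


# Toy check (assumption): 100 aligned residues, every pair exactly d0 apart.
toy_length = 100
toy_distances = [tm_d0(toy_length)] * toy_length
toy_tm = tm_score(toy_distances, toy_length)
```

Because each pair contributes 1 / (1 + (d / d0)^2), a pair at distance d0 contributes one half, which is why the toy case lands at the 0.5 boundary mentioned in the methods.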
3.3.4 The effect of contact residues’ mutations on the SARS-CoV-2 S protein/ACE2 complex thermodynamic stability
Sequence analysis showed that the omicron variant has the most mutated RBD, with 15 different mutations, 9 of which are in the residues in contact with the ACE2 receptor. The alpha variant showed only one mutation in the contact residues (N501Y); beta and gamma each showed three mutations, all of which are in the contact residues; and delta showed two mutations, neither of which is in a contact residue with ACE2. Analyzed individually, these mutations had different effects on the complex of the SARS-CoV-2 RBD with the ACE2 receptor: some single mutations were deleterious, whereas others had a stabilizing effect. However, the combination of several mutations in the contact residues shows a different effect. In the case of the omicron variant, the combination of nine mutations in the contact residues was not deleterious, even though it slightly destabilizes the complex with ACE2, with a cumulative ΔΔG = 0.18. The beta and gamma variants, with three mutations in the contact residues, showed cumulative ΔΔG values of 1.79 and 1.07, respectively, which strongly destabilize the complex and decrease the binding affinity. Delta has only two mutations, both outside the contact residues, which have a stabilizing effect on the complex (cumulative ΔΔG = −0.33) and hence increase the affinity for ACE2. The single contact residue mutation of the alpha variant (N501Y) showed a destabilizing effect that we reported previously (Ashoor et al., 2021) (Table 4).
Table 4 Comparison of the mutation profiles of five SARS-CoV-2 VOCs’ S protein in the contact residues with the ACE2 receptor.
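The sign convention behind these values can be made explicit with a short sketch: in the text, a positive cumulative ΔΔG is read as destabilizing the S protein/ACE2 complex and a negative one as stabilizing it. The code below only sorts and labels the four cumulative values quoted in the text; alpha is left out because no cumulative value is quoted for it, and the function name `effect_on_complex` is a hypothetical helper.

```python
# Cumulative ΔΔG values quoted in the text above (alpha omitted: no value is quoted).
reported_ddg = {"omicron": 0.18, "beta": 1.79, "gamma": 1.07, "delta": -0.33}


def effect_on_complex(ddg: float) -> str:
    """Label a ΔΔG value using the sign convention applied in the text.

    Positive values are read as destabilizing the complex, negative values as stabilizing it.
    """
    if ddg > 0:
        return "destabilizing"
    if ddg < 0:
        return "stabilizing"
    return "neutral"


# Variants ordered from the most stabilizing to the most destabilizing value.
ranking = sorted(reported_ddg, key=reported_ddg.get)
effects = {variant: effect_on_complex(value) for variant, value in reported_ddg.items()}
```

The ordering places delta first and beta last, with omicron between delta and gamma, which matches the comparison drawn in the text.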
3.3.5 Effect of mutations in the S protein of SARS-CoV-2 major variants on the polar interactions with the ACE2 receptor
Using the crystallized structure PDB ID 6LZG, which represents the interaction between the SARS-CoV-2 receptor binding domain and the ACE2 receptor, as a reference, we generated models for the five variants in the LigPlot+ software and analyzed the polar interaction patterns. The interacting residues and types of interactions are shown in Table 5. All the variants have one or more mutations in the contact residues except for the delta variant. At first glance, Table 5 reveals a similarity in the interaction patterns of the alpha and delta variants, even though the alpha variant has only one mutation in the contact residues (N501Y) and delta has two mutations outside the contact residues (L452R and T478K). Both alpha and delta variants were reported to be highly transmissible, with delta being 60% higher (Duong, 2021). In delta, eight out of the nine main polar interactions are present, with one polar interaction missing (Spike/ACE2: Asn487/Tyr83) and the addition of two new polar interactions (Gln493/Glu35 and Tyr505/Glu37) and one salt bridge (Glu484/Lys31). Similarly, alpha is missing the same polar interaction and shows the same new salt bridge, Glu484/Lys31, but no additional new polar interactions. The addition of new polar interactions and the Glu484/Lys31 salt bridge in the delta variant could be the cause of the stabilizing effect on the complex with ACE2, which can explain the high transmissibility.
Table 5 S protein/ACE2 complex contact residues’ interactions pattern of different SARS-CoV-2 variants.
In addition to Glu484, mutations in other RBD binding hot spot residues have also been linked to antibody binding and neutralization, including mutations in Lys417, Gly446, Phe456, Asn501, Gly477, and Asp614 (Greaney et al., 2021a; Liu et al., 2021b; Montefiori, 2021; Supasa et al., 2021). Interestingly, the omicron variant combines all these mutations (Table 5), indicating a potentially higher degree of immune escape. Additionally, omicron has mutations in the nearby residues Gln493, Gly496, and Gln498 that dramatically affect the interaction with ACE2. These residues are structurally located in the receptor binding ridge, and Gln493 is known to form, together with Leu455, the two stabilizing hot spots of the receptor binding motif (RBM) and to be a target for some therapeutic antibodies (Shang et al., 2020; Greaney et al., 2021b). This disrupted interaction profile may add up to a less stable complex with ACE2 (ΔΔG = 0.18) (Table 5) and loss of antibody neutralization. Furthermore, the comparison of the differences in interactions between SARS-CoV-2 variants showed some interactions that are conserved in all the variants, suggesting their importance for the stability of the complex.
4 Discussion
Despite significant global healthcare measures and social mitigation efforts, along with the availability of a number of vaccines, the COVID-19 pandemic caused by SARS-CoV-2 has entered its third year and the numbers of cases are soaring worldwide (https://www.gisaid.org/, https://coronavirus.jhu.edu/map.html). This is mainly due to the emergence of new and fitter viral variants that continue to fuel this pandemic. To create awareness about any additional health issue a new variant may cause, WHO, the world’s most influential health agency, has developed a labeling system to classify the new variants, in addition to the conventional nomenclature. The CDC has adopted this labeling system but introduced major changes. Among the labels used by WHO and the CDC is the label “Variant of Concern or VOC”. The word “concern” is synonymous with anguish, anxiety, and apprehension. It naturally engenders fear and disturbance, which predispose to taking immediate action and making unusual changes. Indeed, when a new variant is labeled VOC, countries are inclined to close their borders and take drastic countermeasures. This was particularly striking when a number of countries banned travel from South Africa, which was the first country to describe and report the genomic sequence of the omicron variant. While preventive and cautionary actions are mandatory to control pandemics, these should be based on solid scientific evidence. However, it takes time and coordinated efforts for the scientific community to generate the data needed to accurately label new variants according to criteria such as transmissibility, disease severity, change in the epidemiological pattern, immune escape, resistance to previously neutralizing antibodies, efficacy of existing therapies, or vaccine effectiveness. Meanwhile, the rapid progress of the pandemic requires a timely response, including a good communication system.
In this study, we highlight some flaws in the labeling system developed by WHO mainly to ease communication with the national health authorities and the public. In addition to the inappropriate wording used for labeling emergent variants, the combination of criteria used to define a variant label, as shown on the WHO and CDC web sites, is not accurate. In their definitions of the different labels, both agencies combine criteria using the conjunction “OR” rather than “AND”, so that meeting any single criterion can suffice for a label. This introduces a confusion that is amplified by the CDC’s use of more labels and of different formulations and combinations of the criteria used to attribute a label to a new variant. According to WHO, a label is supposedly assigned to a variant through a comparative assessment with the previous ones. Even though SARS-CoV-2 variants are primarily detected through changes in the genomic sequence of the virus, particularly mutations in important functional regions, the genetic variations do not clearly appear among the viral attributes WHO and the CDC use to formulate their definitions of the different variants. The labeling system developed by WHO and the definitions of the criteria for each label used by both WHO and the CDC do not mention a comparative assessment of the genetic variations but rather retain clinical, epidemiological, transmissibility, and other non-genetic viral attributes. Such parameters need substantial time to be accurately determined. It seems that the current labeling system relies more on fear of emergent genetic variations, “the super killer virus”, than on evidence about the impacts of mutations (Caini et al., 2018). Thus, it seems premature and inappropriate to give a new variant the VOC label. Even when a greater transmissibility can be established, this does not necessarily mean greater severity.
For example, the very transmissible H1N1 influenza virus variant was not as severe as many other influenza viruses and the epidemic has naturally faded away (Grubaugh et al., 2020). With reference to the WHO variant labeling system, H1N1 could have been labeled VOC while it has never been one. Therefore, to solve the dilemma between the necessity of rapidly labeling a new variant and the lack of scientific evidence, a good approach is to thoroughly analyze the genetic variations observed to rapidly generate the best data on the potential attributes of the new viral variant. To achieve this analysis, the computational prediction of variant effect on protein stability, function, and interaction is a very useful way to determine the variant “importance”. Several methods can be used based on the availability or non-availability of the 3D structure of the protein. In this study, we show that computational prediction and in silico comparative studies between new variants and older more characterized ones can provide good insights into the impact of the mutations and thus the potential behavior of the emergent variants. Indeed, the observation that most of the mutations in SARS-CoV-2 VOCs map to the solvent-exposed regions along with the comparison of antigenicity predictions (Crooke et al., 2020; Vashi et al., 2020; Chen et al., 2021) represents a good basis to make a fair assumption on a potential immune escape and/or a likely reduction of antibody neutralization and to anticipate an evaluation of vaccine effectiveness. In addition, the comparative analysis of the S protein 3D structure between the five SARS-CoV-2 VOCs showed an identical overall folding as demonstrated by the RMSD values and Tm scores of each model. Meanwhile, the comparison of contact residue maps gave a fine-tuning of the structural divergence, which suggests potential functional differences that need to be further analyzed.
Furthermore, computational studies of the effect of mutations on the thermodynamic stability of the S protein/ACE2 complex and the comparative analysis of the pattern of polar interactions of the different variants with the ACE2 receptor give good indications for predicting transmissibility and virulence and for drawing plausible epidemiological scenarios. Indeed, the effect of mutations can be computed for single or combined mutations. The data we obtained with the combined mutations of the omicron variant, which has a much higher number of mutations than the other variants, show that this variant engages in a more stable interaction with ACE2 than the β and γ variants do and has more interactions than the delta variant.
Regarding the other variant labeling criteria, several studies have discussed the importance of the Glu484Lys mutation in the interaction with ACE2 and in immune escape (Greaney et al., 2021b; Makdasi et al., 2021). This was also observed with the beta and gamma variants, which carry the Glu484Lys mutation. The beta variant was reported to show reduced antibody neutralization compared with the delta variant. Moreover, the resistance of beta to neutralization increased 9.4-fold for convalescent plasma and 10.3- to 12.4-fold for sera from vaccinated individuals (Liu et al., 2021a). In addition, it was suggested that new variants carrying the same mutation might pose new challenges for current vaccines or monoclonal antibody therapies (Krause et al., 2021; Zhou and Wang, 2021).
Meanwhile, our analysis of the pattern of polar interactions of the different variants shows that omicron forms 13 more hydrogen bonds than the delta variant. In addition, we noticed an extra salt bridge between the S protein Glu 484 and ACE2 Lys 31 in the more transmissible alpha and delta variants, and an extra salt bridge between the S protein Gln 493 and ACE2 Glu 35 in the omicron variant. This observation, combined with the high number of hydrogen bonds and the thermodynamic stability data, predicts a potentially more efficient entry into host cells and enhanced transmissibility of the omicron variant, which has been confirmed since the omicron mutation profile was described. Nevertheless, relying on the study of a single mutation or on the computation of one biophysical feature of a variant structure is not sufficient to predict how a viral attribute will evolve. Furthermore, multiple genes can control epidemiologically relevant viral attributes such as the mode of transmission and virulence. Therefore, we recommend integrating the data on multiple mutations with the computation of various viral structural features to make the best predictions and attribute the right label to a given variant.
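To make the notion of an interface salt bridge concrete, the sketch below flags pairs of oppositely charged side-chain atoms from two chains that lie within a distance cutoff. This is a simplified geometric criterion under stated assumptions and not the procedure used in this study: the 4.0 Å cutoff is a common convention rather than a value taken from the article, histidine is ignored, the function names are hypothetical, and the coordinates are invented for illustration.

```python
import math

# Charged side-chain atoms by residue type (PDB atom names).
# Histidine is deliberately left out of this simplified sketch.
ACIDIC = {"ASP": {"OD1", "OD2"}, "GLU": {"OE1", "OE2"}}
BASIC = {"LYS": {"NZ"}, "ARG": {"NE", "NH1", "NH2"}}


def is_charged(atom, table):
    """atom is a tuple: (residue name, residue number, atom name, (x, y, z))."""
    resname, _, atom_name, _ = atom
    return atom_name in table.get(resname, ())


def find_salt_bridges(chain_a, chain_b, cutoff=4.0):
    """Return sorted (residue_a, residue_b, shortest distance) triples for
    oppositely charged atom pairs within `cutoff` angstroms (assumed value)."""
    best = {}
    for a in chain_a:
        for b in chain_b:
            opposite = (
                (is_charged(a, ACIDIC) and is_charged(b, BASIC))
                or (is_charged(a, BASIC) and is_charged(b, ACIDIC))
            )
            if not opposite:
                continue
            d = math.dist(a[3], b[3])
            if d <= cutoff:
                key = (f"{a[0]}{a[1]}", f"{b[0]}{b[1]}")
                best[key] = min(d, best.get(key, d))
    return sorted((ra, rb, round(d, 2)) for (ra, rb), d in best.items())


# Invented coordinates (angstroms) for illustration only; they do not come
# from any experimental or modeled structure.
spike = [("GLU", 484, "OE1", (0.0, 0.0, 0.0)),
         ("GLU", 484, "OE2", (1.0, 1.9, 0.0)),
         ("ASN", 501, "ND2", (10.0, 0.0, 0.0))]
ace2 = [("LYS", 31, "NZ", (3.0, 0.0, 0.0)),
        ("GLU", 35, "OE1", (12.0, 0.0, 0.0))]

print(find_salt_bridges(spike, ace2))
```

A distance test of this kind only counts candidate interactions; as argued above, such a single feature should be read alongside hydrogen bond counts and stability estimates rather than on its own.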
In conclusion, the SARS-CoV-2 variant labeling system developed by WHO and amended by the CDC has some major flaws. Relying on integrated biophysical and structural data generated from computational comparative predictions of the likely behavior of a new variant would help in the rapid and accurate labeling of emergent variants. Meanwhile, given our provisional and incomplete knowledge and the uncertain nature of the COVID-19 pandemic, it would be wise to operate in epistemic self-abnegation, use the best tools and knowledge at hand, and introduce revisions whenever new evidence becomes available.
Data availability statement
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.
Author contributions
DA: in silico analysis, methodology, data curation, writing, and editing. MM: mutations’ review, illustrations, figures, and tables. KT: data retrieval from data banks and formatting. SC: analysis and illustration of the WHO and CDC variant-labeling criteria. NA: data retrieval and iconography. NK: review of the in silico study. AR: data cross-checking and coordination. M-DF: project conception, work design, data analysis, writing, editing, and supervision. All authors contributed to the article and approved the submitted version.
Funding
Grant LS_COVID 19, 2020 from the Arabian Gulf University.
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Supplementary material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fcimb.2022.868205/full#supplementary-material
References
UK Health Security Agency (2021) SARS-CoV-2 variants of concern and variants under investigation in England (Technical briefing 27). Available at: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/1029715/technical-briefing-27.pdf.
Ashoor, D., Ben Khalaf, N., Marzouq, M., Jarjanazi, H., Chlif, S., Fathallah, M. D. (2021). A computational approach to evaluate the combined effect of SARS-CoV-2 RBD mutations and ACE2 receptor genetic variants on infectivity: The COVID-19 host-pathogen nexus. Front. Cell. Infect. Microbiol. 11, 707194. doi: 10.3389/fcimb.2021.707194
Banoun, H. (2021). Evolution of SARS-CoV-2: Review of mutations, role of the host immune system. Nephron 145 (4), 392–403. doi: 10.1159/000515417
Caini, S., Kroneman, M., Wiegers, T., El Guerche-Séblain, C., Paget, J. (2018). Clinical characteristics and severity of influenza infections by virus type, subtype, and lineage: A systematic literature review. Influenza Other Respir. Viruses 12 (6), 780–792. doi: 10.1111/irv.12575
Campbell, F., Archer, B., Laurenson-Schafer, H., Jinnai, Y., Konings, F., Batra, N., et al. (2021). Increased transmissibility and global spread of SARS-CoV-2 variants of concern as at June 2021. Eurosurveillance 26 (24), 2100509. doi: 10.2807/1560-7917.ES.2021.26.24.2100509
Cele, S., Jackson, L., Khoury, D. S., Khan, K., Moyo-Gwete, T., Tegally, H., et al. (2022). Omicron extensively but incompletely escapes Pfizer BNT162b2 neutralization. Nature 602 (7898), 654–656. doi: 10.1038/s41586-021-04387-1
Chen, Z., Ruan, P., Wang, L., Nie, X., Ma, X., Tan, Y. (2021). T and B cell epitope analysis of SARS-CoV-2 S protein based on immunoinformatics and experimental research. J. Cell. Mol. Med. 25 (2), 1274–1289. doi: 10.1111/jcmm.16200
Crooke, S. N., Ovsyannikova, I. G., Kennedy, R. B., Poland, G. A. (2020). Immunoinformatic identification of B cell and T cell epitopes in the SARS-CoV-2 proteome. Sci. Rep. 10 (1), 1–15. doi: 10.1038/s41598-020-70864-8
DeLano, W., Schrödinger, L. (2020). PyMOL, the PyMOL molecular graphics system, version 2.4.0 (Schrödinger, LLC). Available at: http://www.pymol.org/pymol.
Dubey, A., Choudhary, S., Kumar, P., Tomar, S. (2022). Emerging SARS-CoV-2 variants: Genetic variability and clinical implications. Curr. Microbiol. 79 (1), 1–18. doi: 10.1007/s00284-021-02724-1
Duong, D. (2021). Alpha, beta, delta, gamma: What’s important to know about SARS-CoV-2 variants of concern? Can. Med. Assoc. J. 193 (27), E1059–E1060. doi: 10.1503/cmaj.1095949
Public Health England (2021) SARS-CoV-2 variants of concern and variants under investigation in England technical briefing 12. Available at: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/988619/Variants_of_Concern_VOC_Technical_Briefing_12_England.pdf.
Santé Publique France (2021) Nouveau variant détecté et sous surveillance en Bretagne 2021. Available at: https://www.santepubliquefrance.fr/presse/2021/nouveau-variant-detecte-et-sous-surveillance-en-bretagne.
Gobeil, S., Janowska, K., McDowell, S., Mansouri, K., Parks, R., Stalls, V., et al. (2021). Effect of natural mutations of SARS-CoV-2 on spike structure, conformation and antigenicity. Science 373 (6555), eabi6226. doi: 10.1101/2021.03.11.435037
Greaney, A. J., Loes, A. N., Crawford, K. H., Starr, T. N., Malone, K. D., Chu, H. Y., et al. (2021a). Comprehensive mapping of mutations in the SARS-CoV-2 receptor-binding domain that affect recognition by polyclonal human plasma antibodies. Cell Host Microbe 29 (3), 463–476.e466. doi: 10.1016/j.chom.2021.02.003
Greaney, A. J., Starr, T. N., Gilchuk, P., Zost, S. J., Binshtein, E., Loes, A. N., et al. (2021b). Complete mapping of mutations to the SARS-CoV-2 spike receptor-binding domain that escape antibody recognition. Cell Host Microbe 29 (1), 44–57.e49. doi: 10.1016/j.chom.2020.11.007
Grubaugh, N., Petrone, M., Holmes, E. (2020). We shouldn't worry when a virus mutates during disease outbreaks. Nat. Microbiol. 5 (4), 529–530. doi: 10.1038/s41564-020-0690-4
Jespersen, M. C., Peters, B., Nielsen, M., Marcatili, P. (2017). BepiPred-2.0: improving sequence-based B-cell epitope prediction using conformational epitopes. Nucleic Acids Res. 45 (W1), W24–W29. doi: 10.1093/nar/gkx346
Johns-Hopkins-University (2022) The Johns Hopkins University COVID-19 dashboard. Available at: https://coronavirus.jhu.edu/map.html.
Kelley, L. A., Mezulis, S., Yates, C. M., Wass, M. N., Sternberg, M. (2015). The Phyre2 web portal for protein modeling, prediction and analysis. Nat. Protoc. 10 (6), 845–858. doi: 10.1038/nprot.2015.053
Krause, P. R., Fleming, T. R., Longini, I. M., Peto, R., Briand, S., Heymann, D. L., et al. (2021). SARS-CoV-2 variants and vaccines. N Engl. J. Med. 385 (2), 179–186. doi: 10.1056/NEJMsr2105280
Laskowski, R. A., Swindells, M. B. (2011). LigPlot+: multiple ligand-protein interaction diagrams for drug discovery. J. Chem. Inf Model. 51 (10), 2778–2786. doi: 10.1021/ci200227u
Liu, C., Ginn, H. M., Dejnirattisai, W., Supasa, P., Wang, B., Tuekprakhon, A., et al. (2021a). Reduced neutralization of SARS-CoV-2 B.1.617 by vaccine and convalescent serum. Cell 184 (16), 4220–4236.e4213. doi: 10.1016/j.cell.2021.06.020
Liu, Z., VanBlargan, L. A., Bloyet, L.-M., Rothlauf, P. W., Chen, R. E., Stumpf, S., et al. (2021b). Identification of SARS-CoV-2 spike mutations that attenuate monoclonal and serum antibody neutralization. Cell Host Microbe 29 (3), 477–488.e474. doi: 10.1016/j.chom.2021.01.014
Majumdar, P., Niyogi, S. (2021). SARS-CoV-2 mutations: the biological trackway towards viral fitness. Epidemiol. Infect. 149, e110. doi: 10.1017/s0950268821001060
Makdasi, E., Zvi, A., Alcalay, R., Noy-Porat, T., Peretz, E., Mechaly, A., et al. (2021). The neutralization potency of anti-SARS-CoV-2 therapeutic human monoclonal antibodies is retained against viral variants. Cell Rep. 36 (10), 109679. doi: 10.1016/j.celrep.2021.109679
Malik, Y. S., Kumar, N., Sircar, S., Kaushik, R., Bhat, S., Dhama, K., et al. (2020). Coronavirus disease pandemic (COVID-19): challenges and a global perspective. Pathogens 9 (7), 519. doi: 10.3390/pathogens9070519
Montefiori, D. C. (2021). SnapShot: SARS-CoV-2 antibodies. Cell Host Microbe 29 (7), 1162–1162.e1. doi: 10.1016/j.chom.2021.06.005
Rao, S., Singh, M. (2021). The newly detected B.1.1.529 (Omicron) variant of SARS-CoV-2 with multiple mutations: Implications for transmission, diagnostics, therapeutics, and immune evasion. DHR Proceedings 1 (S5), 7–10. doi: 10.47488/dhrp.v1iS5.35
Shang, J., Ye, G., Shi, K., Wan, Y., Luo, C., Aihara, H., et al. (2020). Structural basis of receptor recognition by SARS-CoV-2. Nature 581 (7807), 221–224. doi: 10.1038/s41586-020-2179-y
Singh, J., Pandit, P., McArthur, A. G., Banerjee, A., Mossman, K. (2021). Evolutionary trajectory of SARS-CoV-2 and emerging variants. Virol. J. 18 (1), 166. doi: 10.1186/s12985-021-01633-w
Supasa, P., Zhou, D., Dejnirattisai, W., Liu, C., Mentzer, A. J., Ginn, H. M., et al. (2021). Reduced neutralization of SARS-CoV-2 B.1.1.7 variant by convalescent and vaccine sera. Cell 184 (8), 2201–2211.e2207. doi: 10.1016/j.cell.2021.02.033
Toyoshima, Y., Nemoto, K., Matsumoto, S., Nakamura, Y., Kiyotani, K. (2020). SARS-CoV-2 genomic variations associated with mortality rate of COVID-19. J. Hum. Genet. 65 (12), 1075–1082. doi: 10.1038/s10038-020-0808-9
Tutelyan, V. A., Nikityuk, D. B., Burlyaeva, E. A., Khotimchenko, S. A., Baturin, A. K., Starodubova, A. V., et al. (2020). [COVID-19: new challenges for medical science and practical health]. Vopr Pitan 89 (3), 6–13. doi: 10.24411/0042-8833-2020-10024
Vashi, Y., Jagrit, V., Kumar, S. (2020). Understanding the B and T cell epitopes of spike protein of severe acute respiratory syndrome coronavirus-2: A computational way to predict the immunogens. Infect Genet. Evol. 84, 104382. doi: 10.1016/j.meegid.2020.104382
Vehlow, C., Stehr, H., Winkelmann, M., Duarte, J. M., Petzold, L., Dinse, J., et al. (2011). CMView: interactive contact map visualization and analysis. Bioinformatics 27 (11), 1573–1574. doi: 10.1093/bioinformatics/btr163
World-Economic-Forum (2020). Challenges and opportunities in the post-COVID-19 world. World Economic Forum. Available at: www.weforum.org.
Yazdani, S., De Maio, N., Ding, Y., Shahani, V., Goldman, N., Schapira, M. (2021). Genetic variability of the SARS-CoV-2 pocketome. J. Proteome Res. 20 (8), 4212–4215. doi: 10.1021/acs.jproteome.1c00206
Zhang, N., Chen, Y., Lu, H., Zhao, F., Alvarez, R. V., Goncearenco, A., et al. (2020). MutaBind2: predicting the impacts of single and multiple mutations on protein-protein interactions. iScience 23 (3), 100939. doi: 10.1016/j.isci.2020.100939
Zhang, Y., Skolnick, J. (2005). TM-align: a protein structure alignment algorithm based on the TM-score. Nucleic Acids Res. 33 (7), 2302–2309. doi: 10.1093/nar/gki524
Zhao, J., Sun, J., He, W.-T., Ji, X., Gao, Q., Zhai, X., et al. (2020). Snapshot of the evolution and mutation patterns of SARS-CoV-2. bioRxiv. doi: 10.1101/2020.07.04.187435
Zhou, W., Wang, W. (2021). Fast-spreading SARS-CoV-2 variants: challenges to and new design strategies of COVID-19 vaccines. Signal Transduction Targeted Ther. 6 (1), 226. doi: 10.1038/s41392-021-00644-x
Keywords: SARS-CoV-2, variants of concern, variant of interest, variant under monitoring, computational prediction
Citation: Ashoor D, Marzouq M, Trabelsi K, Chlif S, Abotalib N, Khalaf NB, Ramadan AR and Fathallah M-D (2022) How concerning is a SARS-CoV-2 variant of concern? Computational predictions and the variants labeling system. Front. Cell. Infect. Microbiol. 12:868205. doi: 10.3389/fcimb.2022.868205
Received: 02 February 2022; Accepted: 04 July 2022;
Published: 10 August 2022.
Edited by: John Hiscott, Istituto Pasteur Italia Cenci Bolognetti Foundation, Italy
Reviewed by: Ryan Connor, National Library of Medicine (NIH), United States
Yongfen Xu, Institut Pasteur of Shanghai (CAS), China
Copyright © 2022 Ashoor, Marzouq, Trabelsi, Chlif, Abotalib, Khalaf, Ramadan and Fathallah. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: M-Dahmani Fathallah, email@example.com