Can the SARS-CoV-2 Spike Protein Bind Integrins Independent of the RGD Sequence?

The RGD motif in the Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) spike protein has been predicted to bind RGD-recognizing integrins. Recent studies have shown that the spike protein does, indeed, interact with αVβ3 and α5β1 integrins, both of which bind to RGD-containing ligands. However, computational studies have suggested that binding between the spike RGD motif and integrins is not favourable, even when unfolding occurs after conformational changes induced by binding to the canonical host entry receptor, angiotensin-converting enzyme 2 (ACE2). Furthermore, non-RGD-binding integrins, such as αX, have been suggested to interact with the SARS-CoV-2 spike protein. Other viral pathogens, such as rotaviruses, have been recorded to bind integrins in an RGD-independent manner to initiate host cell entry. Thus, in order to consider the potential for the SARS-CoV-2 spike protein to bind integrins independent of the RGD sequence, we investigate several factors related to the involvement of integrins in SARS-CoV-2 infection. First, we review changes in integrin expression during SARS-CoV-2 infection to identify which integrins might be of interest. Then, all known non-RGD integrin-binding motifs are collected and mapped to the spike protein receptor-binding domain and analyzed for their 3D availability. Several integrin-binding motifs are shown to exhibit high sequence similarity with solvent-accessible regions of the spike receptor-binding domain. Comparisons of these motifs with other betacoronavirus spike proteins, such as SARS-CoV and RaTG13, reveal that some have recently evolved while others are more conserved throughout phylogenetically similar betacoronaviruses. Interestingly, all of the potential integrin-binding motifs, including the RGD sequence, are conserved in one of the known pangolin coronavirus strains. Of note, the most recently recorded mutations in the spike protein receptor-binding domain were found outside of the putative integrin-binding sequences, although several mutations found within and close to one motif in particular may enhance binding. These data suggest that the SARS-CoV-2 spike protein may interact with integrins independent of the RGD sequence and may help further explain how SARS-CoV-2 and other viruses can evolve to bind to integrins.


INTRODUCTION
Integrins are an ancient superfamily of heterodimeric transmembrane proteins that are involved in diverse cell processes, such as cell-cell adhesion and both inside-out and outside-in cell signaling, at cell surfaces (Sebé-Pedrós et al., 2010). The heterodimers consist of an alpha (α) and a beta (β) subunit, of which there are as many as 18 α and eight β subunits that form at least 24 heterodimeric structures known in humans (Takada et al., 2007). Some integrins, such as αVβ3, are ubiquitously expressed across human tissues, while others are specific to certain cell types, such as αIIbβ3 to platelets (Etzioni, 1999). In total, integrins recognize a large variety of cell surface, extracellular matrix, and soluble ligands (Takada et al., 2007). The most frequently observed integrin-binding motif, RGD, is present on several extracellular proteins, such as fibronectin and vitronectin, and can interact with at least 12 integrin subtypes (Ruoslahti, 1996). This motif uses the negatively-charged aspartate to interact with positively-charged cations in the metal ion-dependent adhesion site (MIDAS) domain of integrins (Xiong et al., 2002). Other non-RGD motifs have been discovered to be integrin-binding, such as the LDV sequence from vascular cell adhesion molecule and the GLOGER sequence (where O is hydroxyproline) from collagen that bind α4 and α1β1 integrins, respectively (Makarem and Humphries, 1991; Barczyk et al., 2010). Designed de novo peptides and peptides homologous to natural ligands, such as ATN-161 (Ac-PHSCN-NH2), have also been described to recognize integrins, and some do not share sequence identity or similarity with known integrin-binding motifs (van Golen et al., 2002; Cianfrocca et al., 2006). Thus, integrin-ligand specificity is still incompletely understood and may involve interactions with varying motifs that correspond to binding sites on different domains of integrin dimers (Luo et al., 2007).
Because integrins are largely accessible on the surface of human cells, they are a common target for cell entry by viral pathogens (Maginnis, 2018). Surface proteins of some viruses, such as the capsid protein VP1 of coxsackievirus A9, contain commonly known integrin-binding sequences, such as RGD, that bind integrins to gain entry into cells or modulate downstream signaling pathways, while other viruses have acquired novel integrin-binding mechanisms, for example the binding of human echovirus 1 outside of the known ligand-binding site on α2β1 integrin (Roivainen et al., 1994; Jokinen et al., 2010; Hussein et al., 2015). Other viruses are known to bind integrins, such as HIV-1 binding to α4β7, but the specific residues involved in the interactions are still unknown (Liu and Lusso, 2020). Thus, further investigation of the evolution of integrin-binding for viral pathogens is important for a better understanding of viral cell entry and cell signaling perturbations during infection.
The Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) spike protein receptor-binding domain, which binds to angiotensin-converting enzyme 2 (ACE2) to initiate viral cell entry, has been shown to contain an exposed RGD motif that has been suggested to be integrin-binding, thus potentially allowing for integrin-mediated cell entry (Luan et al., 2020; Makowski et al., 2021). Experimental studies have shown that the SARS-CoV-2 spike protein directly binds αVβ3 and α5β1 integrins and may also interact with α3, β1, α4, and αX integrin subunits (Aguirre et al., 2020; Kotani and Nakano, 2020; Beddingfield et al., 2021; Nader et al., 2021; Wang et al., 2021). RGD-based drugs were found to inhibit spike binding to αVβ3 and α5β1 integrins, and mutating the RGD motif to RGE or RGA was also found to decrease binding to αVβ5 integrins, further implicating the spike RGD motif in interactions with integrins (Sebé-Pedrós et al., 2010; Robles et al., 2021). Additionally, the canonical SARS-CoV-2 cell entry receptor, ACE2, has been shown to associate with α5β1 integrins, potentially facilitating virus internalization (Beddingfield et al., 2021; Kliche et al., 2021). A structure-based computational study has indicated that the accessibility of the RGD motif to integrins depends on its depth within the spike receptor-binding domain (Othman et al., 2021). The study performed protein-protein docking between the spike protein and α5β1, αIIbβ3, and αVβ8 integrins and found that the interactions were not favourable, thus suggesting that other motifs on the spike receptor-binding domain may be involved in interactions with integrins. Also, some of the integrins discovered to be associated with SARS-CoV-2 cell entry, such as αX, are not known to bind RGD peptides (Wang et al., 2021). No definitive structural study has yet been described for spike-integrin interactions. Thus, although the RGD sequence has been shown to be the probable integrin-binding motif for RGD-binding integrins, it remains unclear which motifs on the spike protein allow binding to non-RGD-binding integrins and whether other spike motifs synergize with RGD-related binding to integrins.
The presence of non-RGD integrin-binding motifs on coronavirus spike proteins, particularly that of SARS-CoV-2, has not been systematically investigated. Herein, to infer potential connections, we investigate integrins that have been demonstrated to be directly involved in binding to the spike protein, that have been shown to mediate viral entry through the spike protein, or that exhibit changed expression over the course of SARS-CoV-2 infection. Furthermore, we have utilized sequence and structure-based bioinformatics approaches to improve our understanding of the potential of spike binding to integrins by cataloguing all known integrin-binding motifs and mapping them to the SARS-CoV-2 spike protein receptor-binding domain. Since the N-terminal receptor-binding domain of the SARS-CoV-2 spike protein is heavily glycosylated, the C-terminal receptor-binding domain, which contains the main interaction site with ACE2, was chosen for study. Amino acid sequence alignments revealed some exact and several similar matches between known integrin-binding motifs and the SARS-CoV-2 spike protein. Subsequent structural analyses reveal that the three-dimensional availability of some motifs is greater than that of the RGD sequence. The evolution of these motifs from phylogenetically close betacoronaviruses, such as SARS-CoV and RaTG13, was also analyzed alongside recently recorded mutations in the SARS-CoV-2 spike protein. Taken together, these findings suggest that integrin binding may also occur independently of the RGD motif on the spike protein. Further experimental work is necessary to validate potential spike-integrin interactions.

INTEGRINS AND SARS-CoV-2
Within the extensive research into SARS-CoV-2, several studies have outlined changes in integrin expression and some have shown direct interactions between the spike protein and different integrins. Since increases in integrin expression may indicate their potential usage as cell entry receptors by the spike protein, as previously described with dengue virus serotype 2 and β3 integrins, a meta-analysis of expressed integrins during SARS-CoV-2 infection may give insight into which corresponding integrin-binding motifs may be of interest (Zhang et al., 2007; Calver et al., 2021; Gisby et al., 2021).
Binding assays have shown that the spike protein directly interacts with α5β1 and αVβ3 integrins (Beddingfield et al., 2021; Nader et al., 2021). One study discovered that the ATN-161 pentapeptide inhibited interactions between integrin α5β1 and the SARS-CoV-2 spike protein, for which they predicted three inhibitory ATN-161 binding sites on α5β1 (Beddingfield et al., 2021). Cilengitide, an RGD-based drug, and an RGD peptide were found to block the SARS-CoV-2 spike protein from binding to αVβ3 and α5β1 integrins, respectively, indicating that the spike protein binds in an RGD-dependent manner (Nader et al., 2021; Robles et al., 2021). A co-immunoprecipitation study found that mutating the RGD sequence to RGE or RGA reduced the binding of the SARS-CoV-2 spike protein to αVβ5 integrins (Gao et al., 2021). Binding assays demonstrated direct binding between the spike protein and β1 integrins (Park et al., 2021). A screening of candidate cell entry receptors for SARS-CoV-2 suggested that α3 and β1 integrins may be involved, and a CRISPR activation screening looking at neuronal receptors involved in SARS-CoV-2 infection found that αX integrins may mediate cell entry (Kotani and Nakano, 2020; Wang et al., 2021). The antibody natalizumab, which binds to α4 integrins, was found to decrease SARS-CoV-2 infection, hinting at the potential use of these integrins in cell entry (Aguirre et al., 2020). Together, these studies provide direct experimental evidence that integrins interact with the SARS-CoV-2 spike protein.
Reviewing changes in integrin expression during SARS-CoV-2 infection may give clues about their potential roles as viral cell entry receptors: increases in expression suggest potential usage, while decreases in expression may rule out their involvement (Fantini and Yahi, 2015). Notably, however, integrin expression may also be an indicator of several other altered cellular processes, such as inflammation and apoptosis, during infection (Futosi et al., 2013). Furthermore, increased integrin expression has also been shown to enhance viral replication irrespective of binding to viral surface proteins, and decreased expression may result from a viral mechanism to prevent multiple viruses from infecting the same cell or from a defence mechanism by the host (Michel et al., 2005; Lanier, 2008; Schmidt et al., 2013). Integrins that showed increased expression were selected since expression of the canonical SARS-CoV-2 entry receptor, ACE2, has also been shown to increase following infection (Zamorano Cuervo and Grandvaux, 2020). Three publicly-available expression datasets from the Gene Expression Atlas collectively showed that there were increases in α1, αV, and β1 integrin expression and decreases in α3, α4, αD, β4, and β7 integrin expression post SARS-CoV-2 infection in colon cells and increases in α2, α7, αM, αV, β2, and β8 integrin expression and decreases in αD and β7 integrin expression in infected lung cells (Supplementary Table 1) (Desai et al., 2020; Vanderheiden et al., 2020; Wu et al., 2020). One study looked at longitudinal proteomic profiles in SARS-CoV-2 patients and found that the presence of β6 integrins increased over time while α11 integrins decreased (Gisby et al., 2021). A mass cytometry screening of transmembrane protein expression of platelets from SARS-CoV-2 patients found increased expression of αIIbβ3 integrins (Bongiovanni et al., 2021). These studies suggest that the expression patterns of several integrin subtypes are influenced by SARS-CoV-2 infection and that several are available as potential receptors.
Altogether, the data suggest that α1, α2, α4, α7, αM, αV, αX, β1, β2, β6, β8, α5β1, αVβ3, and αIIbβ3 integrins could be of use to the SARS-CoV-2 spike protein. The α2, α4, α7, αV, β1, β6, β8, α5β1, αVβ3, and αIIbβ3 integrins are all known to be RGD-binding (or part of dimers that bind RGD for individual alpha or beta subunits), while the α1, αM, αX, and β2 are not known to be part of dimeric complexes that bind RGD sequences, which suggests that spike could potentially interact with integrins using motifs outside of the RGD sequence. Of note, the expression and direct experimental studies have been limited by cell type, and other integrin subtypes, such as αE and αL, have not been extensively explored. Thus, an evaluation of potential non-RGD integrin-binding motifs present on the SARS-CoV-2 spike protein may help guide further structural investigations and may shed light on the integrin-binding potential of other viruses.

NON-RGD POTENTIAL INTEGRIN-BINDING MOTIFS ON THE SARS-CoV-2 SPIKE PROTEIN RECEPTOR-BINDING DOMAIN
The use of non-RGD integrin-binding motifs on the SARS-CoV-2 spike protein has only been explored briefly: an LDI motif was discovered on the outside of the C-terminal receptor-binding domain (RBD) (Tresoldi et al., 2020). Thus, to expand the understanding of potential non-RGD integrin-binding sites on the SARS-CoV-2 protein, we gathered all known non-RGD integrin-binding motifs and compared them to the sequence of the SARS-CoV-2 RBD and further analyzed their structural accessibility for potential binding to integrins. The RGD motif was analyzed in parallel as a reference.
To collect all known integrin-binding motifs, we searched the PepBank and ELM web servers using "integrin" and "integrin-binding" search terms and selected all non-RGD motifs that were returned (Shtatland et al., 2007; Duchrow et al., 2009; Kumar et al., 2020). We additionally performed a literature search to include motifs from viruses and de novo or homology-derived peptides not found in the database queries. The search resulted in a total of 71 motifs found from ELM (6), PepBank (36), and the literature search (29) (Supplementary Table 2).
The motifs were aligned to the sequence of the SARS-CoV-2 spike receptor-binding domain (residues 333-523) using EMBOSS Needle with the EBLOSUM62 similarity matrix, and sequence identity and similarity were recorded (Madeira et al., 2019). The reverse sequence of each integrin-binding motif was also screened against the RBD sequence, since DGR sequences have been found to bind RGD-recognizing integrins (Spitaleri et al., 2008). We focused on motifs with amino acid similarity percentages above 50%, although manual inspection revealed some alignments with similar but out-of-order amino acids, which may still exhibit similar physicochemical characteristics.
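The forward and reverse screening described above can be illustrated with a minimal sketch. This is not the EMBOSS Needle procedure used in the study: it swaps in a simple ungapped sliding-window comparison that scores percent identity only, with no substitution matrix and no gaps. The function names are hypothetical, and the test fragment ERDISTEI is the short spike stretch named in the text rather than the full RBD sequence.

```python
import re

def parse_motif(pattern):
    """Split a motif such as '[L,I]ET' into a list of allowed-residue sets."""
    tokens = re.findall(r"\[([^\]]+)\]|([A-Z])", pattern)
    return [set(group.replace(",", "")) if group else {single}
            for group, single in tokens]

def best_ungapped_hit(positions, sequence):
    """Slide the motif along the sequence without gaps.

    Returns (percent identity, 0-based offset, matched window) for the
    first window with the highest identity.
    """
    width = len(positions)
    best = (0.0, -1, "")
    for offset in range(len(sequence) - width + 1):
        window = sequence[offset:offset + width]
        matches = sum(res in allowed for res, allowed in zip(window, positions))
        identity = 100.0 * matches / width
        if identity > best[0]:
            best = (identity, offset, window)
    return best

def scan_both_directions(pattern, sequence):
    """Screen a motif as written and with its residue order reversed."""
    positions = parse_motif(pattern)
    return {
        "forward": best_ungapped_hit(positions, sequence),
        "reverse": best_ungapped_hit(positions[::-1], sequence),
    }

# Illustrative input only: a short spike fragment mentioned in the text.
fragment = "ERDISTEI"
hits = scan_both_directions("[L,I]ET", fragment)
print(hits["forward"])
print(hits["reverse"])
```

On this fragment the reversed [L,I]ET pattern matches the TEI window exactly, while the forward pattern only partially matches. A gapped, matrix-based aligner such as the one used in the study can report different percentages, so values from this sketch should not be compared directly with the reported ones.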
The sequence mapping resulted in 27 integrin-binding motifs that align to 11 motifs on the SARS-CoV-2 spike RBD (Table 1 and Figure 1A). Except for RGD, only one motif, the LDS sequence that binds α4β1 and α5β1 integrins, on the SARS-CoV-2 spike RBD was found to match at 100% identity in the forward direction; however, one motif read in the reverse direction, TEI, was found to align at 100% identity to the [L,I]ET motif that has been shown to bind αLβ2 integrins. Using amino acid similarity, the spike RKSNLK motif aligned at 100% similarity to the α3β1-binding N[G,V]R and αVβ6-binding QRSDL motifs. In the reverse direction, the αIIbβ3-binding VPW motif and the YGL motif, which binds α4β1, α4β7, and α9β1 integrins, were found to be 100% similar to the spike FPL and VGY motifs, respectively. Several αVβ6-binding motifs (RTDLY, REDV, ASDIS, RTDLS, and RDLET) were found to be between 50.0% and 66.6% similar to the ERDIS motif on the spike protein, which just precedes the TEI sequence. The spike MLD motif, which binds α4β1, α9β1, α7β4 integrins, was also found to have 66.6% identity to the aforementioned LDS motif. The α1β1-binding (R/K)TS motif was found to be 66.6% similar to the previously mentioned RKSNLK motif. The αVβ3-binding GRKRK and GRFPF motifs were found to be 60% similar to the NRKRI and GYQPY (overlapping the previously mentioned VGY motif) spike motifs. The LLG motif, which has been shown to bind β2 integrins, was found to be 60% similar to both the VGG and GVG motifs on the spike RBD. Interestingly, the ATN-161 pentapeptide (Ac-PHSCN-NH2) was discovered to be 42.9% identical to the spike RBD STPCN motif, while the αXβ2-binding GPRP motif also aligned at 50% similarity to the overlapping GSTP spike motif. The G[F,L,V][P,O]GE[N,R] motifs, which have been shown to bind α1β1 and α2β1 integrins, were found to be approximately 50% similar to the spike GVEGFN motif. The spike YGF motif was found to be 66.6% similar to the YGL motif. Altogether, these motifs comprise the potential for the spike protein to bind 12 integrin subtypes independent of the RGD sequence.
The presence of motifs on the SARS-CoV-2 spike protein that are 100% identical or similar to integrin-binding motifs brings attention to the potential for binding to integrins independent of the RGD sequence. Some of the spike motifs, such as GVG and VGY, are overlapping, which may induce stronger binding or permit the binding of different integrins on one overarching motif. Interestingly, some spike motifs were highly similar to integrin-binding motifs but the amino acids were out of order. For example, the spike ERDI motif was found to be 50% similar to the αVβ6-binding REDV motif by alignment of the D and I/V residues, but the E and R residues were found on both motifs in a different order. This was also the case for the similarity between the ATN-161 pentapeptide sequence (PHSCN) and the spike STPCN motif, in which both the P and S residues are present but in a different order. Furthermore, the similarity to the ATN-161 pentapeptide is interesting considering studies that suggest its protective ability against SARS-CoV-2 infection, although the peptide has been suggested to also bind the RGD-binding site on integrins (Amruta et al., 2021; Beddingfield et al., 2021). Structural analysis to investigate the accessibility of the motifs on the spike protein may lend more insights into their potential ability to bind integrins.

STRUCTURAL ORIENTATION OF POTENTIAL INTEGRIN-BINDING MOTIFS
The alignments and the amino acid configurations of the corresponding spike protein motifs were subsequently inspected on the structure of the spike protein RBD [3D model generated as previously described (Beaudoin et al., 2021)] to determine whether the side chains are found on the surface of the receptor-binding domain. Since the literature on the 3D structure of binding sites for non-RGD motifs is limited and the potential binding modes may be numerous, structural flexibility and solvent accessibility were determined to quantitatively assess motif accessibility and to better understand the surrounding residue microenvironment. Ten 3D models representing different points in flexibility for the SARS-CoV-2 RBD were generated using CABS-Flex 2.0, which uses coarse-grained simulations to estimate the root mean square fluctuation (RMSF) of each residue, and the solvent accessibility (defined as the number of water molecules in contact with the residue × 10) for each residue in each model was calculated using DSSP (Table 2; Supplementary Table 3) (Kabsch and Sander, 1983; Kuriata et al., 2018). Overall, as discussed in several studies, the SARS-CoV-2 RBD does not exhibit extensive flexibility, but the distal ends of the ACE2 binding motif in particular, which are often unresolved in crystal structures, were reported with the highest flexibility (~6 RMSF) (Dehury et al., 2020; Spinello et al., 2020). The mean RMSF for all residues of interest was 1.84, and the average solvent accessibility was 65.70.

FIGURE 1 | Putative integrin-binding motifs on the SARS-CoV-2 receptor-binding domain. The SARS-CoV-2 receptor-binding domain (RBD) amino acid sequence is shown with the putative integrin-binding motifs marked in red (A). The SARS-CoV-2 spike RBD is depicted with the predicted structurally-accessible integrin-binding motifs highlighted in different colors (B). The spike RBD is shown aligned with the PDB: 6m0j structure in order to show the potential accessibility of these motifs when the spike protein is bound to the canonical host cell entry receptor ACE2 (green) (B). A sequence alignment of the putative integrin-binding motifs found on the SARS-CoV-2 receptor-binding domain with spike sequences from the SARS-CoV, RaTG13, MERS-CoV, Pan-CoV-GD, Pan-CoV-GX-P4L, Pan-CoV-GX-P2V, SL-CoVZC45, and BtKY72 betacoronavirus strains is shown with a corresponding phylogenetic tree of the coronavirus genomes generated using FastTree 2.1 and visualized with iTOL (C).

After structural analysis, 19 integrin-binding motifs that cover 8 motifs on the spike RBD were found to be potentially accessible to receptors, while 3 of the original 11 spike motifs were discarded due to their buriedness or incompatibility with potential binding to integrins based on the knowledge of the original integrin-motif interactions (Figure 1B). The STPCN and GVEGFN motifs, which are found adjacent to each other in sequence, exhibited the highest flexibility with an average of 4.43 and 4.48 RMSF, respectively, and high average solvent accessibility of 72.36 and 82.57, respectively. However, because the GVEGFN motif 1) is missing a crucial glutamate between the glycine and asparagine residues and 2) loses its remaining acidic glutamate to a basic lysine in the recent E484K mutation, this motif was deemed unlikely to mimic integrin-binding in the manner that has been shown for natural G[F,L,V][P,O]GE[N,R] ligands (Hamaia et al., 2012; Wise, 2021) [...] .62. Because of the low flexibility and solvent accessibility of residues in the GYQPY motif, it was discarded as potentially integrin-binding. The LDS motif was found to exhibit low flexibility (1.04 RMSF) and the lowest average solvent accessibility of all putative motifs at 30.61. Because the aspartate residue in the LDS motif, which is important for binding in the MIDAS region of integrins, was buried inside the RBD (solvent accessibility of D: 3.53), the LDS motif was discarded (Ruoslahti, 1996). The FPL motif exhibited a low average flexibility at 0.571 RMSF but a mean solvent accessibility (63.18) close to the average of other motifs. The NRKRI motif reported the second-lowest average flexibility (0.47 RMSF) but the highest average solvent accessibility of 82.78. The YGF motif showed the lowest mean flexibility (0.42 RMSF) and solvent accessibility (25.27) and was, thus, discarded. Although the RGD motif was described as inaccessible for integrin-binding by previous computational analyses, the R and D residues were found to exhibit low flexibility overall (0.244 and 1.015 RMSF, respectively) but high average solvent accessibility (49.09 and 87.27, respectively). The RGD motif reported an average 0.585 RMSF and a solvent accessibility of 46.85. The reference model showed low solvent accessibility for the arginine and aspartate residues (R: 24 and D: 27), which could imply that small conformational changes may have large effects on residue accessibility. Furthermore, the GVG motif in front of the RGD motif in 3D space showed high flexibility (2.94 RMSF), which may allow for restructuring to accommodate potential RGD-based interactions. The resulting motifs comprise the potential binding to β2, α1β1, α3β1, α4β1, α4β7, α9β1, αVβ1, αVβ3, αVβ6, αLβ2, αXβ2, and αIIbβ3 integrins. Interestingly, several of these integrins overlap those that were suggested to directly interact with the spike protein or exhibit increased expression during SARS-CoV-2 infection: α1, α4, αV, αX, β1, β2, β6, αVβ3, and αIIbβ3. Since many of the integrins overlap RGD-binding integrins, the putative non-RGD motifs on the spike protein may provide synergistic binding with the RGD sequence. The RGD motif is overall less accessible than some of the newly discovered potential integrin-binding motifs, such as STPCN and ERDISTEI, although the changes in solvent accessibility due to small conformational changes may have a significant impact on potential integrin-binding. Nevertheless, several other motifs, such as RKSNLK, VGY, and NRKRI, have strong potential for binding to integrins based on amino acid identity and similarity and 3D availability. Further experimental work is needed to validate these potential interactions.
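As a worked check on the RGD figures quoted above, the sketch below backs out the value for the one unreported residue (the glycine) from the motif average and the two reported residue values. It assumes, and this is only an assumption, that each motif average is the plain arithmetic mean over the three residues; the function name is hypothetical.

```python
def implied_middle_value(motif_mean, first, last, length=3):
    """Back out the single unreported residue value from a motif mean.

    Assumes the motif mean is the arithmetic mean over `length` residues,
    of which all but one are known.
    """
    return motif_mean * length - first - last

# Values for the RGD motif as quoted in the text (R and D reported, G not).
rmsf_g = implied_middle_value(0.585, 0.244, 1.015)
access_g = implied_middle_value(46.85, 49.09, 87.27)

print(f"implied glycine RMSF: {rmsf_g:.3f}")
print(f"implied glycine solvent accessibility: {access_g:.2f}")
```

Under that assumption the implied glycine accessibility is small compared with the arginine and aspartate values, which would be consistent with a motif average that is low even though the two charged residues are individually exposed. The per-residue values in Supplementary Table 3 remain the authoritative source.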

EVOLUTION OF PUTATIVE NON-RGD INTEGRIN-BINDING MOTIFS
As previously mentioned, the evolution of integrin-binding motifs on viral surface proteins provides another route for potential host cell entry. Thus, a look at the conservation of these motifs in related betacoronaviruses and an inspection of recently recorded mutations in the SARS-CoV-2 spike RBD may provide insights into the relevance of these motifs.
Recent genomic comparisons between the spike sequence of SARS-CoV-2 and other coronaviruses have shown that the receptor-binding domain is under both purifying and positive selection (Berrio et al., 2020). Mutations have been suggested to increase affinity for ACE2 or to prevent recognition by neutralizing antibodies (Armijos-Jaramillo et al., 2020; Greaney et al., 2021). As shown in Figure 1C, an alignment of the SARS-CoV-2 spike receptor-binding domain sequence (NCBI Accession: NC_045512) with the sequences of SARS-CoV (NC_004718); bat coronavirus strains RaTG13 (MN996532), SL-CoVZC45 (MG772933), and BtKY72 (KY352407); pangolin coronavirus strains GD (MT121216), GX-P4L (MT040333), and GX-P2V (MT072864); and the more phylogenetically distant MERS-CoV (NC_019843) shows that some of the motifs are completely conserved among recently diverged viruses while others have evolved more recently (Price et al., 2010; Letunic and Bork, 2021). Notably, only the pangolin GD strain was found to share 100% amino acid sequence identity with all of the potential integrin-binding motifs, including the RGD sequence, from SARS-CoV-2. The pangolin strain RGD sequence has not yet been mentioned in the literature, and its presence suggests that integrin-binding motifs may evolve before zoonotic transmission to humans. Some of the motifs, such as ERDISTEI and NRKRI, are completely conserved among the RaTG13 and pangolin coronavirus strains, while other motifs, such as RKSNLK and STPCN, seem to have evolved more recently. The alignments support studies that suggest that SARS-CoV-2 is a result of recombination between human, pangolin, and bat coronaviruses. In addition to the analysis of the integrin-binding potential of these motifs, these insights can provide clues towards SARS-CoV-2 spike evolution.
Interestingly, the recent SARS-CoV-2 spike mutations seem to only directly affect the STPCN motif, where the G476S mutation adds another serine just before the serine already present and the S477N and T478K mutations affect the sequence itself (Guruprasad, 2021). Notably, the T478K mutation found in the Delta strain would create a sequence more similar to the aligned ATN-161 (PHSCN) sequence by replacing the polar threonine with a basic lysine, which may mimic basic histidine interactions (Aoki et al., 2021). The lack of mutations near or on the other putative integrin-binding sites, especially considering the conservation with the pangolin coronavirus genome, may give credibility to their integrin-binding potential.
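The effect of each recorded substitution on the local sequence can be made concrete with a small sketch. The fragment GSTPCN and its starting position of 476 are inferred here from the mutation names (G476S, S477N, T478K) and the GSTP and STPCN motifs named in the text; they are assumptions for illustration, not taken from a reference sequence file. The helper name is hypothetical.

```python
import re

def apply_substitution(fragment, start, mutation):
    """Apply a point substitution written as e.g. 'T478K' to a fragment.

    `start` is the residue number of the first character in `fragment`.
    The wild-type residue is checked before the substitution is made.
    """
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation format: {mutation}")
    wild_type, position, replacement = match.groups()
    index = int(position) - start
    if not 0 <= index < len(fragment):
        raise ValueError(f"{mutation} lies outside the fragment")
    if fragment[index] != wild_type:
        raise ValueError(f"{mutation} expects {wild_type}, found {fragment[index]}")
    return fragment[:index] + replacement + fragment[index + 1:]

# Assumed local numbering: G476, S477, T478, P479, C480, N481.
FRAGMENT = "GSTPCN"
START = 476

variants = {m: apply_substitution(FRAGMENT, START, m)
            for m in ("G476S", "S477N", "T478K")}
for mutation, sequence in variants.items():
    print(mutation, sequence)
```

The wild-type check guards against off-by-one numbering errors, which are easy to make when a fragment is cut from a longer sequence.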
The combination of 1) the conservation among recent betacoronaviruses and 2) the lack of mutations, or the presence of mutations that provide better alignments, for the putative integrin-binding motifs on the SARS-CoV-2 spike protein suggests that these sites may, indeed, be involved in interactions with integrins.

DISCUSSION
The SARS-CoV-2 spike protein has been shown to bind multiple integrin subtypes. Although there is growing evidence for interactions based on the spike RGD sequence, the exact binding mechanisms are still unknown (Robles et al., 2021; Simons et al., 2021). Additionally, previous preliminary computational studies suggest that the spike RGD motif is not completely accessible for integrin-binding, and non-RGD-binding integrins have also been suggested to be involved in SARS-CoV-2 viral entry (Othman et al., 2021; Wang et al., 2021). An analysis of putative non-RGD integrin-binding motifs on the SARS-CoV-2 receptor-binding domain identified several viable candidates after sequence alignment and subsequent structural characterization. The motifs comprise potential binding to RGD-binding integrins, such as αVβ6, as well as non-RGD-binding integrins, such as αXβ2. A meta-analysis of reported integrin expression levels revealed that several of the putative spike-binding partners are increased during SARS-CoV-2 infection, further adding to their validity. Since αX integrins were found as putative cell entry receptors for the SARS-CoV-2 spike protein in a CRISPR activation screen, the results presented in this study may help unravel potential binding mechanisms (Wang et al., 2021). The identified motifs may act together with the RGD motif or independently to bind integrins for cell entry or to interfere with cell signaling pathways.
Of interest, one motif was found to be similar to the ATN-161 pentapeptide, which has been shown to protect against SARS-CoV-2 infection in vivo (Amruta et al., 2021). The RGD-based drug, Cilengitide, and the RGD peptide were also found to inhibit spike-integrin interactions on endothelial cells (Nader et al., 2021; Robles et al., 2021). Such studies highlight that inhibiting spike-integrin interactions may be useful in blocking or ameliorating infection. Gao et al. also suggested that administering entire integrin subunits could be helpful as a treatment option (Gao et al., 2021). Thus, additional work to define potential non-RGD-based integrin interactions may give rise to new drug target options. Experimental elucidation of the binding mechanisms and affinities between the SARS-CoV-2 spike protein and the diverse integrin dimers would be helpful for engineering inhibitory peptides or small molecules.
As ACE2 has been shown to associate with certain integrin subtypes, such as α2, α5, and β1, it is possible that integrin-binding motifs on the SARS-CoV-2 receptor-binding domain may be exposed even after binding with ACE2, which may then allow subsequent interaction between the spike-ACE2 complex and an integrin on the same receptor-binding domain (Clarke et al., 2012). However, since the coronavirus spike protein oligomerizes to its final trimeric form and, thus, contains three receptor-binding domains (one per protomer), the spike protein may interact with integrins in a variety of ways: the SARS-CoV-2 spike protein may 1) bind integrins independently of ACE2, 2) bind to ACE2 and then bind an integrin using another receptor-binding domain (or vice versa) either on the same or a different spike protein, or 3) bind to ACE2 first and then utilize accessible motifs to bind an integrin from the same spike receptor-binding domain (Makowski et al., 2021).
Evolutionary sequence comparisons between different coronavirus strains revealed that some of the putative integrin-binding motifs have recently emerged while others are more conserved among betacoronaviruses. Notably, all of the potential integrin-binding motifs, including the RGD sequence, were conserved in one pangolin coronavirus strain, which suggests that integrin-binding motifs may evolve before zoonotic transfer to humans. The presence of these motifs alongside the RGD motif on the pangolin coronavirus genome further supports their potential as mediators for integrin-binding. The finding that the evolution of the RGD and other putative non-RGD integrin-binding motifs on the spike protein precedes the jump to humans suggests that the integrin-binding motifs may be a driving factor for adapting to human cells. These results indicate that the mapping of all known integrin-binding sequences to viral surface proteins may help to better understand the evolution of integrin use among viruses. Recorded mutations relevant to the putative integrin-binding motifs were centred around one motif in particular, the one similar to the ATN-161 pentapeptide, which might increase potential affinity to integrins. Monitoring future mutations on the spike protein receptor-binding domain may be useful to identify new cell entry receptors. Subsequent investigation into the connections between spontaneous integrin-binding motif generation and viral transmission between species may further delineate the role of integrins in zoonotic transfer.
The discovery of potential integrin-binding motifs independent of the RGD sequence on the SARS-CoV-2 spike protein highlights that integrin-binding motifs on viral surface proteins may be more widespread than previously established. Since integrins are ubiquitously expressed throughout the human body, their usage as receptors for cell entry by viruses should be scrutinized. The RGD and other putative integrin-binding motifs on the spike protein surface provide potential mechanisms through which SARS-CoV-2 may utilize integrins as cell entry receptors or to interfere with host signaling pathways. Further experimental work is necessary to validate the direct structural interactions between the SARS-CoV-2 spike protein and integrins.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material. Further inquiries can be directed to the corresponding author.

AUTHOR CONTRIBUTIONS
CB, SH, CL-HH, TB, and AJ all contributed to the conception and design of the study. CB performed sequence, structural, and statistical analyses. All authors contributed to the article and approved the submitted version.

ACKNOWLEDGMENTS
Many thanks to Richard Farndale for the helpful comments on the manuscript.