Bioinformatics Approaches to Predict Mutation Effects in the Binding Site of the Proangiogenic Molecule CD93

The transmembrane glycoprotein CD93 has been identified as a potential new target to inhibit tumor angiogenesis. Recently, Multimerin-2 (MMRN2), a pan-endothelial extracellular matrix protein, has been identified as a ligand for CD93, but the interaction mechanism between these two proteins is yet to be studied. In this article, we aim to investigate the structural and functional effects of induced mutations on the binding domain of CD93 to MMRN2. Starting from experimental data, we assessed how specific mutations in the C-type lectin-like domain (CTLD) affect the binding interaction profile. We describe a four-step workflow to predict the effects of variations on the inter-residue interaction network at the protein-protein interaction (PPI) interface, based on evolutionary information, complex network metrics, and energetic affinity. We show that the application of computational approaches, combined with experimental data, allowed us to gain more in-depth molecular insights into the CD93–MMRN2 interaction, offering a platform for developing innovative therapeutics able to target these molecules and block their interaction. This comprehensive molecular insight might prove useful in drug design in cancer therapy.


INTRODUCTION
CD93 is a single-pass transmembrane glycoprotein belonging to the group XIV family of the C-type lectin-like domain (CTLD) superfamily (Zelensky and Gready, 2005), and it is predominantly expressed in endothelial cells (ECs) with expression also observed in monocytes, natural killer cells, platelets, myeloid cells, hematopoietic stem cells, and several lymphocyte subtypes (Greenlee et al., 2008; Khan et al., 2017). Notably, CD93 is highly expressed in blood vessels within tumors and has been identified as a key regulator of glioma angiogenesis (Langenkamp et al., 2015; Galvagni et al., 2017; Tosi et al., 2017), making it a suitable potential target for anti-angiogenic treatment. Recently, we identified a new signaling pathway involved in regulating EC adhesion and migration (Galvagni et al., 2016), but much remains to be clarified about the role of CD93 in the control of EC physiology. Moreover, the pan-endothelial extracellular matrix (ECM) protein Multimerin-2 (MMRN2) was identified as the interacting partner of CD93 (Galvagni et al., 2017; Khan et al., 2017). We observed the CD93/MMRN2 interaction to be highly specific, since no interaction was seen with other ECM molecules that share similar molecular domains with MMRN2 (Galvagni et al., 2017).
CD93 and MMRN2 are both up-regulated in tumor vasculature during tumor progression, suggesting that the CD93-MMRN2 interaction regulates tumor angiogenesis. Indeed, disruption of this interaction strongly impaired EC migration and in vitro angiogenesis (Galvagni et al., 2017). Recent work has suggested that inhibition of CD93-MMRN2 interaction may lead to disruption of vascular integrity in tumors, showing that the CD93-MMRN2 complex is required for the activation of β1 integrin, phosphorylation of focal adhesion kinase, and fibronectin fibrillogenesis in ECs (Lugano et al., 2018).
In the work of Galvagni et al. (2017), model structures and hypothetical docking studies of the CD93-MMRN2 interaction were already reported. However, the interaction mechanism between these two proteins is yet to be studied. CD93-MMRN2 binding is dependent on a long-loop region in the C-type lectin-like domain (CTLD) of CD93, and this interaction is abrogated by point mutations in the CTLD and sushi-like domains (Khan et al., 2017; Galvagni et al., 2017). (For further information about CD93 and MMRN2 domains, see Supplementary Data Sheet Section S1, "Domains description.") Here, the application of computational approaches, combined with experimental data [see Figure 7 of Galvagni et al. (2017)], allowed us to gain more in-depth molecular insights into the CD93-MMRN2 interaction. It also allowed us to establish an in silico workflow model, validated by experimental data, for shedding light on other unknown interacting molecules.

METHODS
In this work, we compare experimental and in silico data to select the most matching docking pose.
Experimental data were obtained as described by Galvagni et al. (2017): chimeric constructs containing the extracellular domains of CD93 fused to Myc, and wild-type MMRN2 fused to a His tag, were generated as previously described (Orlandini et al., 2014; Colladel et al., 2016). Figure 7 of Galvagni et al. (2017) lists the mutations that contributed to the CD93-MMRN2 interaction and was the starting point for our bioinformatics investigation.
In the same work, in silico model structures and docking studies were performed to investigate the region of CD93 and MMRN2 involved in the interaction (Galvagni et al., 2017). Starting from two docking poses, we applied a four-step workflow for screening the most trustworthy one (Figure 1).
We used the EVolutionary Couplings server to provide functional and structural information about proteins derived from the evolutionary sequence record, using methods from statistical physics (Hopf et al., 2019). By using FASTA sequence (UniProt code: Q9NPY3 and Q9H8L6 for CD93 and MMRN2, respectively), EVolutionary Couplings was used to determine co-evolved residues in our selected protein-protein interactions (PPIs) and to provide information on whether a protein interaction is conserved across enough sequenced genomes using a single pair per genome (Hopf et al., 2019).
FIGURE 1 | Four-step workflow applied to predict and evaluate the effects of variations on the inter-residue interaction network at the PPI.
We performed an in silico validation of the effect of CD93 mutants (Galvagni et al., 2017) by mCSM-PPI2 (Rodrigues et al., 2019). Using the transcripts ENST00000246006.5 and ENST00000372027 for CD93 (Chr20) and MMRN2 (Chr10), respectively, we then mapped the gnomAD missense variants to the structure (Karczewski et al., 2021). The PPI interface of the complex was analyzed using the PDBePISA tool (Krissinel and Henrick, 2007). We calculated the measure of regional intolerance to missense variation for CD93 in both docked poses by using MTR (Galvagni et al., 2017; Traynelis et al., 2017; Silk et al., 2019).
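
As a rough illustration of the MTR measure mentioned above, the sketch below follows the ratio as defined by Traynelis et al. (2017): the observed missense fraction in a sequence window divided by the missense fraction expected from all possible single-nucleotide changes in that window. The function names and the counts in the example are hypothetical and are not taken from this study; the published scores are computed over sliding codon windows from population data, which this sketch does not reproduce.

```python
def missense_fraction(missense: int, synonymous: int) -> float:
    """Fraction of (missense + synonymous) variants that are missense."""
    total = missense + synonymous
    if total == 0:
        raise ValueError("window contains no missense or synonymous variants")
    return missense / total


def mtr(obs_missense: int, obs_synonymous: int,
        poss_missense: int, poss_synonymous: int) -> float:
    """Missense tolerance ratio for one window: observed / expected fraction."""
    expected = missense_fraction(poss_missense, poss_synonymous)
    if expected == 0:
        raise ValueError("no missense changes are possible in this window")
    return missense_fraction(obs_missense, obs_synonymous) / expected


# Hypothetical counts for a single window (illustrative only).
example = mtr(obs_missense=3, obs_synonymous=3,
              poss_missense=75, poss_synonymous=25)
print(round(example, 3))  # 0.667
```

A value below 1 indicates fewer missense variants than expected in the window, which is read as regional intolerance; values near 1 are consistent with neutrality.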

RESULTS AND DISCUSSION
The aim of the PPI docking procedure is to predict correct poses and to score them according to the strength of interaction in a reasonable time frame. In this study, we present an extended approach to evaluate the reliability of protein-protein complex structures, confirmed by experimental data. Starting from the two most plausible docking poses selected from the study by Galvagni et al. (2017), we applied a four-step workflow to screen for the most trustworthy one.
In the first step, based on an evolutionary statistical approach, our aim was to find co-evolved residues between CD93 and MMRN2: if a protein-protein interaction is conserved across enough sequenced genomes, using a single pair per genome can give accurate predictions of the interacting residues. The analysis of correlated evolutionary sequence changes across proteins can identify residues that are in close proximity (below a threshold distance of 8 Å) with enough accuracy to determine the three-dimensional structure of the protein complexes; consequently, it can be used to screen the reliability of the selected poses (Hopf et al., 2014). All results are summarized in Table 1 and represented in Figure 2; a comprehensive output of this analysis is reported in "Supplementary Data Sheet Section S2." In addition, in "Supplementary Data Sheet Section S3," the results are compared using ColabFold (Mirdita et al., 2021).
For Pose 1 (Figure 2A), the most interesting co-evolved residue pairs lie at a distance of 8 Å or less: Leu-610 (in MMRN2) and Glu-131 (in CD93) at 5.5 Å with a high probability score of 0.82; Ala-585 (in MMRN2) and Pro-245 (in CD93) at 8 Å with a probability score of 0.70. However, the other two pairs (Ala-597 in MMRN2 and Tyr-125 in CD93; Ala-585 in MMRN2 and Pro-245 in CD93) showed excellent probability scores of 0.98 and 0.70, respectively, but their distances are higher than the threshold of 8 Å.
In contrast, as shown in Figure 2B, there are no pairs of residues in Pose 2 with a distance below 8 Å. The closest co-evolved pair is MMRN2 Ala-585/CD93 Pro-245, with a probability score of 0.70 and a distance of 11.5 Å. Unlike in Pose 1, all the other selected pairs of Pose 2 presented distances larger than 20 Å. Because Pose 1 contains two co-evolved residue pairs located at a reasonable distance, this first analysis suggested that it is probably the more reliable pose.
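
The first-step screen can be sketched as a simple distance filter. The example below is a minimal illustration rather than the pipeline used in the study: the class and function names are hypothetical, and only the pairs whose distance and probability score are quoted in the text are included (the full list is in Table 1).

```python
from dataclasses import dataclass

CONTACT_THRESHOLD = 8.0  # Å, the proximity threshold quoted in the text


@dataclass(frozen=True)
class CoupledPair:
    mmrn2_residue: str
    cd93_residue: str
    probability: float  # co-evolution probability score
    distance: float     # inter-residue distance in the docked pose (Å)


def supported_pairs(pairs, threshold=CONTACT_THRESHOLD):
    """Return the co-evolved pairs that lie within the contact threshold."""
    return [p for p in pairs if p.distance <= threshold]


# Values quoted in the text for each pose.
pose1 = [
    CoupledPair("Leu-610", "Glu-131", 0.82, 5.5),
    CoupledPair("Ala-585", "Pro-245", 0.70, 8.0),
]
pose2 = [
    CoupledPair("Ala-585", "Pro-245", 0.70, 11.5),
]

for name, pairs in (("Pose 1", pose1), ("Pose 2", pose2)):
    print(name, len(supported_pairs(pairs)))
```

With these values, Pose 1 retains two pairs within the threshold and Pose 2 retains none, which is the comparison the first step relies on.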
The second step evaluated whether the two selected poses fitted the experimental results derived from the study of Galvagni et al. (2017), in which the CD93/MMRN2 binding strength of wild-type and mutated proteins was extensively analyzed by solid phase assay (see Table 2 for the full list of results). To characterize the interacting surface of CD93, point mutations were introduced, and the CD93 mutants expressed in 293T cells were analyzed using Western blots to assess the expression of the soluble recombinant proteins. Every experimental missense mutation was evaluated with mCSM-PPI2, a novel machine learning computational tool designed to predict the effects of missense mutations on protein-protein interaction binding affinity more accurately (Rodrigues et al., 2019). The predicted binding affinity scores were compared with the experimental results: the pose whose mCSM-PPI2 affinity predictions best fit the experimental results could be considered more reliable than the other one. For Pose 1 (Table 2), every binding strength prediction fitted the experimental results apart from D249A (experimental results: increasing, prediction affinity: moderately decreasing) and H236A (experimental results: decreasing, prediction affinity: weakly increasing, with a large distance to the interface of around 10 Å). For Pose 2 (Table 2), most predictions fitted the experimental results, with two serious exceptions: E100R (experimental results: slightly increasing, prediction affinity: strongly decreasing, with a very short distance to the interface of around 3.2 Å) and D249A (experimental results: increasing, prediction affinity: strongly decreasing, with a distance to the interface of 3.2 Å). In conclusion to this second step, Pose 1 fit the experimental results best.
As a third step, we mapped the gnomAD missense variants to the docking pose structures (Karczewski et al., 2021), obtaining a b-factor equal to 1 or −1, where "1" indicates one or more missense variants found at this amino acid position and "−1" represents no missense variants seen at this amino acid position. Based on these results, we verified whether such residues are located at the binding interface. Thus, the complex structure of every pose was analyzed using the PDBePISA tool for the selection of interface residues (Krissinel and Henrick, 2007). By comparing gnomAD and PDBePISA, we noticed that in Pose 1, the total number of variants (excluding missense) seen at the interface positions of CD93 is larger than that in Pose 2. Specifically, in Pose 1, for a total of 47 interface residues, we observed variants at 26 amino acid positions, while in Pose 2, for a total of 36 interface residues, no missense variants were found in only 15 positions (Table 3). This finding suggests a greater tolerance for missense mutations in Pose 1 than in Pose 2, as confirmed by the following fourth step.
TABLE 3 | Results from mapping the gnomAD missense variants to CD93: b-factor = 1/−1, where "1" indicates one or more missense variants found at this amino acid position, and "−1" represents no missense variants seen at this amino acid position; next to each one is the associated MTR score.
The fourth step was based on the exploration of regional intolerance to missense variation in interface residues located in Pose 1 and Pose 2 by looking into missense tolerance ratio (MTR) scores (Traynelis et al., 2017; Silk et al., 2019). It was crucial in this last step to evaluate missense variant deleteriousness by examining the surrounding regional intolerance and to calculate the MTR scores at each position (Table 3). There are no missense-intolerant regions in the CD93 interfaces in either Pose 1 or Pose 2.
Summarizing our results, Pose 1 is the most plausible docking pose, consistent with experimental data.
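
The second-step comparison between assay results and mCSM-PPI2 predictions can be illustrated with a small tally. This is only a sketch under stated assumptions: the names are hypothetical, only the disagreeing mutations described in the text are listed (the agreeing ones are in Table 2), and treating a "strong" predicted effect in the wrong direction as a serious disagreement is an illustrative criterion, not one defined by the study.

```python
from typing import NamedTuple


class Call(NamedTuple):
    mutation: str
    experimental: str  # "increase" or "decrease" in binding (solid phase assay)
    predicted: str     # direction of the predicted affinity change
    strength: str      # predicted magnitude: "weak", "moderate", or "strong"


def disagreements(calls):
    """Mutations whose predicted direction differs from the experiment."""
    return [c for c in calls if c.experimental != c.predicted]


def strong_disagreements(calls):
    """Disagreements where the predicted effect is strong (assumed criterion)."""
    return [c for c in disagreements(calls) if c.strength == "strong"]


# Exceptions described in the text for each pose.
pose1 = [
    Call("D249A", "increase", "decrease", "moderate"),
    Call("H236A", "decrease", "increase", "weak"),
]
pose2 = [
    Call("E100R", "increase", "decrease", "strong"),
    Call("D249A", "increase", "decrease", "strong"),
]

for name, calls in (("Pose 1", pose1), ("Pose 2", pose2)):
    print(name, len(disagreements(calls)), len(strong_disagreements(calls)))
```

Both poses show two exceptions, but only those of Pose 2 involve strongly predicted effects in the opposite direction, which matches the qualitative reading given for the second step.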

CONCLUSION
In silico and experimental procedures were used as a basis to determine the CD93 structure-function relationship. The CD93/MMRN2 complex was analyzed in vitro, dissecting interactions occurring in specific conditions. Protein docking procedures were used to predict the involvement of specific amino acid residues in the CD93/MMRN2 interaction. Structural analyses were conducted using bioinformatic tools for protein-protein interface regions, showing the key role of amino acid residues in the interaction. Here, we provide an approach to evaluate the best pose of protein-protein complex structures according to new experimental data. With the application of bioinformatic tools, we have described a four-step workflow to predict the effects of variations on the inter-residue interaction network at the PPI, based on evolutionary information, complex network metrics, and energetic affinity; in addition, the workflow allows us to map and explore regional intolerance to missense variation. These observations could provide a basis to strengthen the bioinformatics process involved in the development of new drugs.

LIMITATION OF THIS STUDY
A comparison of in silico and experimental procedures was effective for the determination of the structure-function relationship in a protein-protein interaction. However, it is important to emphasize that this exploratory study is based on only two docking poses; thus, it could be interesting to expand and further verify this four-step method on a larger case series to screen a higher number of potential docking poses (Orlandini et al., 2008).

DATA AVAILABILITY STATEMENT
Publicly available datasets were analyzed in this study. These data can be found here: https://www.uniprot.org/uniprot, Q9NPY3, Q9H8L6.

AUTHOR CONTRIBUTIONS
OS, MO, and DA contributed to conception and design of the study. FG and MM organized the molecular biology dataset. AV and MS performed the statistical analysis. FP wrote the first draft of the manuscript. VC, LF, MK, and FN wrote sections of the manuscript. AS supervised the study. All authors contributed to manuscript revision, read, and approved the submitted version.