- 1Laboratorio de Biotecnología Genómica, Centro de Biotecnología Genómica, Instituto Politécnico Nacional, Cd Reynosa, Tamaulipas, México
- 2Department of Molecular Biology and Microbiology, Tufts University School of Medicine, Boston, MA, United States
- 3Department of Oral Biology, Rutgers School of Dental Medicine, Newark, NJ, United States
Bdellovibrio bacteriovorus is the most studied member of a group of small motile Gram-negative bacteria called Bdellovibrio and Like Organisms (BALOs). B. bacteriovorus can prey on Gram-negative bacteria, including multi-drug resistant pathogens, and has been proposed as an alternative to antibiotics. Although the life cycle of B. bacteriovorus is well characterized, some molecular aspects of the B. bacteriovorus-prey interaction are poorly understood. Many studies have implicated hypothetical proteins of unestablished function in B. bacteriovorus predation. Our AlphaFold-based approach to characterizing these proteins has revealed novel interactions among attack-phase hypothetical proteins, which may be involved in less understood mechanisms of the Bdellovibrio attack phase. Here, we overlapped attack phase genes from B. bacteriovorus transcriptomic data sets and from transposon sequencing data sets to generate a set of proteins that are both expressed at the attack phase and necessary for predation, which we termed Attack Phase Predation-Essential Proteins (AP-PEP). By applying the Markov Cluster Algorithm and AlphaFold-Multimer to analyze the protein network and interaction partners of the AP-PEPs, we predicted high-confidence protein-protein interactions and two structurally similar but unique novel protein complexes formed among proteins of the Bd2209-Bd2212 and Bd2723-Bd2726 operons. Furthermore, we confirmed the interaction between hypothetical proteins Bd0075 and Bd0474 using the Bacterial Adenylate Cyclase Two-Hybrid system. In addition, we confirmed that the C-terminal domain of Bd0075, which contains Tetratricopeptide repeat motifs, participates principally in its interaction with Bd0474. This study revealed previously unknown cooperation among predation-essential hypothetical proteins in attack-phase B. bacteriovorus and paves the way for further work to understand the molecular mechanisms of BALO predation.
1 Introduction
Bdellovibrio bacteriovorus is a Gram-negative obligate predator that can prey on a wide range of Gram-negative bacteria, including multi-drug resistant pathogens such as clinical isolates of Acinetobacter baumannii and Pseudomonas aeruginosa, and has been proposed as an alternative to antibiotics (Abulude et al., 2023; Cavallo et al., 2021). The life cycle of Bdellovibrio bacteriovorus starts with the Attack Phase (AP), in which fast-swimming AP cells locate their prey, attach to it, and enter the prey’s periplasm, forming a spherical bdelloblast (Herencias et al., 2020; Makowski et al., 2019). In the Growth Phase (GP), Bdellovibrio consumes the prey’s cytoplasmic content to supply nutrients for multiplication. When nutrients are depleted, the Bdellovibrio cells escape to start a new attack cycle (Ajao et al., 2022; Evans et al., 2007).
The life cycle of B. bacteriovorus is well characterized, and several genes essential for predation have been identified. However, many molecular aspects of B. bacteriovorus predation and growth remain unknown (Dwyer and Volle, 2019; Lai et al., 2023; Medina et al., 2008; Rotema et al., 2015). Arguably, hypothetical proteins may help fill this knowledge gap. This hypothesis becomes more compelling, considering many genes identified as important for predation in B. bacteriovorus code for hypothetical proteins whose molecular functions are unknown. For example, in one study, of 16 genes identified as “predation-essential” 7 coded for hypothetical proteins (Tudor et al., 2008). Hypothetical proteins also constituted 72% of 240 “predatosome” genes identified by Lambert and coworkers as specifically upregulated during predation (Lambert et al., 2010). Moreover, hypothetical proteins formed the largest functional category (39.42%) of 104 genes listed as essential for predation during genome-wide transposon sequencing (Tn-seq) characterization of gene function in B. bacteriovorus (Duncan et al., 2019). In a recent study, Caulton et al. revealed the MAT superfamily, a group of trimeric fiber proteins with diversified adhesive tips that function as prey recognition moieties (Caulton et al., 2024). Four of the six MAT proteins implicated in prey recognition are annotated as hypothetical proteins in the B. bacteriovorus HD100 genome as of this writing (Caulton et al., 2024). The abundance of hypothetical proteins implicated in B. bacteriovorus predation calls for further studies.
In this study, we cross-referenced genes from two transcriptomic data sets of B. bacteriovorus (Karunker et al., 2013; Lambert et al., 2010) with a Tn-seq data set (Duncan et al., 2019) to create an overlapping set of proteins expressed at the attack phase and necessary for predation. We refer to these as “Attack Phase Predation-Essential Proteins (AP-PEP)”. Using AlphaFold-Multimer, we predicted protein-protein interactions among AP-PEP proteins. Using the Bacterial Two-Hybrid system, we showed that Bd0075, containing Tetratricopeptide repeat (TPR) domains, interacts with Bd0474, a forkhead-associated (FHA) domain-containing protein. Also, as predicted by AlphaFold, we demonstrated that the C-terminal domains of both proteins are responsible for the interaction. Furthermore, we report two structurally similar novel protein complexes formed by the Bd2209-Bd2212 and Bd2723-Bd2726 operons.
2 Materials and methods
2.1 Strains and culture conditions
Escherichia coli strains XL1Blue and BTH101 were grown in Luria Bertani (LB) with 10 μg/mL tetracycline and 100 μg/mL streptomycin, respectively, and were maintained at 37°C shaking. Bdellovibrio bacteriovorus 109J was co-cultured with E. coli DH5α prey in HEPES buffer at 29°C shaking as described previously (Jurkevitch, 2012).
2.2 Clustering of orthologous proteins
Proteins expressed in the attack phase were obtained from the transcriptomic data of Lambert and coworkers (Lambert et al., 2010) and Karunker and coworkers (Karunker et al., 2013). These, together with genes from the Tn-seq data of Duncan and coworkers (Duncan et al., 2019), were mapped for overlaps using OrthoVenn3 (Sun et al., 2023), which clustered orthologous or identical proteins.
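The overlap step can be illustrated with a minimal sketch. It assumes, as a simplification, that the three data sets can be compared by identical identifiers; OrthoVenn3 instead clusters orthologous or identical protein sequences, so this is not the tool's actual procedure. All identifiers and variable names below are hypothetical placeholders.

```python
def three_way_overlap(set_a, set_b, set_c):
    """Return the identifiers present in all three inputs, sorted."""
    return sorted(set(set_a) & set(set_b) & set(set_c))


# Hypothetical placeholder identifiers, not real locus tags.
transcriptome_1 = ["tagA", "tagB", "tagC", "tagD"]
transcriptome_2 = ["tagB", "tagC", "tagE"]
tnseq_essential = ["tagC", "tagB", "tagF"]

overlap = three_way_overlap(transcriptome_1, transcriptome_2, tnseq_essential)
print(overlap)
```

Identifiers that survive all three filters correspond conceptually to the AP-PEP set: expressed in the attack phase in both transcriptomes and essential in the Tn-seq screen.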
2.3 Protein sequence annotation, domain characterization and structural modelling
InterPro (Blum et al., 2021) was used to scan the input amino acid sequences for families, conserved domains, and sites by sequence comparison. InterPro and PHOBIUS (Käll et al., 2004) were used to predict the transmembrane topology of proteins and to annotate their amino acid sequences in cytoplasmic, intermembrane, and non-cytoplasmic regions (Blum et al., 2021). NCBI’s Conserved Domain Database (CDD) (Marchler-Bauer et al., 2017) was used to find conserved domains. SWISS-MODEL (Waterhouse et al., 2018) and Foldseek (van Kempen et al., 2023) were used for 3D-homology searches. US-align (Zhang et al., 2022) was used for structural analysis of proteins and protein complexes.
2.4 Physicochemical properties of proteins
The ExPASy ProtParam server (https://web.expasy.org/cgi-bin/protparam/protparam) (Walker et al., 2005) was used to predict physicochemical properties. Molecular weight, theoretical pI, amino acid composition, atomic composition, instability index, aliphatic index, and grand average of hydropathicity (GRAVY), amongst other physicochemical properties, were deduced from the primary sequences of the proteins.
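Two of these sequence-derived properties are simple enough to sketch directly. The block below is an illustration under stated assumptions, not the server's implementation: GRAVY is taken as the mean Kyte-Doolittle hydropathy over all residues, and the aliphatic index as X(Ala) + 2.9·X(Val) + 3.9·(X(Ile) + X(Leu)), with X in mole percent. The function names and the toy sequence are hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Grand average of hydropathicity: mean hydropathy per residue."""
    seq = sequence.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(sequence):
    """Aliphatic index from mole percentages of Ala, Val, Ile and Leu."""
    seq = sequence.upper()

    def mole_percent(aa):
        return 100.0 * seq.count(aa) / len(seq)

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


toy = "AVIL"  # hypothetical toy peptide, not a protein from this study
print(gravy(toy), aliphatic_index(toy))
```

Positive GRAVY values indicate an overall hydrophobic sequence, and a higher aliphatic index reflects a larger relative volume occupied by aliphatic side chains.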
2.5 Estimation of binding affinities
Putative binding affinities of protein-protein interactions and complexes were estimated using PRODIGY (Vangone and Bonvin, 2017).
2.6 Protein-protein interaction network prediction
Protein association networks were predicted using the STRING database (von Mering et al., 2003). STRING output was visualized using Cytoscape 3.9 (Shannon et al., 2003). Protein-protein interactions (PPI) were predicted with AlphaFold2-Multimer version 3 via the ColabFold server (Evans et al., 2022; Mirdita et al., 2022). An AlphaFold-Multimer interface predicted template modeling (ipTM) score <0.55 has been shown to indicate random predictions, while scores of 0.55–0.85 perform better than random, with increasing accuracy (Homma et al., 2024; O’Reilly et al., 2023; Zhu et al., 2023). The predicted interaction models were viewed and analyzed using ChimeraX version 1.5 (Pettersen et al., 2021).
2.7 Bacterial adenylate cyclase two-hybrid test
Protein-protein interaction partners were confirmed experimentally using the Bacterial Adenylate Cyclase Two-Hybrid (BACTH) system (Euromedex No. EUK001). Each of the two genes coding for proteins that could interact was inserted into either the pUT18C/pUT18 or the pKT25/pKTN25 BACTH plasmids. This allows each protein to be fused to the T18 or T25 fragment of Bordetella pertussis adenylate cyclase, respectively. These plasmid constructs were co-transformed into chemically competent E. coli BTH101, and transformants were selected on LB agar plates supplemented with 0.5 mM isopropyl-β-D-thiogalactopyranoside (IPTG), 40 μg/mL 5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside (X-Gal), 50 μg/mL kanamycin and 100 μg/mL ampicillin, incubated for 36 h at 30°C. A co-transformation of E. coli BTH101 with pKT25-zip and pUT18C-zip served as the positive control.
2.8 Construction of specific fragments for protein domains
Primers were designed to amplify nucleotide sequences corresponding to different domains of the Bd0075 and Bd0474 proteins. Engineered proteins lacking specific domains were constructed by fusing the nucleotide sequences upstream and downstream of the excluded region by Splicing by Overlap Extension (SOE) PCR. The resulting amplicons were cloned directionally into the BACTH expression vector pKT25 or pUT18C.
3 Results
3.1 Consensus predation-essential-hypothetical proteins
From the transcriptomics data published by Lambert et al. (2010), genes involved in the attack phase (AP) totaled 1,535 after eliminating redundancy caused by duplicates across the categories. Of these, sequences of 1,456 proteins were recoverable from NCBI GenBank; the remainder were unavailable, likely due to genome reannotation. Similarly, of the 421 AP genes from the transcriptomics data set of Karunker et al. (2013), 411 protein sequences were recoverable, while 101 of 105 proteins from the Tn-seq data set of Duncan et al. (2019) were recovered from NCBI GenBank. Using OrthoVenn3 to cluster orthologous proteins from the three data sets, a total of 818 proteins were clustered, while 1,064 proteins were unclustered (singletons) (Table 1). As shown in Figure 1, the overlapping region of the three data sets had 39 clusters containing 43 proteins. The 43 proteins, which are expressed in the attack phase and essential for predation (AP-PEP), are listed in Supplementary S4. Proteins from all the input data sets and clusters in all overlaps among the data sets are provided in Supplementary S1.

Figure 1. Venn diagram shows overlap and cluster among AP hypothetical proteins from RNA expression data sets (Lambert et al. and Karunker et al.) and hypothetical proteins from transposon data (Duncan et al.).
3.2 Prediction of protein-protein associations and MCL clustering with STRING
Each of the 43 proteins from the overlap among the three data sets was used as a query in the STRING database to predict functional protein associations. This resulted in 43 “local” interaction networks containing 169 unique proteins after eliminating redundancy caused by duplications. Using the 169 proteins, a “global” protein interaction network was constructed. To facilitate the identification of potential protein complexes or structures within the global network, we applied the Markov Cluster Algorithm (MCL) with an inflation value of 3. This analysis formed 27 distinct clusters, which were derived based on stochastic flow dynamics. Hypothetical proteins were found in interactions with other proteins across different clusters. Proteins in each cluster are given in Supplementary S2.
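For readers unfamiliar with MCL, the following is a minimal pure-Python sketch of the algorithm's core loop: expansion (matrix squaring) alternating with inflation (element-wise powering followed by column normalization) on a column-stochastic matrix. It is an illustration only and not the implementation used for the clustering reported here; the toy graph, tolerance values, and function names are hypothetical, while the inflation value of 3 follows the text.

```python
def normalize_columns(matrix):
    """Scale each column in place so that it sums to 1."""
    n = len(matrix)
    for j in range(n):
        total = sum(matrix[i][j] for i in range(n))
        if total > 0:
            for i in range(n):
                matrix[i][j] /= total
    return matrix


def square(matrix):
    """Expansion step: multiply the matrix by itself."""
    n = len(matrix)
    return [[sum(matrix[i][k] * matrix[k][j] for k in range(n))
             for j in range(n)] for i in range(n)]


def mcl(adjacency, inflation=3.0, max_iter=100, tol=1e-9, eps=1e-6):
    """Cluster an undirected graph given as a square 0/1 adjacency matrix."""
    n = len(adjacency)
    m = [[float(adjacency[i][j]) for j in range(n)] for i in range(n)]
    for i in range(n):
        m[i][i] = 1.0  # self-loops, a common stabilizing convention
    normalize_columns(m)
    for _ in range(max_iter):
        nxt = [[x ** inflation for x in row] for row in square(m)]
        normalize_columns(nxt)
        delta = max(abs(nxt[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = nxt
        if delta < tol:
            break
    # Rows with a non-zero diagonal are attractors; the non-zero
    # columns of an attractor row are the members of its cluster.
    clusters = {frozenset(j for j in range(n) if m[i][j] > eps)
                for i in range(n) if m[i][i] > eps}
    return sorted(sorted(c) for c in clusters)


def triangles(bridge):
    """Two triangles (nodes 0-2 and 3-5), optionally joined by edge 2-3."""
    edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]
    if bridge:
        edges.append((2, 3))
    adj = [[0] * 6 for _ in range(6)]
    for a, b in edges:
        adj[a][b] = adj[b][a] = 1
    return adj


separate = mcl(triangles(bridge=False))
bridged = mcl(triangles(bridge=True))
print(separate)
print(bridged)
```

Higher inflation values sharpen the contrast between strong and weak flows and therefore tend to produce more, smaller clusters.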
3.3 Prediction of direct protein-protein interaction with AlphaFold-Multimer
Employing AlphaFold-Multimer, we explored clusters 4, 10, and 12 (Figure 2) for direct protein-protein interactions. Cluster 4 forms an interesting group, as most of its proteins have one or more Forkhead-associated (FHA) or TPR domains, which are known to play roles in protein-protein interaction. Clusters 10 and 12 had members that formed complete operons. Cluster 12 had four of its five proteins annotated as hypothetical proteins.

Figure 2. (A) Cluster 4, containing proteins that include Forkhead-associated (FHA) domains and Tetratricopeptide Repeat (TPR) domains. (B) Cluster 10, containing proteins from the Bd2723-Bd2726 operon. (C) Cluster 12, containing proteins from the Bd2209-Bd2212 operon.
All interactions within each cluster were assigned interaction scores by STRING based on criteria such as gene neighborhood, gene fusions, co-expression, and gene co-occurrence across genomes. STRING combined scores at or above the cut-off of 0.7, which was set from the lowest score in a set of experimentally established protein-protein interactions (positive controls), were considered relevant. STRING-predicted interactions above the cut-off were tested for direct protein-protein interaction in AlphaFold-Multimer.
AlphaFold-Multimer ipTM (interface predicted template modeling) scores between 0.6 and 0.8 are considered confident, while ipTM scores >0.8 are considered highly confident. AlphaFold-Multimer ipTM scores of interactions are given in Supplementary S3. A summary of positive interactions by AlphaFold-Multimer (i.e., ipTM score above 0.6) is shown in Table 2. A comparison of mean ipTM scores between positive controls (known interactions) and predicted interactions yielded a p-value of 0.18 (t-test), indicating no statistically significant difference between the means (Figure 3).
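The ipTM bands described above can be expressed as a small helper. This is a sketch for illustration: the band boundaries follow the text, but the handling of scores exactly at 0.6 and 0.8 is an assumption, and the function and label names are hypothetical. The example scores are the ipTM values reported in this article.

```python
def classify_iptm(score, cutoff=0.6, high=0.8):
    """Map an ipTM score to the confidence bands used in the text."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("ipTM scores lie between 0 and 1")
    if score > high:
        return "highly confident"
    if score >= cutoff:  # assumption: the cutoff itself counts as confident
        return "confident"
    return "below cutoff"


# ipTM values reported in this article.
reported = {
    "Bd0075-Bd0474": 0.619,
    "Bd0075-Bd3743": 0.676,
    "Bd0473-Bd0475": 0.604,
    "Bd2209-Bd2212 complex": 0.852,
    "Bd2723-Bd2726 complex": 0.837,
}

bands = {name: classify_iptm(score) for name, score in reported.items()}
for name, band in bands.items():
    print(f"{name}: {band}")
```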

Figure 3. (A) Comparison of ipTM scores among known interactions (positive controls) and unknown interactions (B) Comparison of mean ipTM scores of known and unknown interactions. A p-value of 0.18 indicates no statistically significant difference between the means as determined by the t-test.
3.4 Complexes from clusters 10 and 12
Bd2723, Bd2724, Bd2725, and Bd2726 in cluster 10 are encoded by genes within an operon, while the genes encoding Bd2209, Bd2210, Bd2211, and Bd2212 in cluster 12 constitute another operon. Each operon encodes proteins that form a predicted protein complex comprising all four member proteins. Interestingly, the two operons have a similar gene arrangement, and corresponding pairs from the two operons have similar sizes (Figure 4). BLAST analysis shows that corresponding gene pairs have low percentage identities. BLAST E-values and percentage identities are shown in Table 3. Simulation of the predicted complexes in AlphaFold-Multimer by stepwise addition of each protein (in order of largest to smallest) gave highly confident ipTM scores at each step. The largest proteins in the two operons, Bd2212 and Bd2726, respectively, acted like hubs into which the other proteins fitted. The predicted structure of each complex is shown in Figure 5. The complexes exhibit a similar overall structure, with a TM-score of 0.80356 indicating that they share highly similar folds. However, the RMSD of 4.17 Å suggests that there are differences at the atomic level, particularly in their fine details. The superimposed structure of the complexes is given in Supplementary S5.
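A high TM-score can coexist with an RMSD of several ångströms because the two measures weight deviations differently. The sketch below makes this concrete using the standard TM-score form, (1/L) Σ 1/(1 + (d_i/d0)²) with d0 = 1.24·(L − 15)^(1/3) − 1.8, evaluated on per-residue distances from an already fixed superposition. It is a simplified illustration and not the US-align procedure, which also searches for the optimal alignment and superposition; the toy distances and function names are hypothetical.

```python
import math


def tm_score(distances, ref_length):
    """TM-score for aligned residue distances (in angstroms), normalized by ref_length."""
    d0 = 1.24 * (ref_length - 15) ** (1.0 / 3.0) - 1.8 if ref_length > 15 else 0.0
    if d0 <= 0:
        raise ValueError("reference length too short for this d0 formula")
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / ref_length


def rmsd(distances):
    """Root-mean-square deviation over the aligned residue distances."""
    return math.sqrt(sum(d * d for d in distances) / len(distances))


# Hypothetical toy case: 90 well-superposed residues and 10 outliers.
toy_distances = [1.0] * 90 + [12.0] * 10
toy_tm = tm_score(toy_distances, ref_length=100)
toy_rmsd = rmsd(toy_distances)
print(round(toy_tm, 3), round(toy_rmsd, 2))
```

In this toy case a small number of poorly superposed residues inflates the RMSD, while the TM-score, which caps the penalty for any single residue, remains high.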

Figure 4. Similar genomic structure of the Bd2209-Bd2212 and Bd2723-Bd2726 operons. The BLAST identity scale (grey) shows limited identity between the operons (Bd2209/Bd2723: 39.14%, Bd2210/Bd2724: 36.56%).

Table 3. BLAST E-values and percentage identities of corresponding protein pairs from the Bd2209-Bd2212 and Bd2723-Bd2726 operons.

Figure 5. (5Ai) AlphaFold-Multimer model of the Bd2209-Bd2212 complex. (5Aii) Predicted Aligned Error (PAE) matrices, showing confidence in structural predictions for the Bd2209-Bd2212 complex. (5Bi) AlphaFold-Multimer model of the Bd2723-Bd2726 complex. (5Bii) Predicted Aligned Error (PAE) matrices, showing confidence in structural predictions for the Bd2723-Bd2726 complex.
3.5 Proteins and interaction from cluster 4
From cluster 4, three interactions with ipTM >0.6 (Bd0075 + Bd0474, Bd0075 + Bd3743, and Bd0475 + Bd0473) were selected for further analysis.
3.5.1 Bd0075 and Bd0474 protein structures and interaction
The AlphaFold model of Bd0075 shows three distinct domains termed A, B, and C. Using InterPro, Domain-A (Met1 to Glu83), which had no annotation, is predicted to be in the cytoplasm. In contrast, Domain-B (Gly193 to Gly488) and Domain-C (Asp489 to Asn965), containing TPR repeats, are predicted to be extracytoplasmic (Figure 6A). The AlphaFold structure of Bd0474 also showed three distinct Domains. Domain-A (from Ala2 to Ala100) and B (from Met134 to Glu241) are FHA domains and predicted to be in the cytoplasm, while Domain-C (from Ser363 to Ala673) containing TPR repeats is extracytoplasmic (Figure 6B).

Figure 6. (A) The Bd0075 protein. Domain-A (Purple) Domain-B (Blue) and Domain-C (Green). (B) The Bd0474 protein. Domain-A (Purple), Domain-B (Blue), and Domain-C (Green).
To check the conservation of residues within each domain in Bd0075, we calculated conservation scores for each aligned position based on a multiple sequence alignment of 14 Bdellovibrio sequences (both intraperiplasmic and epibiotic Bdellovibrio species) (Supplementary S4). A threshold of 0.1 was established to identify highly conserved positions, indicating that at these sites, 90% or more of the sequences exhibit identical residues. The green-colored regions in the conservation plot represent positions with scores below this threshold and are thus considered statistically significant for conservation (Figure 7A). These conserved positions likely correspond to functionally important areas within the protein structure. As expected, the regions corresponding to the known TPR domains in Bd0075 were conserved. In addition, the N-terminal region corresponding to the Bd0075 Domain-A is also conserved.
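The exact scoring scheme is not spelled out above, so the sketch below adopts one simple assumption that reproduces the stated interpretation: the score of a column is one minus the frequency of its most common residue, so that a score of 0.1 or less means at least 90% of sequences share the same residue. Gap characters are counted like any other symbol. The toy alignment of 14 sequences and all names are hypothetical.

```python
from collections import Counter


def conservation_scores(alignment):
    """Per-column score: 1 - frequency of the most common residue (assumed scheme)."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    scores = []
    for column in zip(*alignment):
        top_count = Counter(column).most_common(1)[0][1]
        scores.append(1.0 - top_count / len(column))
    return scores


def conserved_positions(alignment, threshold=0.1):
    """Zero-based indices of columns whose score is at or below the threshold."""
    return [i for i, score in enumerate(conservation_scores(alignment))
            if score <= threshold]


# Hypothetical toy alignment of 14 sequences, three columns each:
# column 0 is invariant, column 1 has one variant, column 2 has two.
toy_alignment = ["GPY"] * 12 + ["GPF", "GAF"]
toy_conserved = conserved_positions(toy_alignment)
print(toy_conserved)
```

With 14 sequences, a single deviating residue still leaves a column under the 0.1 threshold (1/14 ≈ 0.07), whereas two deviations do not (2/14 ≈ 0.14).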

Figure 7. (A) A plot showing the conservation scores across the alignment of Bd0075 from 14 Bdellovibrio species, with conservation scores plotted on the y-axis and positions in the alignment on the x-axis. Regions highlighted in green represent positions with conservation scores below a threshold of 0.1, indicating that these positions are conserved across species. Conversely, positions with higher conservation scores (above 0.1) are shown in steel blue, indicating lower conservation and greater variability among the sequences. The region circled in black shows the conservation of Domain-A of Bd0075. (B) Multiple alignment of protein representatives of the GYF domain from NCBI-CDD with Bd0075 of Bdellovibrio bacteriovorus 109J.
We therefore sought to annotate Domain-A based on structural homology with known protein structures. Using the first 166 amino acids of Bd0075, which correspond to the cytoplasmic region containing Domain-A, a 3D-homology search with SWISS-MODEL showed that Domain-A shares structural similarity with the glycine-tyrosine-phenylalanine (GYF) domain found in the human intracellular CD2 binding protein 2 (CD2BP2). This domain has the conserved motif GP[YF]xxxx[MV]xxWxxx[GN]YF. A multiple sequence alignment of representative GYF-containing proteins from NCBI-CDD, including those from human, chicken, and yeast, with the GYF domain of B. bacteriovorus 109J Bd0075 is shown in Figure 7B. Despite its prokaryotic origin, Bd0075 aligns well with the motif from these eukaryotic proteins. In Bd0075, the use of isoleucine at position 8 of the motif is consistent with representative species like Arabidopsis thaliana and Schizosaccharomyces pombe. However, Bd0075 substitutes Tryptophan for Methionine at position 11 (Figure 7B). In addition, a Foldseek 3D-homology search against several of its member databases, including CATH50 and AFDB-proteome, returned protein hits with portions matching the GYF domain. Structural matches between Bd0075 Domain-A and portions of the protein hits, with their respective TM-scores and RMSD values, are given in Supplementary S5.
Using AlphaFold-Multimer, we predicted an interaction between Bd0075 and Bd0474 with an ipTM score of 0.619; the interaction model is given in Supplementary S5. Although this score falls in the grey area, the ipTM range of 0.6–0.8 where predictions may be correct or wrong (https://www.ebi.ac.uk/training/online/courses/alphafold/inputs-and-outputs/evaluating-alphafolds-predicted-structures-using-confidence-scores/confidence-scores-in-alphafold-multimer/), our experimental analysis validated this interaction. Moreover, some known interactions, employed as positive controls in this study, had ipTM scores in the 0.6–0.8 range. The Bd0075-Bd0474 interaction is predicted to occur among amino acid residues in Domain-C of both proteins, which are co-located in the extracytoplasmic space. An AlphaFold-Multimer model of the Bd0075-Bd0474 interaction surface showing interacting residues, and a cartoon representation of how the proteins might interact in the membrane, are given in Figure 8. The interaction model of the complete domains is given in Supplementary S5.
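The GYF consensus motif quoted above can be scanned for with a regular expression, which also shows why a strict pattern search would miss variants such as the isoleucine at position 8. The sketch below converts the motif notation (with x as a wildcard) into a regex; the helper name and the synthetic test sequences are hypothetical and are not taken from Bd0075 or any real protein.

```python
import re


def motif_to_regex(motif):
    """Convert a motif such as 'GP[YF]xxxx[MV]xxWxxx[GN]YF' into a compiled regex.

    Spaces are ignored and each lowercase 'x' matches any single residue.
    """
    return re.compile(motif.replace(" ", "").replace("x", "."))


GYF_MOTIF = motif_to_regex("GP [YF]xxxx [MV]xxWxxx [GN]YF")

# Synthetic sequences for illustration only.
canonical = "AAGPYAAAAMAAWAAAGYFAA"    # satisfies the consensus
ile_variant = "AAGPYAAAAIAAWAAAGYFAA"  # isoleucine at motif position 8

hit = GYF_MOTIF.search(canonical)
miss = GYF_MOTIF.search(ile_variant)
print(hit.start(), hit.group())
print(miss)
```

Because the strict consensus rejects the isoleucine variant, divergent family members are better detected by alignment or structure-based searches, as done here, than by exact motif matching.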

Figure 8. (A) Cartoon of the Bd0075-Bd0474 interaction in the membrane. (B) Close-up view of the 3D structure of domain-domain interactions between Bd0075 and Bd0474.
3.5.2 Bd0075 and Bd3743 interaction
Bd0075 was also predicted to interact (ipTM = 0.676) with the helix-turn-helix domain-containing protein Bd3743. Bd3743 contains, at its C-terminus, an OmpR/PhoB-type DNA-binding domain of the kind found in response regulators, but it lacks the phosphoacceptor receiver (REC) domain at its N-terminus and instead possesses a TPR domain. Interestingly, in AlphaFold-Multimer models, Bd3743 binds to Bd0075 at the same sites where Bd0474 binds to Bd0075. The amino acid residues in Bd0075 involved in binding Bd0474 and Bd3743 are shown in Table 4. Modeling interaction dynamics using the three proteins in AlphaFold-Multimer shows that the Bd0075 TPR domain prefers Bd0474 in the presence of Bd3743 (Supplementary S4).
3.5.3 Bd0475 and Bd0473 interaction
The proteins Bd0473 and Bd0475 are encoded in the same operon with Bd0470, Bd0471, Bd0472, and Bd0474. In AlphaFold-Multimer, Bd0473 (containing an FHA domain) interacts with Bd0475 (a hypothetical protein) with an ipTM score of 0.604.
3.6 Physicochemical properties of proteins and binding affinity of interactions and complexes
Based on their amino acid sequences, we computed physicochemical properties, including instability index, aliphatic index, and GRAVY, for the proteins examined in this study. All the proteins except Bd0473, Bd0474, and Bd0475 are predicted to be stable, with an instability index below 40 (Table 5). The binding energies of the protein-protein interactions and complexes computed using PRODIGY are shown in Table 6. The more negative the binding energy (ΔG, kcal mol-1), the greater the predicted binding affinity. The binding energies of the Bd0075-Bd0474 interaction (ΔG = −19.6 kcal mol-1) and the Bd0075-Bd3743 interaction (ΔG = −8.6 kcal mol-1) were consistent with the preferential binding of Bd0075 to Bd0474 in the presence of Bd3743. Binding energies of the Bd2209-Bd2212 complex (ΔG = −64.7 kcal mol-1) and the Bd2723-Bd2726 complex (ΔG = −50.6 kcal mol-1) indicate high affinity among the proteins in these complexes.
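The relation between a predicted binding free energy and a dissociation constant is ΔG = RT ln(Kd), so Kd = exp(ΔG / RT). The sketch below applies this conversion to the two ΔG values reported for Bd0075. The temperature of 25 °C (298.15 K) is an assumption made for illustration, and the values in Table 6 should be taken from the original PRODIGY output rather than from this sketch; the function names are hypothetical.

```python
import math

GAS_CONSTANT = 1.987204e-3  # kcal mol^-1 K^-1


def kd_from_delta_g(delta_g, temperature_k=298.15):
    """Dissociation constant (M) from binding free energy (kcal/mol)."""
    return math.exp(delta_g / (GAS_CONSTANT * temperature_k))


def delta_g_from_kd(kd, temperature_k=298.15):
    """Binding free energy (kcal/mol) from a dissociation constant (M)."""
    return GAS_CONSTANT * temperature_k * math.log(kd)


# Delta G values reported in the text (kcal/mol); 25 degrees C is assumed.
kd_bd0474 = kd_from_delta_g(-19.6)  # Bd0075-Bd0474
kd_bd3743 = kd_from_delta_g(-8.6)   # Bd0075-Bd3743
fold_difference = -19.6 / -8.6

print(f"{kd_bd0474:.2e} M, {kd_bd3743:.2e} M, ratio of dG = {fold_difference:.2f}")
```

Because Kd depends exponentially on ΔG, a roughly 2.3-fold difference in ΔG corresponds to a difference of many orders of magnitude in predicted Kd.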

Table 6. Binding affinities (ΔG), dissociation constant (Kd), Interfacial contacts (ICs), and percentages Non Interacting Surface (NIS) of examined interactions.
3.7 BACTH assay and confirmation of Bd0075 - Bd0474 interaction
We validated one of the predicted interactions, Bd0075-Bd0474, using the Bacterial Adenylate Cyclase Two-Hybrid (BACTH) assay. The positive control was set up as an interaction between the pKT25-Zip and pUT18C-Zip plasmids, which were co-transformed into E. coli BTH101 cells and plated on LB plates with 0.5 mM IPTG, 40 μg/mL X-Gal, 50 μg/mL kanamycin and 100 μg/mL ampicillin. The emergence of blue colonies on the plate indicated an interaction. The negative control, between the empty plasmids pKT25 and pUT18C, gave rise to white colonies, indicating no interaction. Co-transformation of pKT25:Bd0075 and pUT18C:Bd0474 into BTH101 yielded blue colonies on assay plates, confirming the interaction between the proteins.
Furthermore, Bd0075 Domain-A alone (Bd0075AnoTM), Bd0075 Domain-A with the transmembrane helix (Bd0075ATM), and Bd0075 with only Domains A and C (Bd0075AC) were used in BACTH assays with intact Bd0474. Bd0075AnoTM showed no interaction with Bd0474, as indicated by white colonies. The Bd0075ATM-Bd0474 pair gave colonies with slightly blue coloration, suggesting a role for the transmembrane helix in the interaction. The Bd0075AC-Bd0474 pair yielded blue colonies comparable with those from the Bd0075-Bd0474 interaction, showing that Domain-C is important for the interaction (Figure 9).

Figure 9. Experimental confirmation of protein-protein interaction by BACTH system. From right to left: Bd0075 interacts with Bd0474, Bd0075 Domain-C participates in the interaction, Transmembrane of Bd0075 may participate in the interaction, and Bd0075 Domain-A without transmembrane has no effect on the interaction.
Experiments with different domains of Bd0075 show that Domain-A alone is not sufficient for the Bd0075-Bd0474 interaction. Including the transmembrane helix (TMH) with Domain-A gave a slight blue color, which suggests that the TMH may contribute to the observed interaction. Engineered Bd0075 containing Domain-C restored the Bd0075-Bd0474 interaction, showing that, as predicted, Domain-C participates in the interaction (Figure 9).
3.8 Discussion
As B. bacteriovorus continues gaining attention as a means of controlling Gram-negative pathogen populations, many molecular aspects of Bdellovibrio predation remain unclear, partly due to the many hypothetical and uncharacterized proteins of unknown function encoded in its genome. To help narrow down hypothetical proteins for experimental exploration, we employed AlphaFold-Multimer, an artificial intelligence (AI)-based model developed by DeepMind that has significantly advanced the prediction of protein structures and multiprotein complexes. Using available gene expression and genome-wide transposon sequencing data, we generated a set of 43 Attack Phase Predation-Essential Proteins (AP-PEPs), from which a global protein interaction network involving 169 proteins was constructed. By clustering this network using MCL, we obtained 23 clusters and further investigated interactions from 3 of the clusters.
Only interactions with an AlphaFold-Multimer ipTM score above the 0.6 cutoff were considered in this study. An ipTM score <0.55 has been shown to indicate random predictions, while scores of 0.55–0.85 perform better than random, with increasing accuracy (Homma et al., 2024; O’Reilly et al., 2023). We acknowledge that AlphaFold-Multimer has limitations, particularly its propensity to produce false negative predictions. However, this limitation had minimal impact on our study, as we leveraged AlphaFold-Multimer’s ability to stringently identify positives while maintaining a low false positive rate of approximately 1%, as reported in previous studies (Homma et al., 2024; Johansson-Åkhe and Wallner, 2022; Omidi et al., 2024). This indicates that while some true positives may have been missed, only the most reliable interactions were selected from the vast network for experimental follow-up.
Bd0075 and Bd0474 were predicted to interact with an ipTM score of 0.619. These proteins have previously been implicated, alongside other proteins, as essential for B. bacteriovorus predation by transposon mutagenesis studies (Duncan et al., 2019). Not only are Bd0075 and Bd0474 expressed during the attack phase of predation, but we find, using the MicrobesOnline webserver (https://microbesonline.org/) (Dehal et al., 2009), that their overlaid expression profiles also show a very strong positive correlation, with a Pearson correlation coefficient of 0.99, suggesting that the proteins are co-expressed and can be simultaneously available for functional interaction in the cell. Analysis of physicochemical properties shows that Bd0474, with an instability index of 44.86, is predicted to be an unstable protein. Hence Bd0474 might benefit functionally from interacting with the more stable Bd0075 (instability index of 33.06). Our AlphaFold-Multimer model of the Bd0075-Bd0474 interaction shows that residues from the extracytoplasmic Domain-C of Bd0075 and Bd0474 participate in the interaction. Bacterial Adenylate Cyclase Two-Hybrid (BACTH) experiments confirmed the Bd0075-Bd0474 interaction. Furthermore, interaction experiments using different domains showed that Domain-C of Bd0075 is involved in the Bd0075-Bd0474 interaction and that the transmembrane helices of the proteins may contribute to the interaction and stabilization of the complex. The role of transmembrane helices in stabilizing protein-protein interactions and complexes in the membrane has been documented (Moore et al., 2008).
In addition, Bd0075 is predicted to bind Bd3743 at the same site in its Domain-C where Bd0474 binds. Since Bd0474 and Bd3743 seem to “compete” for the same site, we modeled interaction dynamics using the three proteins in AlphaFold-Multimer. Our results showed that the TPR site of Bd0075 prefers Bd0474. This was corroborated by the calculated binding affinities (ΔG): the ΔG of Bd0075-Bd0474 (−19.6 kcal mol-1) is about 2.3 times more negative than that of Bd0075-Bd3743 (−8.6 kcal mol-1). Interestingly, a previous report showed that Bd0474 is downregulated in the growth phase but not in the attack phase, and is hence attack phase-specific, while Bd3743 is upregulated in both phases (Lambert et al., 2010). The higher affinity of Bd0075 for Bd0474 may allow this preferential binding in the presence of Bd3743 during the attack phase. Later, in the growth phase, when the cellular level of Bd0474 is depleted, Bd3743 could bind to Bd0075.
Bd0075 shows three distinct domains: A, B, and C. Domain-B and Domain-C, located in the extracytoplasmic region, contain TPR domains. TPR domains are known to participate in protein-protein interactions and to facilitate the assembly of protein complexes (Cerveny et al., 2013), and they have been identified in proteins playing various roles in vital cell processes, including cell-cycle regulation, transcription, chaperone function, and cell signaling (Blatch and Lässle, 1999).
We found that the Domain-A of Bd0075, a small-sized independent domain in the cytoplasmic region of the protein, resembles the GYF domain of the human CD2 cytoplasmic domain binding G protein (CD2BP2). The GYF domain has the conserved motif, GP [YF]xxxx [MV]xxWxxx [GN]YF, and can bind sites containing two tandem PPPGHR segments within the cytoplasmic region of CD2. The existence of the eukaryote-associated GYF domain in B. bacteriovorus, could represent the possibility of a eukaryote-like domain being used by B. bacteriovorus. Recently, histones, which were generally thought of as being associated exclusively with eukaryotes, have been reported as major chromatin components in B. bacteriovorus (Hocher et al., 2023).
Furthermore, we identified two novel complexes in clusters 10 and 12 of our MCL clustered global network. Each of the complexes is constituted by a group of proteins belonging to the same operon. The first complex (Bd2212 complex) is formed by the operon consisting of proteins Bd2209, Bd2210, Bd2211, and Bd2212, while the second complex (Bd2726 complex) is formed by the operon consisting of Bd2723, Bd2724, Bd2725, and Bb2726. These two operons are expressed in the attack phase of predation and comprise mostly unexplored hypothetical proteins. Whereas all proteins in the Bd2212 complex are annotated as hypothetical/unknown proteins, the Bd2726 complex comprises hypothetical/unknown proteins except for Bd2724 (NCBI-Tfp) pilus assembly protein FimT/FimU) and Bd2725 (NCBI- type II secretion system protein). Interestingly, the hypothetical proteins Bd2210, Bd2211, Bd2212, and Bd2723 have been linked with phenotypes with loss of predation in a transposon mutagenesis study (Duncan et al., 2019).
The Bd2209-Bd2212 and Bd2723-Bd2726 operons have a similar arrangement of genes and form structurally similar complexes (TM-score = 0.80356). However, the corresponding genes of the two operons have low BLAST percentage identities, suggesting that these gene clusters have undergone significant evolutionary divergence or were acquired from different origins. Additionally, the RMSD of 4.17 Å between the matched complexes indicates differences at the atomic level. This sequence divergence could reflect functional specialization or adaptation to different environmental conditions or cellular processes. Studying operons of this type could offer valuable insights into the mechanisms of gene duplication, evolution, and functional divergence in B. bacteriovorus genomes.
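A high TM-score and a comparatively large RMSD can coexist because RMSD weights every residue pair equally, whereas the TM-score down-weights poorly matched positions. The sketch below evaluates both metrics for one fixed superposition, using the standard TM-score length normalization d0 = 1.24(L − 15)^(1/3) − 1.8. It is an illustration under assumptions: structure alignment tools additionally search for the best superposition, which is not done here, and the 0.5 Å floor on d0, the function names, and the distances are hypothetical.

```python
import math


def rmsd(distances):
    """RMSD (in Å) over distances between paired, already superposed residues."""
    return math.sqrt(sum(d * d for d in distances) / len(distances))


def tm_score(distances, target_length):
    """TM-score for one fixed superposition, normalized by target_length."""
    if target_length > 15:
        d0 = 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8
    else:
        d0 = 0.5
    d0 = max(d0, 0.5)  # assumed floor so that short chains keep a positive d0
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length


# Hypothetical example: 90 well-matched residue pairs and 10 poorly matched ones.
toy_distances = [1.0] * 90 + [12.0] * 10
print(round(rmsd(toy_distances), 2), round(tm_score(toy_distances, 100), 2))
```

In this toy case the ten poorly matched positions push the RMSD to almost 4 Å while the TM-score stays above 0.8, the same qualitative pattern as the values reported for the two complexes.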
In AlphaFold-Multimer models, Bd2212 interacted with the other proteins encoded in its operon, forming a complex with a high ipTM score of 0.852, comparable to the most stringent cutoff (ipTM = 0.85) used by O’Reilly and coworkers to study protein complexes (O’Reilly et al., 2023). Likewise, Bd2723 interacted with the other members of the Bd2723-Bd2726 operon to form a similar complex with a high ipTM score of 0.837. The binding affinities (ΔG) of the Bd2212 complex and the Bd2726 complex were −64.7 kcal mol⁻¹ and −50.6 kcal mol⁻¹, respectively, indicating a strong affinity among the members for complex formation.
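A binding free energy can be related to a dissociation constant through the standard relation ΔG = RT ln(Kd). The sketch below shows that conversion in both directions. It assumes a temperature of 25 °C and a 1 M standard state, neither of which is stated in this section, and the function names are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def kd_from_dg(delta_g_kcal, temp_c=25.0):
    """Dissociation constant (M) from binding free energy (kcal/mol)."""
    temp_k = temp_c + 273.15
    return math.exp(delta_g_kcal / (R_KCAL * temp_k))


def dg_from_kd(kd_molar, temp_c=25.0):
    """Binding free energy (kcal/mol) from dissociation constant (M)."""
    temp_k = temp_c + 273.15
    return R_KCAL * temp_k * math.log(kd_molar)


# A more negative ΔG corresponds to a smaller Kd, that is, tighter binding.
for dg in (-50.6, -64.7):
    print(dg, kd_from_dg(dg))
```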
4 Conclusion
Using two transcriptomic data sets and genome-wide transposon sequencing data, this study provides a robust functional interaction network comprising 169 proteins relevant to the B. bacteriovorus attack phase. This network was grouped into 23 clusters. By exploring 3 of these clusters, our approach found novel protein-protein interactions and complexes and validated the Bd0075-Bd0474 interaction. Furthermore, we show that, as predicted by AlphaFold-Multimer, Domain-C of Bd0075 is involved in the interaction. This demonstrates the potential of this approach for further significant discoveries among the remaining clusters. This study is the first to report the protein complexes formed by two operons (Bd2209 to Bd2212 and Bd2723 to Bd2726) that are similar in genomic structure in B. bacteriovorus and are both expressed in the attack phase. Future work will focus on the experimental exploration of these novel interactions and complexes. Taken together, our approach has not only discovered novel functional interactions and complexes but has also provided templates and resources for further discovery and exploration of these protein interactions, with implications for understanding the molecular mechanisms underlying B. bacteriovorus predation.
Data availability statement
The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding authors.
Author contributions
IA: Formal Analysis, Visualization, Writing – original draft, Writing – review and editing, Investigation, Data curation, Methodology, Validation. ICR-L: Project administration, Writing – review and editing. AS-V: Writing – review and editing, Resources. AC: Supervision, Validation, Writing – review and editing, Methodology, Resources. DK: Conceptualization, Supervision, Validation, Writing – review and editing, Resources. XG: Conceptualization, Supervision, Validation, Writing – review and editing, Funding acquisition.
Funding
The author(s) declare that financial support was received for the research and/or publication of this article. This research was supported by grants (20230747, 20231050) from the “Secretaría de Investigación y Posgrado del Instituto Politécnico Nacional”.
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Generative AI statement
The author(s) declare that Generative AI was used in the creation of this manuscript, as an image resolution enhancer.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Supplementary material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fbinf.2025.1566486/full#supplementary-material
References
Abulude, I. J., Kadouri, D. E., and Guo, X. (2023). Bdellovibrio bacteriovorus therapy, an emerging alternative to antibiotics. Lett. Drug Des. Discov. 21, 2505–2520. doi:10.2174/1570180820666230912161923
Ajao, Y. O., Rodríguez-Luna, I. C., Elufisan, T. O., Sánchez-Varela, A., Cortés-Espinosa, D. V., Camilli, A., et al. (2022). Bdellovibrio reynosensis sp. nov., from a Mexico soil sample. Int. J. Syst. Evol. Microbiol. 72, 005608. doi:10.1099/ijsem.0.005608
Blatch, G. L., and Lässle, M. (1999). The tetratricopeptide repeat: a structural motif mediating protein-protein interactions. BioEssays 21, 932–939. doi:10.1002/(sici)1521-1878(199911)21:11<932::aid-bies5>3.0.co;2-n
Blum, M., Chang, H. Y., Chuguransky, S., Grego, T., Kandasaamy, S., Mitchell, A., et al. (2021). The InterPro protein families and domains database: 20 years on. Nucleic Acids Res. 49, D344–D354. doi:10.1093/nar/gkaa977
Cadby, I. T., Basford, S. M., Nottingham, R., Meek, R., Lowry, R., Lambert, C., et al. (2019). Nucleotide signaling pathway convergence in a cAMP-sensing bacterial c-di-GMP phosphodiesterase. EMBO J. 38, e100772. doi:10.15252/embj.2018100772
Caulton, S. G., Lambert, C., Tyson, J., Radford, P., Al-Bayati, A., Greenwood, S., et al. (2024). Bdellovibrio bacteriovorus uses chimeric fibre proteins to recognize and invade a broad range of bacterial hosts. Nat. Microbiol. 9, 214–227. doi:10.1038/s41564-023-01552-2
Cavallo, F. M., Jordana, L., Friedrich, A. W., Glasner, C., and van Dijl, J. M. (2021). Bdellovibrio bacteriovorus: a potential ‘living antibiotic’ to control bacterial pathogens. Crit. Rev. Microbiol. 47, 630–646. doi:10.1080/1040841x.2021.1908956
Cerveny, L., Straskova, A., Dankova, V., Hartlova, A., Ceckova, M., Staud, F., et al. (2013). Tetratricopeptide repeat motifs in the world of bacterial pathogens: role in virulence mechanisms. Infect. Immun. 81, 629–635. doi:10.1128/IAI.01035-12
Cingolani, G., and Duncan, T. M. (2011). Structure of the ATP synthase catalytic complex (F1) from Escherichia coli in an auto-inhibited conformation. Nat. Struct. Mol. Biol. 18, 701–707. doi:10.1038/NSMB.2058
Dehal, P. S., Joachimiak, M. P., Price, M. N., Bates, J. T., Baumohl, J. K., Chivian, D., et al. (2009). MicrobesOnline: an integrated portal for comparative and functional genomics. Nucleic Acids Res. 38, D396–D400. doi:10.1093/NAR/GKP919
Duncan, M. C., Gillette, R. K., Maglasang, M. A., Corn, E. A., Tai, A. K., Lazinski, D. W., et al. (2019). High-throughput analysis of gene function in the bacterial predator Bdellovibrio bacteriovorus. mBio 10, e01040. doi:10.1128/mBio.01040-19
Dwyer, C., and Volle, C. B. (2019). Investigating the biophysical changes in prey cells under attack by wild Bdellovibrio. Biophys. J. 116, 577a. doi:10.1016/j.bpj.2018.11.3102
Evans, K. J., Lambert, C., and Sockett, R. E. (2007). Predation by Bdellovibrio bacteriovorus HD100 requires type IV pili. J. Bacteriol. 189, 4850–4859. doi:10.1128/jb.01942-06
Evans, R., O’Neill, M., Pritzel, A., Antropova, N., Senior, A., Green, T., et al. (2022). Protein complex prediction with AlphaFold-Multimer. bioRxiv. doi:10.1101/2021.10.04.463034
Herencias, C., Salgado-Briegas, S., Prieto, M. A., and Nogales, J. (2020). Providing new insights on the biphasic lifestyle of the predatory bacterium Bdellovibrio bacteriovorus through genome-scale metabolic modeling. PLOS Comput. Biol. 16, e1007646. doi:10.1371/JOURNAL.PCBI.1007646
Hocher, A., Laursen, S. P., Radford, P., Tyson, J., Lambert, C., Stevens, K. M., et al. (2023). Histones with an unconventional DNA-binding mode in vitro are major chromatin constituents in the bacterium Bdellovibrio bacteriovorus. Nat. Microbiol. 8, 2006–2019. doi:10.1038/s41564-023-01492-x
Homma, F., Lyu, J., and van der Hoorn, R. A. L. (2024). Using AlphaFold Multimer to discover interkingdom protein–protein interactions. Plant J. 120, 19–28. doi:10.1111/tpj.16969
Johansson-Åkhe, I., and Wallner, B. (2022). Improving peptide-protein docking with AlphaFold-Multimer using forced sampling. Front. Bioinforma. 2, 959160. doi:10.3389/fbinf.2022.959160
Jurkevitch, E. (2012). Isolation and classification of Bdellovibrio and like organisms. Curr. Protoc. Microbiol. 26, 7B.1.1–7B.1.20. doi:10.1002/9780471729259.mc07b01s26
Käll, L., Krogh, A., and Sonnhammer, E. L. L. (2004). A combined transmembrane topology and signal peptide prediction method. J. Mol. Biol. 338, 1027–1036. doi:10.1016/j.jmb.2004.03.016
Karunker, I., Rotem, O., Dori-Bachash, M., Jurkevitch, E., and Sorek, R. (2013). A global transcriptional switch between the attack and growth forms of Bdellovibrio bacteriovorus. PLoS One 8, e61850. doi:10.1371/journal.pone.0061850
Lai, T. F., Ford, R. M., and Huwiler, S. G. (2023). Advances in cellular and molecular predatory biology of Bdellovibrio bacteriovorus six decades after discovery. Front. Microbiol. 14, 1168709. doi:10.3389/fmicb.2023.1168709
Lambert, C., Chang, C. Y., Capeness, M. J., and Sockett, R. E. (2010). The first bite - profiling the predatosome in the bacterial pathogen Bdellovibrio. PLoS One 5, e8599. doi:10.1371/journal.pone.0008599
Makowski, Ł., Trojanowski, D., Till, R., Lambert, C., Lowry, R., Sockett, R. E., et al. (2019). Dynamics of chromosome replication and its relationship to predatory attack lifestyles in Bdellovibrio bacteriovorus. Appl. Environ. Microbiol. 85, e00730-19. doi:10.1128/aem.00730-19
Marchler-Bauer, A., Bo, Y., Han, L., He, J., Lanczycki, C. J., Lu, S., et al. (2017). CDD/SPARCLE: functional classification of proteins via subfamily domain architectures. Nucleic Acids Res. 45, D200–D203. doi:10.1093/nar/gkw1129
Medina, A. A., Shanks, R. M., and Kadouri, D. E. (2008). Development of a novel system for isolating genes involved in predator-prey interactions using host independent derivatives of Bdellovibrio bacteriovorus 109J. BMC Microbiol. 8, 33. doi:10.1186/1471-2180-8-33
Milner, D. S., Till, R., Cadby, I., Lovering, A. L., Basford, S. M., Saxon, E. B., et al. (2014). Ras GTPase-like protein MglA, a controller of bacterial social-motility in myxobacteria, has evolved to control bacterial predation by Bdellovibrio. PLoS Genet. 10, e1004253. doi:10.1371/journal.pgen.1004253
Mirdita, M., Schütze, K., Moriwaki, Y., Heo, L., Ovchinnikov, S., and Steinegger, M. (2022). ColabFold: making protein folding accessible to all. Nat. Methods 19, 679–682. doi:10.1038/S41592-022-01488-1
Moore, D. T., Berger, B. W., and DeGrado, W. F. (2008). Protein-protein interactions in the membrane: sequence, structural, and biological motifs. Structure 16, 991–1001. doi:10.1016/J.STR.2008.05.007
Omidi, A., Møller, M. H., Malhis, N., Bui, J. M., and Gsponer, J. (2024). AlphaFold-Multimer accurately captures interactions and dynamics of intrinsically disordered protein regions. Proc. Natl. Acad. Sci. U. S. A. 121, e2406407121. doi:10.1073/pnas.2406407121
O’Reilly, F. J., Graziadei, A., Forbrig, C., Bremenkamp, R., Charles, K., Lenz, S., et al. (2023). Protein complexes in cells by AI-assisted structural proteomics. Mol. Syst. Biol. 19, e11544. doi:10.15252/MSB.202311544
Pettersen, E. F., Goddard, T. D., Huang, C. C., Meng, E. C., Couch, G. S., Croll, T. I., et al. (2021). UCSF ChimeraX: structure visualization for researchers, educators, and developers. Protein Sci. 30, 70–82. doi:10.1002/pro.3943
Prehna, G., Ramirez, B. E., and Lovering, A. L. (2014). The lifestyle switch protein Bd0108 of Bdellovibrio bacteriovorus is an intrinsically disordered protein. PLoS One 9, e115390. doi:10.1371/JOURNAL.PONE.0115390
Rotem, O., Pasternak, Z., Shimoni, E., Belausov, E., Porat, Z., Pietrokovski, S., et al. (2015). Cell-cycle progress in obligate predatory bacteria is dependent upon sequential sensing of prey recognition and prey quality cues. Proc. Natl. Acad. Sci. U. S. A. 112, E6028–E6037. doi:10.1073/pnas.1515749112
Shannon, P., Markiel, A., Ozier, O., Baliga, N. S., Wang, J. T., Ramage, D., et al. (2003). Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 13, 2498–2504. doi:10.1101/GR.1239303
Sun, J., Lu, F., Luo, Y., Bie, L., Xu, L., and Wang, Y. (2023). OrthoVenn3: an integrated platform for exploring and visualizing orthologous data across genomes. Nucleic Acids Res. 51, W397–W403. doi:10.1093/NAR/GKAD313
Tudor, J. J., Davis, J. J., Panichella, M., and Zwolak, A. (2008). Isolation of predation-deficient mutants of Bdellovibrio bacteriovorus by using transposon mutagenesis. Appl. Environ. Microbiol. 74, 5436–5443. doi:10.1128/aem.00256-08
Umezu, K., Chi, N. W., and Kolodner, R. D. (1993). Biochemical interaction of the Escherichia coli RecF, RecO, and RecR proteins with RecA protein and single-stranded DNA binding protein. Proc. Natl. Acad. Sci. U. S. A. 90, 3875–3879. doi:10.1073/pnas.90.9.3875
Vangone, A., and Bonvin, A. (2017). PRODIGY: a contact-based predictor of binding affinity in protein-protein complexes. BIO-PROTOCOL 7, e2124. doi:10.21769/BIOPROTOC.2124
van Kempen, M., Kim, S. S., Tumescheit, C., Mirdita, M., Lee, J., Gilchrist, C. L. M., et al. (2023). Fast and accurate protein structure search with Foldseek. Nat. Biotechnol. 42, 243–246. doi:10.1038/s41587-023-01773-0
von Mering, C., Huynen, M., Jaeggi, D., Schmidt, S., Bork, P., and Snel, B. (2003). STRING: a database of predicted functional associations between proteins. Nucleic Acids Res. 31, 258–261. doi:10.1093/nar/gkg034
Walker, J. M., Gasteiger, E., Hoogland, C., Gattiker, A., Duvaud, S., Wilkins, M. R., et al. (2005). Protein identification and analysis tools on the ExPASy server. Proteomics Protoc. Handb., 571–607. doi:10.1385/1-59259-890-0:571
Waterhouse, A., Bertoni, M., Bienert, S., Studer, G., Tauriello, G., Gumienny, R., et al. (2018). SWISS-MODEL: homology modelling of protein structures and complexes. Nucleic Acids Res. 46, W296–W303. doi:10.1093/nar/gky427
Zhang, C., Shine, M., Pyle, A. M., and Zhang, Y. (2022). US-align: universal structure alignments of proteins, nucleic acids, and macromolecular complexes. Nat. Methods 19, 1109–1115. doi:10.1038/s41592-022-01585-1
Keywords: Bdellovibrio, hypothetical proteins, protein-protein interactions, AlphaFold-Multimer, attack phase
Citation: Abulude IJ, Luna ICR, Varela AS, Camilli A, Kadouri DE and Guo X (2025) Using AlphaFold-Multimer to study novel protein-protein interactions of predation essential hypothetical proteins in Bdellovibrio. Front. Bioinform. 5:1566486. doi: 10.3389/fbinf.2025.1566486
Received: 24 January 2025; Accepted: 31 March 2025; Published: 14 April 2025.
Edited by:
Petras Kundrotas, University of Kansas, United States
Reviewed by:
Vaishali Waman, University College London, United Kingdom
Neelima Boora, CEA Cadarache, France
Copyright © 2025 Abulude, Luna, Varela, Camilli, Kadouri and Guo. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Xianwu Guo, xguo@ipn.mx; Daniel E. Kadouri, kadourde@sdm.rutgers.edu; Andrew Camilli, andrew.camilli@tufts.edu