Extensive in silico analysis of Mimivirus coded Rab GTPase homolog suggests a possible role in virion membrane biogenesis

Rab GTPases are the key regulators of intracellular membrane trafficking in eukaryotes. Many viruses and intracellular bacterial pathogens have evolved to hijack the host Rab GTPase functions, mainly through activators and effector proteins, for their benefit. Acanthamoeba polyphaga mimivirus (APMV) is one of the largest viruses and belongs to the monophyletic clade of nucleo-cytoplasmic large DNA viruses (NCLDV). The inner membrane lining is integral to the APMV virion structure. APMV assembly involves extensive host membrane modifications, like vesicle budding and fusion, leading to the formation of a membrane sheet that is incorporated into the virion. Intriguingly, APMV and all group I members of the Mimiviridae family code for a putative Rab GTPase protein. APMV is the first reported virus to code for a Rab GTPase (encoded by R214 gene). Our thorough in silico analysis of the subfamily specific (SF) region of Mimiviridae Rab GTPase sequences suggests that they are related to Rab5, a member of the group II Rab GTPases, of lower eukaryotes. Because of their high divergence from the existing three isoforms, A, B, and C of the Rab5-family, we suggest that Mimiviridae Rabs constitute a new isoform, Rab5D. Phylogenetic analysis indicated probable horizontal acquisition from a lower eukaryotic ancestor followed by selection and divergence. Furthermore, interaction network analysis suggests that vps34 (a Class III PI3K homolog, coded by APMV L615), Atg-8 and dynamin (host proteins) are recruited by APMV Rab GTPase during capsid assembly. Based on these observations, we hypothesize that APMV Rab plays a role in the acquisition of inner membrane during virion assembly.


Introduction
With a particle size of about 750 nm, Acanthamoeba polyphaga mimivirus (APMV) is one of the largest viruses known so far (La Scola et al., 2003;Claverie et al., 2006). APMV belongs to the monophyletic clade of large eukaryotic DNA viruses known as the nucleo-cytoplasmic large DNA viruses (NCLDV) (Iyer et al., 2001). In the mature APMV particle, its 1.2 Mbp linear genome is encapsidated within the icosahedral capsid that is underlined by a lipid bilayer (Xiao et al., 2005). Acquisition of inner viral membrane is a critical step during the capsid assembly and is poorly understood.
APMV is a cytoplasmically replicating virus that carries out viral genome replication transcription, protein synthesis and the subsequent stages of particle formation and virus budding in the giant cytosplasmic structures known as viral factories. Viral factories are formed at 4 h post infection (PI) and the budding of new viral particles has been observed at around 8 h PI (Suzan-Monti et al., 2007;Zauberman et al., 2008). The membrane biogenesis of APMV is initiated from the host cytoplasmic membrane cisternae at around 7.5 h PI (Zauberman et al., 2008;Mutsafi et al., 2013). Regular budding of ∼70 nm vesicles from the cellular cisternae have been observed around the viral factories at earlier PI times (Mutsafi et al., 2013). Continuous fusion of smaller vesicles leads to the formation of multivesicular bodies followed by its fissure to form large membrane sheets that are incorporated into the viral progenies (Mutsafi et al., 2013). The major capsid protein L425, a homolog of vaccinia virus D13, acts as a scaffolding protein and initiates capsid assembly above the membrane sheet. A consistent membrane overhang that prevents the premature closure of the capsid is trimmed off upon completion of the capsid assembly, leaving a ∼20 nm nonvertex transient opening for genome packaging (Mutsafi et al., 2013).
Rab GTPases, a subfamily of small GTP binding proteins within the Ras superfamily, are the key regulators of membrane trafficking (Bourne et al., 1990). Rab GTPase functions by alternating between two states; a GTP-bound active state in which it interacts with effector proteins and a GDP-bound inactive state in which it interacts with proteins like Rab escort protein (REP) and GDP dissociation inhibitor (GDI) (Lee et al., 2009). Functionally divergent Rab GTPases are involved in budding and scission of membrane vesicles from donor organelles, their transport along the actin and microtubules, association with target membrane through tethering complex, and finally their fusion with recipient organelle (Zerial and McBride, 2001). Many intracellular pathogens reside in the vacuoles and extensively modify its host-derived membrane by either recruiting or excluding surface Rab proteins with the help of pathogen proteins (Brumell and Scidmore, 2007;Kumar and Valdivia, 2009). Aside from Legionella pneumophila, which encode proteins that directly interact with Rab, other Rab-mediated mechanisms employed by bacterial pathogens remain poorly understood (Brumell and Scidmore, 2007).
Rab GTPases are the hallmarks of the eukaryotic endomembrane system that are found in prokaryotes. It is quite intriguing for a virus, which does not have a cellular structure, to code for this particular protein. In this report, we present a thorough sequence, structural, and phylogenetic analysis of Rab GTPases of Mimiviridae Group I viruses. Our results indicate that Mimiviridae Rabs belong to Rab5 family and were probably acquired horizontally from an early unicellular eukaryote. Further, sequence and structural comparisons suggest that Mimiviridae Rabs possess all the signature motifs of family Rab5, and have probably diverged to form a new isoform, 5D. Considering the above observations and the available APMV transcriptome data and membrane assembly insights (Legendre et al., 2011;Mutsafi et al., 2013), we speculate that the APMV Rab GTPase might play a role in the acquisition of inner membrane during capsid assembly.

Retrieval of Rab Sequences
The putative Rab GTPase amino acid sequences from Mimiviridae family viruses were retrieved from UniprotKB. The APMV R214 sequence was used as a query to search for homologous sequences in the nr (nonredundant) GenBank protein database. Sequence similarity searches were performed using the BLASTP application (http://blast.ncbi.nlm.nih.gov/ Blast.cgi) with standard settings except the maximum target sequences menu under General Algorithm parameters was changed to 250 from 100. Redundant, unnamed and hypothetical sequences were removed before the alignment. A dataset for comparative analysis was prepared using all the annotated Rab GTPase isoforms from Homo sapiens, Plasmodium falciparum, Caenorhabditis elegance, Drosophila melanogaster, Trichomonas vaginalis, and Saccharomyces cerevisiae.

Sequence Analysis, Conserved Motif Identification, and Phylogenetic Reconstruction
Mimiviridae Rab GTPase sequences were subjected to multiple sequence alignment using Clustal Omega (Sievers et al., 2011) with default settings. The alignment was subjected to the ESPript 3.0 (Gouet, 2003) and secondary structure was predicted using PSIPRED (McGuffin et al., 2000) and Jpred3 servers (Cole et al., 2008).
Phylogenetic reconstruction was done in MEGA6 (Molecular Evolutionary Genetics Analysis) program (Tamura et al., 2013). Sequences were aligned using ClustalW with BLOSUM62 as weight matrix. Neighbor-Joining (NJ) method based trees were generated with complete deletion and p-distance options employed in MEGA6. Trees were analyzed using Bootstrap method for 1000 iterations (Nei and Kumar, 2000). MEGA6 was also used for visual details and representation.

Structure Prediction and Alignment
APMV R214 sequence was threaded to I-TASSER online server (Zhang, 2008) against the I-TASSER template database for homology based structural modeling. Structure with the best Cscore of 0.59, was selected for further analysis. Homo sapiens Rab5b (HsRab5b) structure was retrieved from PDB database (2HEI). Predicted APMV Rab and HsRab5b structures were aligned using PyMol (DeLano, 2002).

Interaction Network
APMV Rab interaction network was prepared using the Cytoscape platform (Saito et al., 2012). P. falciparum was used as the source organism to generate APMV Rab interactome. P. falciparum Rab5B (Pf Rab5B) physical interaction network was retrieved from StringDB and homologous protein sequences were obtained from UniprotKB database. Interacting species were used to search for their homologs using multiple reciprocal BLASTP at NCBI (http://blast.ncbi.nlm.nih.gov/Blast.cgi) against either APMV or Acanthamoeba castellanii str. Neff sequence datasets. Orthologs with E < 0.05 were selected and used to build the physical interaction network.

All Members of the Group I Mimiviridae Family Encode for Rab GTPase
A thorough BLASTP search of the NCBI GenBank database, with APMV R214 gene, that was annotated as a probable small GTPase protein (Raoult et al., 2004) as the query sequence against all members of the NCLDV superfamily, revealed that only Moumouvirus, Mamavirus, Lentille virus, Hirudovirus, Courdo virus, Samba virus, and Megavirus code for R214 homolog (Table 1). Interestingly, all seven viruses belong to the group I of Mimiviridae family of NCLDVs. A phylogenetic tree, constructed using the DNA polymerase B, with representative sequences from all NCLDVs, suggests that Rab GTPase was acquired by an ancestor of group I Mimiviridae lineages (Figure 1).

R214 Gene Product and its Mimiviridae Group I Homologs are Rab GTPases
The signature Rab GTPase motifs, the PM1 motif or the ploop with a consensus GxxxxGKS/T and, PM2 and PM3 motifs, with conserved sequences T and DxxGQ, respectively, are strictly conserved in all three lineages of Mimiviridae (Figure 2A). The G1, G2, and G4 motifs, with consensus F, ANKxD, N, respectively are conserved, while the G3 motif is represented by either SSF or NCI (Figure 2A). Mimiviridae Rab GTPase possess all the conserved five Rab-family specific motifs, RabF1-RabF5, clustered around the switch regions I and II (Figure 2A). The first motif RabF1, represented by a consensus IGAAF in the Mimiviridae family, is critical for facilitating the crosstalk between the switch I and switch II residues (Dumas et al., 1999). The other four motifs, RabF2-RabF5, although show some variations, are readily identifiable and are present in the same structural context, between β3 and β4 sheets, and are closer to or part of the switch II region as in the well-known Rabs (Figure 2A) (Pereira-Leal and Seabra, 2000). Furthermore, the hypervariable region of Mimiviridae Rab is situated at the carboxy-terminal to the GTPase-fold that is followed by the CAAX boxes (C, cys; A, aliphatic; X, any amino acid). The CAAX boxes are the signature prenylation motif of Rabs that consists of two cysteine residues within the five residue stretch at the extreme C-terminus in one of the following combinations, XXXCC, XXCCX, XCCXX, CCXXX, or XXCXC (Pereira- Leal and Seabra, 2000). In Mimiviridae Rabs, the prenylation motif shows dimorphism as XXXCC and XXCXC (Figure 2A). Rab GTPases undergo prenylation through the attachment of geranylgeranyl moieties to these two cysteine residues that regulate the membrane insertion (Gomes et al., 2003). Rab subfamilies are further distinguished based on the presence of subfamily specific conserved sequences (RabSF1-4) that show higher homology within the subfamily (Moore et al., 1995). The RabSF1-4 regions in the Mimiviridae family are also identified corresponding to sequences upstream of PM1, α1/loop3, α3/loop7 and α5, respectively (Figure 2A). To delineate the isoform-type of Mimiviridae Rabs that could give clues about their function, an alignment of only SF regions of Rab isoforms A-C with Mimiviridae Rabs was generated ( Figure 2B). While the key residues in the SF1 and SF3 regions of Mimiviridae Rabs and Rab isoforms A-C are conserved, the residues in the SF2 and SF4 regions showed higher divergence from all other Rab5 isoforms ( Figure 2B).

Predicted Structure of Mimiviral Rab
The GTPase-fold consisting of five α-helices flanking a sixstranded β-sheet, five parallel and one antiparallel, is a common feature of the Ras superfamily (Dumas et al., 1999). These structural features were also identified in the predicted APMV Rab GTPase structure ( Figure 2C). Motifs responsible for binding GTP and Mg 2+ are located in the loop regions between  α-helix and β-strands as seen in all the Ras superfamily proteins. The nucleotide-dependent Rab functions are primarily determined by the switch I and II regions. Solved structures indicate that the GDP bound state tends to have switch regions disordered, which upon binding to GTP, transmutes into a wellordered structure (Stroupe and Brunger, 2000). The γ-phosphate of GTP forms contact with switch I and II regions (Lee et al., 2009). The switch I and II regions in the predicted APMV Rab structure are located in the loop2 and loop4-α2-loop5, respectively (Figure 2A).

Mimiviral Rab GTPase has Diverged Early from Lower Eukaryotic Rab5 and Belongs to Group II Family of Rab GTPases
Homology search using APMV Rab GTPase retrieved Rab homologs in eukaryotes ranging from unicellular Paramecium to multicellular organisms like Homo sapiens and, different Rab families like Rab5, Rab11, and Rab4. A neighborjoining phylogenetic tree of Mimiviridae Rab with homologs showed their close association with the lower eukaryotic Rabs, specifically, protozoan Rab GTPases, including a Acanthamoeba castellanii str. Neff Rab (Figure 3A). This analysis is consistent with an earlier report that suggested that the APMV R214 gene is closely related to a homolog found in its host amoeba (Moreira and Brochier-Armanet, 2008). Furthermore, the phylogeny showed two distinct clades; one consisting of Mimiviridae family Rab, Plasmodium Rab5B, Acanthamoeba castellanii str. Neff Rab5, Trichomonas RabD1; and the other consisting of Rab4 and Rab11 from a diverse group of organisms ( Figure 3A). To further extend this analysis, all annotated Rab sequences from few representative organisms including Mimiviridae Rabs were used to generate a comprehensive phylogenetic tree and found that Rabs from Mimiviridae, Plasmodium and few lower eukaryotes form a separate clade ( Figure 3B).

Interaction Network
Interestingly, APMV Rab and Plasmodium falciparum Rab5B (Pf Rab5) are homologs and the phylogeny suggests that they share a common ancestor ( Figure 3A). The physical interactome of Plasmodium Rab5B was constructed using the yeast interaction network as a template and the predicted interaction network was experimentally validated (Rached et al., 2012). For example, one of the proteins that was predicted, a Plasmodium falciparum coded casein kinase1 (Pf CK1), showed interaction with only Pf Rab5B, but not with other Pf Rab5 isoforms (Rached et al., 2012). We constructed the global physical interactome of APMV Rab GTPase using Plasmodium Rab5B interaction network as a template (Figure 4). The interactome predicted interacting partners from both APMV and its host Acanthamoeba. Some of the important interactors identified are autophagy-related protein 8 (Atg-8), dynamin (both are host factors) and phosphoinositide-3-kinase (PI3K), coded by APMV (Figure 4). We speculate that these proteins, along with Rab GTPase, are the key players of membrane remodeling during APMV capsid assembly and their probable functions are discussed below.

Atg8: An Important Player for Membrane Enlargement
Atg-8 is a ubiquitin-like protein that tethers to the membrane through its interaction with phosphatidylethanolamine (PE) (Nakatogawa et al., 2007). Atg-8 is most likely to be recruited from the host as it was not found in the APMV genome. The membrane-tethering and hemi-fusion activities of Atg-8 are pivotal for membrane expansion (Nakatogawa et al., 2007). Atg-8 also plays an important role in the enlargement of autophagosomal membranes during autophagosome formation (Mizushima et al., 1998;Ichimura et al., 2000).

Dynamin: Helical Scissors for Vesicle Scission
Dynamin, another protein recruited from the host, plays an important role during budding of vesicles from the Trans-Golgi Network (TGN) and endosomes (Jones et al., 1998;Kreitzer et al., 2000). Dynamin family members bind to and oligomerize helically around inositol lipid molecules and impart a tubular membrane topology (Hinshaw and Schmid, 1995;Marks et al., 2001). Cooperative recruitment of dynamin and BAR (Bin-Amphiphysin-Rvs) domain proteins induces positive curvature on the membrane with the help of exoskeleton/endoskeleton during vesicle scission (McMahon and Gallop, 2005).

Class III PI3K or Vps34: A Component of the Autophagosome Vesicle Formation Complex
A Class III PI3K (phosphoinositide 3-kinase) or Vps34 is a kinase recruited to the vesicles in cells expressing active Rab5 and was found to colocalize with Rab5 bound to the endosomal tethering protein EEV1 (Christoforidis et al., 1999;Murray et al., 2002). Our genome-wide search found a PI3K homolog in APMV coded by L615 gene that could interact with APMV Rab and bring about vesicle enlargement. Rab5 has also been shown to form a complex with Vps34 and Beclin1 proteins that is essential for autophagosome formation (Ravikumar et al., 2008).

Discussion
The NCLDV is a large, apparently monophyletic clade of viruses that consists of seven families of eukaryotic double stranded DNA viruses, namely, Poxviridae, Iridoviridae, Ascoviridae, Asfaviridae, Phycodnaviridae, Marseilleviridae, and Mimiviridae (Iyer et al., 2001(Iyer et al., , 2006Yutin and Koonin, 2009). Based on the phylogenetic reconstructions of the highly conserved genes called as NCLDV clusters of orthologous groups of proteins (NCVOGs) which includes primase-helicase, DNA polymerase B, packaging ATPase and A2L-like transcription factor, Mimiviridae family has been subgrouped into two groups, I and II, and, group I Mimiviridae has been further delineated into three lineages; A, B, and C (Colson et al., 2012). Mimiviridae group I viruses possess a membrane layer underlining the icosahedral capsid (Xiao et al., 2005;Mutsafi et al., 2013). Source of this lipid bilayer and how the viruses acquire the membrane layer from endoplasmic reticulum cisternae have been demonstrated microscopically, although their molecular mechanisms remain unclear (Mutsafi et al., 2013).
All three lineages of group I Mimiviridae viruses' code for Rab GTPase (Figure 1A). Our thorough sequence analysis of Mimiviridae Rabs and the predicted structure of APMV R214 gene product suggests the presence of all the Rab signature motifs viz. RabF(1-5), RabSF(1-4), switch regions I and II, and the classical GTP binding motifs (Figures 2A,C). The clustering of all members of Mimiviridae under the same clade, sharing a common origin with the lower eukaryotes, suggests the acquisition of Rab GTPase by a Mimiviridae ancestor from a lower eukaryotic ancestor, probably through HGT events ( Figure 3A). Furthermore, a comprehensive phylogenetic analysis with all annotated Rab subfamilies indicated the close homology of Mimiviridae Rabs to the lower eukaryotic Rab5B ( Figure 3B). Based on the conservation pattern in RabSF regions, Rab5 subfamily has been further divided into three isoforms, namely, 5A, 5B, and 5C (Bucci et al., 1995). Our sequence analysis showed that Mimiviridae Rabs have significant sequence divergence in the RabSF regions. This is particularly interesting since RabSF2 residues are part of the switch region I, where the nucleotide-dependent conformational changes occur (Stenmark and Olkkonen, 2001). Because the switch region residues are exposed on the surface of the molecule, it was hypothesized that they are involved in binding to effectors and regulators (Stenmark and Olkkonen, 2001). Since Rab GTPases from different species cosegregate during phylogenetic analysis, it was suggested to use phylogenetic analysis as one of the criteria, along with specific sequence in the RabSF regions, to assign Rab subfamily (Pereira-Leal and Seabra, 2001). Divergence in the RabSF region in Mimiviridae Rabs implies different functional specificity dictated by specific effector/s and regulator/s suggesting that the Mimiviridae Rabs could potentially form a novel subfamily of Rab5, Rab5D. Apparently, Rab group V representative, Rab5, has a complex evolutionary history (Klöpper et al., 2012). It was suggested that the Rab5 duplication has occurred independently in fungi (Ypt52), apicomplexans (Rab5B), and kinetoplastids (Rab5B) (Klöpper et al., 2012). Mimiviridae Rab BLAST search failed to retrieve other isoforms of Plasmodium Rab5 viz. A and C, but selectively retrieves only Rab5B. This observation leads to the scenario that the acquisition of Rab by Mimiviridae family might have occurred concomitantly with the duplication event in the apicomplexans and evolved to be a new isoform Rab5D.
Interestingly, superimposition of Human Rab5B bound to GDP and the predicted APMV Rab structures, although showed high structural conservation with an RMSD (Root Mean Square Deviation) value of 1.08, the observed heterogeneity is localized in the switch regions (data not shown). The structural differences in the switch regions are the determinants of specificity toward effectors (Pfeffer, 2005;Lee et al., 2009). This observation is consistent with the RabSF sequence alignments and suggests divergence of an acquired Rab5B in the Mimiviridae family into a new isoform.
Although all known Rab5 isoforms are specialized in endosome fusion, some of them are also involved in endocytic function, endosome sorting etc. (Woodman, 2000). Studies have also shown that, besides the hypervariable region, certain F and SF regions are also important determinants of membrane targeting and effector protein interaction specificity (Ali and Seabra, 2005). The R214 transcript is not detectable at 3 h PI and is expressed optimally between 6 and 9 h PI (Legendre et al., 2011). In addition, proteomic study of the purified APMV virion particles showed that both Rab GTPase and PI3K homologs are not associated with the viral particles suggesting that their participation in the early events in establishing infection such as intracellular transport following phagocytosis is unlikely (Renesto et al., 2006). Earlier it was suggested that APMV Rab could be involved in the regulation of host cell cycle (Moreira and Brochier-Armanet, 2008). Here, we speculate that Rab GTPase could also play a role in the viral membrane acquisition during capsid assembly.

A Hypothetical Model for Membrane Acquisition during Mimivirus Capsid Assembly
The APMV Rab GTPase R214 gene expression starts at 6 h PI with highest expression at 9 h PI (Legendre et al., 2011). Microscopic studies indicated the appearance of viral factory at around 4 h PI (Suzan-Monti et al., 2007;Mutsafi et al., 2010). The presence of vesicular structures and membrane sheets on the periphery of the viral factory with extensive membrane network are seen at 7.5 h PI (Mutsafi et al., 2013). The small vesicles budding from the ER fuse together to form a multivesicular body (Mutsafi et al., 2013). On the basis of our interaction network analysis, we speculate that the Trans-Golgi network (TGN) could also contribute to the vesicles. The APMV Rab, which is optimally expressed at around the same time, could insert itself into the membrane through its C-terminal prenylation site, get localized near ER and/or TGN and initiate FIGURE 5 | Hypothetical model for the inner membrane acquisition during APMV capsid assembly. The molecular events leading to the inner membrane acquisition for APMV capsid assembly begins with the expression of the Rab GTPase (yellow) at around 6 h PI. The ER and TGN membrane (green) cisternae gathered at the viral production lines of VF are then subjected to the Rab GTPase mediated pinching of small vesicles. These vesicles fuse together to form a membrane sheet with the help of Rab effectors like PI3K and Atg-8 (orange). The membrane, made amenable through de-phosphorylation (red), serves as a platform for the scaffolding protein and capsid assembly (violet).
budding of small vesicles (Figure 5). APMV Rab could also facilitate the fusion of smaller vesicles by forming complex with Vps34 and Beclin1. Both Rab5 and Vps34 regulate Atg5-Atg12 conjugation and promote fusion of Atg5-rich phagosome leading to the formation of autophagosomes (Ravikumar et al., 2008). Further, Rab5 mediated recruitment of Atg-8 that is tethered to the membrane, leads to the hemifusion of small vesicles and expansion of autophagosomes (Nakatogawa et al., 2007). We speculate that the multivesicular bodies observed (Mutsafi et al., 2013) are the result of Atg-8 mediated fusion of smaller vesicles. These multivesicular bodies fuse and, then rupture leading to the formation of open sheets that are incorporated into the virion. The factors governing the rupture event and further stabilization of the ruptured membrane are yet to be identified (Mutsafi et al., 2013). The Rab GTPase, still bound to the membrane sheet, possibly recruits inositol-5-phosphatase that de-phosphorylates PI(4,5)P 2 of the membrane (Sarantis et al., 2012). The de-phosphorylated membrane is amiable to be molded into any structural form and the scaffolding proteins initiate the expansion of the capsid angular structure over the "softened" membrane ( Figure 5) (Chang-Ileto et al., 2011).
The presence of a membrane overhang, lining the assembled nascent capsid, was suggested to prevent the premature sealing of the capsid (Mutsafi et al., 2013). The membrane hang is trimmed after the completion of capsid assembly, leaving a ∼20 nm nonvertex opening (portal) for genome packaging (Mutsafi et al., 2013). The host dynamin protein could mediate the membrane trimming by forming helical oligomers around the inositol lipid giving a tubular morphology to bring about scission (Figure 5, McMahon and Gallop, 2005).

Conclusions
Our thorough sequence, structural, and phylogenetic analysis showed that the Mimiviridae coded Rab GTPases could constitute a novel isoform Rab5D. In addition, generating an interaction network helped in identifying potential viral and host proteins that might play a role APMV membrane biogenesis. Proposed hypothetical model for APMV membrane biogenesis suggests that intricate interactions between the viral Rab GTPase, other viral coded proteins and several host factors are necessary to bring about APMV membrane biogenesis. These insights gained from the in silico analysis will further aid in understanding the roles of viral and host proteins in APMV membrane biogenesis.