Giant Viruses of Amoebas: An Update

During the past 12 years, five new or putative virus families encompassing several members, namely Mimiviridae, Marseilleviridae, pandoraviruses, faustoviruses, and virophages, were described. In addition, Pithovirus sibericum and Mollivirus sibericum represent type strains of putative new giant virus families. All these viruses were isolated using amoebal coculture methods. These giant viruses were linked by phylogenomic analyses to other large DNA viruses. They were then proposed to be classified in a new viral order, the Megavirales, on the basis of their common origin, as shown by a set of ancestral genes encoding key viral functions, a common virion architecture, and shared major biological features including replication inside cytoplasmic factories. Megavirales is increasingly demonstrated to stand in the tree of life alongside Bacteria, Archaea, and Eukarya, and the megavirus ancestor is suspected to be as ancient as cellular ancestors. In addition, giant amoebal viruses are visible under a light microscope and display many phenotypic and genomic features not found in other viruses, while they share other characteristics with parasitic microbes. Moreover, these organisms appear to be common inhabitants of our biosphere, and mimiviruses and marseilleviruses were isolated from human samples and associated with diseases. In the present review, we describe the main features of, and recent findings on, these giant amoebal viruses and virophages.


INTRODUCTION
Viruses were first described at the end of the nineteenth century as ultrafilterable and submicroscopic infectious agents (Beijerinck, 1898; Loeffler and Frosch, 1898). At that time, they were considered to be entities smaller than microbes, and they were mainly defined on the basis of negative criteria, including the absence of both types of nucleic acid together (a virion contains either DNA or RNA) and the absence of components of the translation apparatus (Lwoff, 1957; Lwoff and Tournier, 1966). The discovery of Mimivirus in 2003 challenged this paradigm and fostered new debates on the definition and classification of viruses (Raoult et al., 2004; Raoult and Forterre, 2008; Raoult, 2014). Indeed, Mimivirus was visible under a light microscope and its gene content was dramatically broader than that of other viruses, with as many genes as small bacteria. Moreover, some of these genes suggest a relative autonomy from the host cell for transcription and even translation (Raoult et al., 2004). Mimivirus was serendipitously isolated using coculture on Acanthamoeba polyphaga, a culture strategy that consists of inoculating samples onto an axenic amoebal culture and that was implemented to grow microbes (Rowbotham, 1983). Thus, Mimivirus was discovered by bacteriologists and not by virologists (La Scola et al., 2003).
Subsequently, over the past 12 years, giant viruses were hunted using amoebal co-culture methods. As of 2015, five new or putative families of viruses infecting amoebas were described, which encompass members of the giant viral families Mimiviridae and Marseilleviridae, pandoraviruses, faustoviruses, and mimivirus virophages; in addition, Pithovirus sibericum and Mollivirus sibericum represent type strains of putative new giant virus families (La Scola et al., 2008; Desnues and Raoult, 2012; Colson et al., 2013a,d; Philippe et al., 2013; Legendre et al., 2014, 2015; Reteno et al., 2015). Giant amoebal viruses were linked by phylogenomic analyses to nucleocytoplasmic large DNA viruses (NCLDV), a group of double-stranded (ds) DNA viruses defined in 2001 that includes poxviruses, ascoviruses, iridoviruses, asfarviruses, and phycodnaviruses (Iyer et al., 2001; Yutin et al., 2009). Amoebal giant viruses and these dsDNA viruses were shown to share a small set of nine core genes, including five found in all of their genomes, as well as a larger subset of almost 200 genes shared by at least two NCLDV families. Moreover, these viruses were shown to share a common ancestor whose genome was inferred to harbor about 50 conserved genes, and they were suspected to have an early origin, concomitant with eukaryogenesis (Yutin and Koonin, 2012). NCLDV families were proposed in 2013 to be reclassified in a new viral order, the Megavirales, on the basis of their common origin as shown by a large set of ancestral genes encoding key viral functions, a common virion architecture, and shared major biological features including replication inside cytoplasmic factories. The order Megavirales encompasses viruses with a jelly roll capsid whose diameter is >150 nm and a genome larger than 100 kb that encodes all five former NCLDV class I genes (namely, major capsid protein, D5 helicase, DNA polymerase B, A32-like packaging ATPase, and Very Late Transcription Factor 3).
However, in pandoraviruses and P. sibericum, which have been discovered since 2013, the virion architecture differs and no jelly roll capsid was detected (Philippe et al., 2013; Klose and Rossmann, 2014; Legendre et al., 2014). In the present review, the main features of amoebal giant viruses and virophages are described.
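The membership criteria stated above for the order Megavirales (a jelly roll capsid larger than 150 nm in diameter, a genome larger than 100 kb, and all five former NCLDV class I genes) can be read as a simple checklist. The following Python sketch is only an illustration of that checklist: the record type, the function names, and the two example records are hypothetical, and the gene sets assigned to the examples are placeholders rather than curated annotations. The sizes are those quoted in this review.

```python
from dataclasses import dataclass

# The five former NCLDV class I genes named in the text.
CLASS_I_GENES = frozenset({
    "major capsid protein",
    "D5 helicase",
    "DNA polymerase B",
    "A32-like packaging ATPase",
    "very late transcription factor 3",
})


@dataclass(frozen=True)
class VirusRecord:
    """Hypothetical record holding only the fields the criteria need."""
    name: str
    has_jelly_roll_capsid: bool
    capsid_diameter_nm: float
    genome_size_kb: float
    genes: frozenset


def unmet_criteria(virus):
    """Return the list of criteria (as stated in the text) that the record fails."""
    unmet = []
    if not virus.has_jelly_roll_capsid:
        unmet.append("jelly roll capsid")
    if virus.capsid_diameter_nm <= 150:
        unmet.append("capsid diameter > 150 nm")
    if virus.genome_size_kb <= 100:
        unmet.append("genome size > 100 kb")
    missing = CLASS_I_GENES - virus.genes
    if missing:
        unmet.append("class I genes: " + ", ".join(sorted(missing)))
    return unmet


def meets_megavirales_criteria(virus):
    """True when none of the stated criteria is violated."""
    return not unmet_criteria(virus)


# Illustrative records. Sizes come from the review; gene sets are assumptions.
mimivirus_like = VirusRecord(
    name="Mimivirus-like example",
    has_jelly_roll_capsid=True,
    capsid_diameter_nm=500,
    genome_size_kb=1200,
    genes=CLASS_I_GENES,
)
pandoravirus_like = VirusRecord(
    name="Pandoravirus-like example",
    has_jelly_roll_capsid=False,   # no capsid gene was identified
    capsid_diameter_nm=500,        # particle diameter, used as a stand-in
    genome_size_kb=2450,
    genes=CLASS_I_GENES - {"major capsid protein"},  # placeholder gene set
)

for record in (mimivirus_like, pandoravirus_like):
    print(record.name, meets_megavirales_criteria(record), unmet_criteria(record))
```

Returning the list of unmet criteria, rather than a bare yes or no, makes explicit why pandoraviruses and P. sibericum sit uneasily with the original definition of the order.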

MIMIVIRUSES AND VIROPHAGES
A. polyphaga mimivirus (APMV), the pioneer representative of giant viruses of amoebas, was isolated in 1992 from a water sample collected from an air-conditioning system in England during the investigation of a pneumonia outbreak (La Scola et al., 2003). APMV isolation was performed by co-culturing on A. polyphaga, a strategy implemented primarily to retrieve "Legionella-like pathogens." Mimivirus virions were first mistaken for bacteria, as they could be visualized by optical microscopy and resembled Gram-positive cocci. They were only identified as viral particles in 2003, when their icosahedral capsid was visualized by electron microscopy, and were then named Mimivirus to stress that they were viruses mimicking microbes (La Scola et al., 2003; Raoult et al., 2004, 2007). APMV founded the family Mimiviridae. Mimivirus virions were shown to have 500 nm-large capsids and 75 nm-long fibrils. These fibers have a unique structure amongst viruses and were recently shown to allow attachment via glycans to different organisms including bacteria, arthropods and fungi (Rodrigues et al., 2015). In addition, genome sequencing revealed a dsDNA harboring 1.2 mega base pairs (Mbp) and 979 putative genes (Raoult et al., 2004; Legendre et al., 2010). Some of these genes had never been previously identified in viruses, like those encoding proteins involved in translation, including amino-acyl tRNA synthetases and translation factors. Some amino-acyl tRNA synthetases were found to be functional (Abergel et al., 2007). In addition, the expression of Mimivirus genes related to translation was found to vary according to nutrient availability. Other Mimivirus genes were found to encode proteins unique amongst viruses and involved in DNA repair, protein folding, nucleotide synthesis, amino acid metabolism, protein modification, or lipid or polysaccharide metabolism (Raoult et al., 2004).
Moreover, messenger RNAs were detected in Mimivirus capsids, and the genome was found to encode transfer RNAs and to harbor late and early gene promoters (Raoult et al., 2004; Renesto et al., 2006; Legendre et al., 2010).
High-throughput strategies for the isolation of giant viruses of amoebas carried out in Marseille have led to date to the isolation of about 100 amoebal mimiviruses from various environmental water and soil samples, but also from human samples, which were collected in France, Tunisia, and Brazil, suggesting a ubiquitous distribution (Figure 5; Table 2; Dornas et al., 2015). More recently, Hirudovirus, a mimivirus from lineage A, was isolated from a leech. Moreover, Lentillevirus was found by culturing the contact lens cleaning solution of a patient with keratitis, from which A. polyphaga co-infected by two bacteria was also isolated (Cohen et al., 2011). Lastly, mimiviruses of lineage C were isolated from humans with pneumonia (Saadi et al., 2013a,b). APMV was first identified during a pneumonia outbreak investigation while searching for amoeba-resistant bacteria (La Scola et al., 2003; Raoult et al., 2007). The association between Mimivirus and pneumonia was strengthened by serological evidence. Patients with community- and hospital-acquired pneumonia exhibited antibodies to APMV significantly more frequently than healthy controls, and seroconversions were observed (La Scola et al., 2005; Berger et al., 2006; Vincent et al., 2009; Bousbia et al., 2013; Colson et al., 2013c). Moreover, a Mimivirus seroconversion occurred in a 38-year-old technician who manipulated high amounts of Mimivirus. He reported transfixing chest pain and had bilateral basilar infiltrates suggesting viral pneumonia. Infections with the usual pneumonia agents were excluded, whereas seroconversion to 23 Mimivirus proteins was documented. Finally, in 2013, the first mimivirus isolated from a human was retrieved from the bronchoalveolar fluid of a Tunisian patient with unexplained pneumonia, and named LBA111 virus (Saadi et al., 2013a). Then, Shan virus, another lineage C mimivirus, was isolated from the feces of another patient with pneumonia (Saadi et al., 2013b).
Moreover, in an experimental mouse model of infection, histopathological features of pneumonia, with thickened alveolar walls, inflammatory infiltrates, and diffuse alveolar damages, occurred after intracardiac Mimivirus inoculation (Khan et al., 2007).
Unexpectedly, Mimivirus isolations also led to the discovery of a new type of virus, named virophages by analogy to bacteriophages (Table 1; Figures 1, 3-5; La Scola et al., 2008). Virophages cannot replicate alone in Acanthamoeba spp., but they replicate in the presence of a mimiviral host, inside its viral factories (La Scola et al., 2008; Desnues and Raoult, 2012). Virophages have small (50 nm in diameter) virions with an icosahedral capsid and a dsDNA genome of ≈18,000 bp that encodes 20-21 proteins. The first virophage, named Sputnik, was described as infecting another strain of APMV, Mamavirus (La Scola et al., 2008). Sputnik replication impaired the normal replicative cycle and the morphogenesis of Mamavirus, decreasing amoebal lysis by 70% and generating Mimivirus particles with abnormal morphologies. Three other Sputnik isolates were subsequently found, including Sputnik 2, which was co-isolated from a contact lens liquid with Lentillevirus, a mimivirus from lineage A (Cohen et al., 2011), and the Rio Negro virophage, which was isolated with Samba virus, a mimivirus recovered from the Negro river in Brazil (Campos et al., 2014). A divergent virophage of amoebal viruses, named Zamilon, was isolated from a soil sample collected in Tunisia with Mont1 virus, a mimivirus from lineage C (Gaia et al., 2014). This virophage was capable of infecting mimiviruses from the B and C lineages, but not from the A lineage. Interestingly, antibodies against the Sputnik virophage were detected in two febrile patients returning from Laos, and seroconversion was noted in one case (Parola et al., 2012). Integration of virophage DNA into the genome of mimiviruses, as pro-virophages, was demonstrated. In addition, transposable elements, which were named transpovirons, were discovered in mimiviruses. They are ∼7 kb-long DNA elements that encode 6-8 proteins, two of which are homologous to virophage genes.
They replicate in the mimivirus factory and accumulate inside virions, as well as in virophage particles and amoebas. Integrated genomes or fragments of transpovirons were detected in mimivirus and Sputnik DNA. Virophages, pro-virophages, and transpovirons together constitute a mobilome in mimiviruses. Other virophages were also described in association with distant mimiviruses or phycodnaviruses (Zablocki et al., 2014; Blanc et al., 2015). Notably, a virophage named Mavirus was isolated with Cafeteria roenbergensis virus, a mimivirus deeply rooted in the branch of the family Mimiviridae (Fischer and Suttle, 2011).

MARSEILLEVIRUSES
In 2009, Marseillevirus was isolated in Marseille by co-culturing on A. polyphaga, from water collected from a cooling tower in Paris (Boyer et al., 2009). It had a 250 nm-large capsid with an icosahedral shape. Marseillevirus became the founding member of a new family of giant viruses of amoebas named Marseilleviridae (Colson et al., 2013d). Then, Lausannevirus, the second member of this new family, and Cannes 8 virus (Aherfi et al., 2013), Tunisvirus (Aherfi et al., 2014), and Melbournevirus (Doutre et al., 2014), three other close relatives of Marseillevirus, were isolated from the environment, from water samples collected from the Seine river, a cooling tower in Cannes (France), a decorative fountain in Tunis (Tunisia), and a pond in Melbourne (Australia), respectively. In addition, Insectomime virus was isolated from a dipteran collected in Tunisia, and was the closest relative of Tunisvirus (Boughalmi et al., 2013b). Moreover, Senegalvirus was the first giant virus of amoebas isolated from a human sample, namely feces collected from a young healthy man in Senegal, for which a metagenomic study serendipitously led to the generation and detection of marseillevirus-related reads (Lagier et al., 2012; Colson et al., 2013b).
The description of the Marseillevirus genome highlighted a considerable genomic mosaicism, which has been suspected to be linked to the sympatric lifestyle in amoebas, where giant viruses can multiply in contact with bacteria or fungi, and lateral transfers of genomic sequences can occur. Marseillevirus genomes are 350-380 kilobase pairs (kbp) large, with a G+C content of ≈45%, and harbor genes they might have shared with bacteria, archaea, viruses, and eukaryotes, including amoebas (Aherfi et al., 2014). A surprising finding was the presence of genes encoding histone-like proteins and histone doublets, proteins only found so far in eukaryotic genomes (Boyer et al., 2009; Thomas et al., 2011). Notably, a recent report has suggested an evolutionary scenario in which marseillevirus core histones, as well as DNA topoisomerase II, would derive from a stem-eukaryotic lineage, which predates the neofunctionalization of histone paralogs of eukaryotes (Erives, 2015). In addition, large families of paralogous genes were found in marseillevirus genomes (Boyer et al., 2009; Aherfi et al., 2014). One of these gene families encodes proteins containing bacterial-like membrane occupation and recognition nexus (MORN) repeat domains. Moreover, marseillevirus genomes contain a very high number of family ORFans, i.e., genes only found in the family Marseilleviridae when searching the NCBI GenBank non-redundant protein sequence database. Phylogenetic reconstruction based on the core genes of the Megavirales showed that the family Marseilleviridae encompasses three distinct lineages to date (Aherfi et al., 2014). The first one is led by Marseillevirus and contains close relatives, namely Cannes 8 virus, Senegalvirus, and Melbournevirus (Figure 5). Lausannevirus is the only representative of the second lineage. Lastly, the third lineage comprises Tunisvirus and Insectomime virus.
After the discovery of Senegalvirus in human feces, a metagenomic study of the human blood virome conducted in blood donors revealed the unexpected presence of a substantial number of sequences, representing 2.5% of the whole set of reads, that were related to Marseillevirus and enabled the assembly of two ≈10-kb-large contigs very similar to the Marseillevirus genome (Popgeorgiev et al., 2013a). Inoculation of the blood on Jurkat cells (human lymphocytes) enabled the replication, albeit not the propagation, of this virus, which was named Giant Blood Marseillevirus. Subsequently, serological testing performed on 20 additional blood donors showed a high level of anti-Marseillevirus IgG in three of them, including two with a positive PCR for Marseillevirus. Moreover, seroprevalence studies showed rates ranging between 1.7 and 13% in the general population (Mueller et al., 2013) and up to 23% in polytransfused patients (Popgeorgiev et al., 2013b). Then, in 2013, a high titer of antibodies against Marseillevirus was serendipitously observed in one patient, while attempting to implement serological testing. This led to the detection of Marseillevirus by fluorescence in situ hybridization and immunohistochemistry in a lymph node from this patient, who was an 11-month-old child with lymphadenitis (Popgeorgiev et al., 2013c).

PANDORAVIRUSES
From 2013 onward, 10 and 5 years after the discoveries of Mimivirus and Marseillevirus, respectively, new giant viruses of amoebas were described that considerably expanded the phenotypic and genotypic diversity of this viral group. The two giant viral strains first described in 2013 were named Pandoravirus salinus and Pandoravirus dulcis (Table 1; Figures 1-5; Philippe et al., 2013). They were isolated by co-culturing on Acanthamoeba castellanii samples recovered from sediment layers collected in a river on the coast of central Chile and from mud collected from a freshwater pond close to Melbourne, Australia, respectively. Viral particles have ovoid shapes, with three layered membranes, and are 1 µm in length and 0.5 µm in diameter, which raised the upper size limit known for viral particles. Like Mimivirus, which was thought for several years to be a Gram-positive bacterium, pandoraviruses had previously been described as Acanthamoeba parasites (Scheid et al., 2008). The genome of a third pandoravirus, Pandoravirus inopinatum, which was the one observed in 2008 but first mistaken for a parasite, was described in 2015 (Antwerpen et al., 2015).
The pandoravirus replicative cycle lasts 10-15 h, starting with the internalization of the virus through phagocytic vacuoles (Philippe et al., 2013). The contents of the viral particle are emptied through an apical pore, owing to fusion of the viral internal lipid membrane with the vacuole membrane; this precedes the eclipse phase. Whereas the replication cycle of APMV is entirely cytoplasmic, the cycle of pandoraviruses involves the reorganization of the host nucleus. Replication of the viral DNA and virion assembly occur simultaneously. New viral particles are released approximately 10 h post-infection. Consistent with their exceptionally large particle sizes, P. salinus and P. dulcis were found to harbor the first and third largest viral genomes described so far, whose lengths are 2.45 and 1.91 Mbp, respectively. The genome tips harbor tandem repeats. The G+C content is more than twice that of Mimivirus (61-64%; Philippe et al., 2013; Antwerpen et al., 2015). Interestingly, P. dulcis and P. salinus, the first two representatives of this new family, have very different gene content sizes, with 1502 and 2556 predicted genes, respectively (Philippe et al., 2013). Four large genomic fragments are specific to P. salinus compared to P. dulcis. The genome size of P. inopinatum (2.24 Mbp) is intermediate between those of the two other pandoravirus strains, and this genome shows a nucleotide identity of 85 and 89% with the genomes of P. salinus and P. dulcis, respectively (Antwerpen et al., 2015). The P. salinus genome has a coding density of 80% (Philippe et al., 2013). A total of 401 genes (16% of the gene content) were shown to have known homologs in the GenBank non-redundant protein sequence database, with a mean similarity of 38%. Among these genes, 54% contain ankyrin, MORN or F-box signatures. Best matches for the remaining genes were from eukaryotes in nearly half of the cases, then from bacteria and viruses in equal proportions.
Only 17 and 92 genes had Mimivirus and amoebozoa as best hits, respectively. Thus, strikingly, a large majority of the huge gene content of P. salinus has no homolog in the sequence databases. The P. salinus gene content includes 14 of the 31 class I-III core genes initially defined for the NCLDV (Iyer et al., 2001). Nonetheless, it is devoid of several core genes involved in DNA replication. More unexpectedly, no gene encoding a capsid protein was identified in pandoravirus genomes (Philippe et al., 2013; Yutin and Koonin, 2013). Previously, only a handful of smaller viruses had been identified as lacking a capsid (Koonin and Dolja, 2014). Approximately 10% of the P. salinus genes with homologs in the GenBank non-redundant protein sequence database contain spliceosomal introns, which were described as differing from the group I or II self-splicing introns found in other giant amoebal viruses (Philippe et al., 2013). Recently, 30 miniature inverted-repeat transposable elements (MITEs) were detected in the P. salinus genome, which might have been mobilized by an amoebal host (Sun et al., 2015). Proteomic analyses identified 210 P. salinus proteins in the virion, 80% being encoded by ORFans when searching for homologs in the GenBank non-redundant protein sequence database, and none being a component of a transcription apparatus (Philippe et al., 2013). This latter finding, together with the detection of spliceosomal introns, suggested a significant involvement of the Acanthamoeba nucleus in viral replication, which was further bolstered by observations performed during the replicative cycle. Overall, pandoraviruses display remarkable features, including genomes mostly composed of ORFans.
Pandoraviruses are most frequently clustered in phylogeny reconstructions with Emiliania huxleyi virus, a coccolithovirus, and it was suspected that they are highly derived phycodnaviruses; this remains to be clarified.

FIGURE 5 | Google map of the locations of samples from which giant viruses of amoebas were isolated or in which metagenomic reads related to these viruses were detected. Distinct symbols indicate, respectively, the location of samples from which an amoebal virus was isolated; the location of samples from which reads related to an amoebal virus were generated by metagenomics; and the discovery of giant viral particles for which a virus could not be isolated. Blue color indicates environmental samples; green color indicates human samples; red color indicates animal (non-human) samples. This figure is a screenshot of a Google map that is freely available at the following URL: https://www.google.com/maps/d/edit?mid=zA3X4ljlz-uM.kFSrbnCtoBLc.

PITHOVIRUS SIBERICUM
P. sibericum was described in 2014 and was recovered by co-culturing on A. castellanii with a ≈30,000-year-old Siberian permafrost sample collected in 2000 (Table 1; Figures 1, 3-5; Legendre et al., 2014). Virions have a morphology similar to that of pandoraviruses but are even larger (1.5 µm in length × 0.5 µm in diameter), being the largest virions known so far. They have a structured envelope with a thickness of 60 nm. A structure resembling a hexagonal grid, absent in pandoraviruses, closes the apical pore. The pithovirus replicative cycle lasts 10-20 h and is similar overall to that of pandoraviruses, apart from the absence of significant modification of the nucleus morphology. Mature virions appear after 6-8 h and may be released by exocytosis. In dramatic contrast with its slightly larger virion size compared to pandoraviruses, the P. sibericum dsDNA genome is ∼3-4 times shorter (610 kbp). In addition, the G+C content of this genome is 36%, midway between those of mimiviruses and marseilleviruses, and almost half that of pandoraviruses. Genome conformation is either linear with terminal repeats, or circular. The genome was predicted to encode 467 proteins, of which 159 were detected inside the virion by proteomics, and two thirds of these protein sets are encoded by ORFans (as assessed by searching the GenBank non-redundant protein sequence database). BLAST matches were evenly distributed amongst bacteria, eukaryotes, and viruses, and among viruses between various Megavirales families. The greatest number of best viral matches was with marseilleviruses (19), then mimiviruses (15), and iridoviruses (10). These findings are consistent with phylogenetic and phyletic analyses that indicate a close evolutionary relationship of P. sibericum with marseilleviruses and iridoviruses (Legendre et al., 2014; Sharma et al., 2015a). Unexpectedly, similar numbers of proteins were detected in the virions of P. sibericum and P. salinus, although these viruses have gene contents that differ dramatically in size.
One fifth of the P. sibericum genome corresponds to regularly interspersed copies of non-coding tandem repeats of a conserved palindromic pattern that is 150 bp long and has a G+C content of 23%, similar to that of intergenic regions (compared to 41% for predicted genes). This feature lowers the genome coding density to 68%. A single group I self-splicing intron was detected, which is located inside the gene encoding the DNA-dependent RNA polymerase subunit 1. No tRNA or component of the translation apparatus is encoded by the genome. In contrast, a comprehensive set of proteins involved in transcription was detected, which is consistent with cytoplasmic-only replication. Only a very low level of similarity to major capsid proteins, those of iridoviruses, was detected in the P. sibericum gene content, and the corresponding pithovirus gene product was predicted to harbor a jelly-roll fold. As for pandoraviruses, endosymbionts resembling pithoviruses had been previously reported inside Acanthamoeba (Michel et al., 2003).

FAUSTOVIRUSES
Faustovirus strain E12 was the first giant virus isolated on a free-living amoeba other than Acanthamoeba spp. (Table 1; Figures 1-5; Reteno et al., 2015). The recovery of this virus from a sewage sample collected in Marseille, France, resulted from the implementation of a high-throughput strategy to isolate new giant viruses from environmental samples that included the use of five other protists, in addition to those from the genus Acanthamoeba, including Vermamoeba vermiformis, which appears to be the most common free-living protist in human environments (Bradbury, 2014). Faustovirus E12 was the prototype isolate to be described; seven additional faustovirus isolates were then recovered and their genomes sequenced (Reteno et al., 2015). Faustovirus virions have an icosahedral capsid with a diameter of 200 nm, and are devoid of fibers. These viruses are internalized inside V. vermiformis through phagocytosis. As for other Megavirales representatives, virions have an internal lipid membrane surrounding the core that fuses with the vacuole membrane to release the dsDNA genome. This seeds a cytoplasmic viral factory close to the host cell nucleus, which loses its regular morphology and shrinks. Approximately 10 h after infection, some amoebas show DNA-filled virions whereas others contain viral factories with only DNA-free capsids. The Faustovirus replicative cycle lasts 18-20 h.
The Faustovirus E12 genome is a 466,265 bp-long circular dsDNA. Its G+C content is 36%. With 451 predicted genes, its coding density is 85%. More than two thirds of these genes have no homologs in the GenBank non-redundant protein sequence database, whereas 13% have homologs in other Megavirales representatives, mainly in asfarviruses, then in phycodnaviruses, mimiviruses, marseilleviruses, and ascoviruses. In addition, other best hits are mostly sequences from bacteria (9%) and eukaryotes (7%), and, rarely (≈2%), sequences from archaea and phages. No tRNA gene was detected. One fifth of the gene content is made up of paralogs, among which the most abundant are MORN-repeat containing proteins, first described among viruses in Marseillevirus (Boyer et al., 2009). Among the most remarkable genes were those encoding a ribosomal protein acetyltransferase, a bacteriophage tail fiber protein, and two polyproteins shared with asfarviruses (Reteno et al., 2015). Comparative genomics and phylogeny showed that faustoviruses were most closely related, although distantly, to asfarviruses. Nonetheless, both groups of viruses have a distant evolutionary relationship and homologs represent only 12% of the faustovirus gene repertoire. Moreover, the size of the faustovirus genomes is approximately three times that of the asfarvirus genomes, and codon usage differs. Importantly, the core genome of faustoviruses and asfarviruses taken together is reduced 10-fold compared with the core genome of faustoviruses alone, which indicates large differences between the core genomes and gene contents of these two groups of viruses. Notably, phylogeny reconstruction based on the family B DNA polymerase showed that faustoviruses and asfarviruses were clustered with the Heterocapsa circularisquama virus, which infects marine dinoflagellates and whose genome is not available, but is thought to be a 356 kbp-long dsDNA (Ogata et al., 2009).
One third of the Faustovirus E12 predicted proteins were detected inside its virions by proteomics (Reteno et al., 2015). An unexpected feature of Faustovirus was the architecture of a 17,000 bp-long region harboring the capsid-encoding genomic fragments, which appeared to be scattered along this region and separated from each other by non-coding regions interrupted by six group I self-splicing introns.

MOLLIVIRUS SIBERICUM
M. sibericum was isolated from the same 30,000-year-old Siberian permafrost sample as P. sibericum (Table 1; Figures 1, 3-5; Legendre et al., 2015). The virion has an original spherical shape and a diameter of 500-600 nm, and is covered with 2-4 layers of fibers. Like the other giant viruses, it enters the amoeba by phagocytosis, and fusion occurs between the viral internal lipid membrane and the vacuole membrane, leading to genome release in the amoebal cytoplasm. Interestingly, the viral genome appears to enter the amoebal nucleus, which is deformed. Then, neo-virions appear at its periphery, and are released after 6 h post-infection. These virions can be seen in vacuoles, suggesting an exocytosis pathway. Synthesis of the envelope and inner content of virions appears to occur simultaneously, as for pandoraviruses and P. sibericum. Strikingly, unlike in all other giant viruses of amoebas, the viral factory is not cytoplasmic but perinuclear, and the replication cycle is not lytic for the amoebal host.
The dsDNA genome of M. sibericum is linear, 651,523 bp in length, and harbors inverted terminal repeats whose length is ∼10 kbp. The G+C content is 60%, similar to that of pandoraviruses. A total of 523 genes and 3 tRNAs were predicted; no conserved promoter signal was identified in intergenic regions. Spliceosomal introns were detected in 4% of the genes. Approximately two thirds of the genes are ORFans. Among proteins with homologs, 18 and 3% were most similar to proteins from viruses and prokaryotes, respectively, and 14% were most similar to eukaryotic proteins, including 4% from Acanthamoeba. Among viral best hits, a large majority (90%) corresponded to pandoravirus sequences; they included the B family DNA polymerase and RNA polymerase II subunits 1 and 2. Fifty proteins were most similar to Acanthamoeba proteins, a majority having an undetermined function. Among proteins with a functional annotation, the main group encompassed proteins containing ankyrin repeats, then proteins involved in DNA processing and nucleotide biosynthesis. However, genes encoding thymidylate synthase and ribonucleotide reductase were not detected. A total of 230 proteins were identified in virions, including 60% from M. sibericum and 40% from Acanthamoeba. No viral protein involved in transcription was detected. In contrast, unexpectedly, among Acanthamoeba-encoded proteins detected inside virions, there were 23 ribosomal proteins from both large and small subunits, a ribosomal RNA assembly protein and a ribosome anti-association factor. Nevertheless, no intact ribosome could be seen inside virions. In addition, host-encoded histone homologs and HMG-like chromatin-associated proteins were detected.
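The genome sizes and gene counts quoted in the sections above lend themselves to a quick side-by-side comparison. The Python sketch below is only an illustration: it uses the figures given in this review (sizes reported in Mbp or kbp are rounded accordingly), and the table layout and function names are hypothetical. It computes a crude gene density (predicted genes per kbp) and checks that pandoravirus genomes are roughly 3-4 times longer than the P. sibericum genome.

```python
# Genome size (bp) and number of predicted genes or proteins, as quoted in the text.
GENOMES = {
    "Mimivirus (APMV)": (1_200_000, 979),
    "Pandoravirus salinus": (2_450_000, 2556),
    "Pandoravirus dulcis": (1_910_000, 1502),
    "Pithovirus sibericum": (610_000, 467),
    "Faustovirus E12": (466_265, 451),
    "Mollivirus sibericum": (651_523, 523),
}


def genes_per_kbp(size_bp, n_genes):
    """Crude gene density: predicted genes per kilobase pair of genome."""
    return n_genes / (size_bp / 1000)


def size_ratio(virus_a, virus_b):
    """How many times longer the genome of virus_a is than that of virus_b."""
    return GENOMES[virus_a][0] / GENOMES[virus_b][0]


for name, (size_bp, n_genes) in GENOMES.items():
    density = genes_per_kbp(size_bp, n_genes)
    print(f"{name:22s} {size_bp / 1000:8.1f} kbp {n_genes:5d} genes {density:.2f} genes/kbp")

for pandoravirus in ("Pandoravirus dulcis", "Pandoravirus salinus"):
    ratio = size_ratio(pandoravirus, "Pithovirus sibericum")
    print(f"{pandoravirus} / Pithovirus sibericum genome length ratio: {ratio:.1f}")
```

Gene density computed this way is not the same quantity as the coding densities reported above, which measure the fraction of the genome covered by coding sequence; it is used here only because it can be derived from the numbers given in the text.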

GIANT AMOEBAL VIRUSES AS TRUC
A major controversy is whether giant amoebal viruses comprise a fourth branch in the tree of life, alongside Bacteria, Archaea, and Eukarya (Sharma et al., 2014). No gene has been detected that is shared by all viruses, and only five major viral groups could be shown to be monophyletic (Koonin et al., 2006). Notably, viruses were overlooked in the classification of the living as they are devoid of ribosomal genes, which from the 1970s onwards became the most frequently considered molecular marker to build a tree of life (Woese et al., 1990). Therefore, due to their limited gene content, their consequent strictly parasitic lifestyle with a replicative cycle that largely relies on host proteins, and their invisibility under a light microscope, viruses were long considered to be at the edge of the living world (Doermann, 1992; Raoult and Forterre, 2008).
At the time of the Mimivirus genome description, Mimivirus was shown to branch near the origin of the eukaryotic domain in a phylogeny reconstruction based on concatenated sequences of seven conserved proteins (Raoult et al., 2004). Later, this observation was reiterated by independent phylogenies based on informative and universally conserved genes, and strengthened by hierarchical clustering that used the informational genes from the clusters of orthologous groups of proteins (COG) database and their giant viral homologs (Sharma et al., 2015a,b). This fourth branch was not considered an additional domain, since domains were defined by Woese based on ribosomal genes, which are missing in the giant viruses (Raoult and Forterre, 2008). Therefore, this new branch of life was named a fourth TRUC (an acronym for Things Resisting Uncompleted Classification; Raoult, 2013). TRUC corresponds to a new classification of microbes that divides the microbial world into four branches: Bacteria, Archaea, Eukarya, and giant viruses. This reclassification allows giant viruses to be included in the tree of life, from which they are excluded in a classification centered on ribosomal DNA. This is further warranted as they are bona fide microbes, i.e., visible under an optical microscope, and have genomes that are larger than those of small bacteria and harbor an enormous gene content including homologs to cellular genes. The existence of a fourth branch of life was considered unreliable by E. Koonin and his team, whose interpretation of their phylogenomic analyses is that giant viral genes were transferred from their eukaryotic hosts (Yutin et al., 2014), and by others who argued for lateral gene transfer from giant viral hosts, or artifactual results from phylogeny reconstructions (Moreira and Lopez-Garcia, 2009;Williams et al., 2011). In contrast, the fourth branch hypothesis was supported by data from other teams.
In particular, analyses of protein fold superfamilies and their distribution among viruses and cellular organisms indicated that Megavirales representatives group together, apart from other viruses, while they overlap with some parasitic bacteria (Nasir et al., 2012;Nasir and Caetano-Anolles, 2015). These analyses further highlighted their ancient origin, as they showed that giant viruses coexisted with the ancestors of cells and compose a distinct supergroup alongside Archaea, Bacteria, and Eukarya. In addition, some sequences of the RecA superfamily or of DNA-dependent RNA polymerase generated by metagenomics from marine water stood between Archaea, Bacteria, and Eukarya in phylogeny reconstructions, and were suspected to be from giant viruses (Wu et al., 2011).
Overall, megaviruses that infect amoebas exhibit remarkable features that place them on the edge of the viral world (Raoult, 2014). These giant viruses have virion sizes that are 2-15 times larger than those of traditional viruses, such as human immunodeficiency virus and hepatitis C virus, and genomes that contain ≈50-250 times more genes, among which large proportions are unique amongst viruses, and at least half have unknown functions. Moreover, they enter amoebas through phagocytosis (Clement et al., 2006;Ghigo et al., 2008;Ghigo, 2010;Backovic and Rey, 2012;Philippe et al., 2013;Legendre et al., 2014, 2015;Reteno et al., 2015). Hence, no specific interaction with cell receptors is needed, unlike for traditional viruses, and the giant size of amoebal viruses might be linked to this entry mechanism, as amoebas phagocytose any particle larger than 0.5 microns. It has been further suspected that, more generally, amoebas could be a training field for microorganisms, rendering them capable of entering human macrophages, and Mimivirus was demonstrated to enter these cells via phagocytosis (Ghigo et al., 2008;Salah et al., 2009).

GIANT AMOEBAL VIRUSES ARE COMMON AND HIGHLY DIVERSE ENTITIES
Recently, the number and diversity of giant viruses of amoebas have expanded considerably (Table 1; Figures 1, 5), and it is likely that their diversity is still largely untapped. Their isolation has been recently boosted by the use of high throughput protocols and of new protists as culture support (Reteno et al., 2015). These megaviruses, isolated by co-culturing on various amoebas and described over the last 12 years, display a wide range of virion sizes and shapes, structures, genome lengths, G+C%, gene repertoires and replicative sites (Table 1; Figure 3). Nevertheless, they comprise a monophyletic clade based on a limited set of core and informational genes (Yutin and Koonin, 2012;Colson et al., 2013a;Sharma et al., 2015a). They were obtained from various environmental samples, ecosystems, and geographical locations, which indicates that they are common in our biosphere, and they were detected in or isolated from amoebozoa, invertebrates or mammals (Colson and Raoult, 2012;Boughalmi et al., 2013a,b;Colson et al., 2013c;Pagnier et al., 2013;Popgeorgiev et al., 2013a;Dornas et al., 2014;Legendre et al., 2015; Figure 5; https://www.google.com/maps/d/edit?mid=zA3X4ljlz-uM.kFSrbnCtoBLc). Metagenomic data strengthen these observations, as sequences related to amoebal megaviruses and virophages have been detected in several studies of environmental, animal, and human samples (Ghedin and Claverie, 2005;Monier et al., 2008;Loh et al., 2009;Kristensen et al., 2010;Colson et al., 2013b;Law et al., 2013;Zhang et al., 2015). Moreover, novel approaches that search metagenomic datasets for sequences related to megaviruses, using reconstructed putative ancestral sequences of conserved genes, can uncover megaviruses that were previously overlooked (Sharma et al., 2014). Sequences from new putative giant viruses were also detected in marine environmental metagenomes (Wu et al., 2011;Mozar and Claverie, 2014) and in plant genomes (Maumus et al., 2014). P. sibericum and M. sibericum, recovered from 30,000-year-old permafrost samples, are likely not viruses restricted to ancient times that were merely revived. Indeed, metagenome sequences whose best matches were P. sibericum were recently detected, which suggests that close relatives of these viruses will probably be isolated in the near future. Lastly, the detection of mimiviruses and marseilleviruses in humans, together with accumulating hints of their potential pathogenicity, constitutes an emerging field (Colson et al., 2013c). This warrants investigating the presence and impact of all giant amoebal viruses in humans.

AUTHOR CONTRIBUTIONS
All authors listed have made a substantial, direct, and intellectual contribution to the work and approved it for publication.