MINI REVIEW article

Front. Microbiol., 31 May 2024

Sec. Phage Biology

Volume 15 - 2024 | https://doi.org/10.3389/fmicb.2024.1390726

Tools and methodology to in silico phage discovery in freshwater environments

  • 1. Department of Biochemistry and Immunology, Institute of Biological Sciences, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, Brazil

  • 2. Laboratory of Simulation and Computational Biology — SIMBIC, High Performance Computing Center — CCAD, Federal University of Pará, Belém, Pará, Brazil

  • 3. Laboratory of Bioinformatics and Genomics of Microorganisms, Institute of Biological Sciences, Federal University of Pará, Belém, Pará, Brazil

Abstract

Freshwater availability is essential, and its maintenance has become an enormous challenge. Due to population growth and climate change, freshwater sources are becoming scarce, imposing the need for strategies for water reuse. Currently, the constant discharge of waste from human activities into water bodies leads to the dissemination of pathogenic bacteria, negatively impacting water quality from the source to the infrastructure required for treatment, for example through the accumulation of biofilms. Current water treatment methods cannot keep pace with the evolution of bacteria, which increasingly exhibit multidrug resistance to antibiotics. Furthermore, using more powerful disinfectants may affect the balance of aquatic ecosystems. Therefore, there is a need to explore sustainable ways to control the spread of pathogenic bacteria. Bacteriophages can infect bacteria and archaea, hijacking their host machinery to favor their replication. They are abundant worldwide and provide a biological alternative to antibiotic treatment of bacteria. In contrast to common disinfectants and antibiotics, bacteriophages are highly specific, minimizing adverse effects on aquatic microbial communities and offering a lower cost–benefit ratio in production compared to antibiotics. However, because environmental bacteriophages are difficult to cultivate and identify, alternative approaches that combine NGS metagenomics with bioinformatic tools can help identify new bacteriophages that may be useful as an alternative treatment against resistant bacteria. In this review, we discuss advances in exploring the virome of freshwater, as well as current applications of bacteriophages in freshwater treatment, along with current challenges and future perspectives.

1 Introduction

Freshwater is an indispensable resource for maintaining life on Earth and has been consistently impacted by the increasing anthropogenic influence. Urban and rural expansion around water bodies, coupled with waste disposal from hospitals, water treatment systems, industry, agriculture, and residences, contribute to rivers and lakes becoming hotspots for the proliferation of pathogenic microorganisms (Reddy et al., 2022). Water disinfection methods have become limited due to the growing demand for water reuse and the inefficiency of a significant portion of antibiotics against the spread of antibiotic-resistant bacteria (Mathieu et al., 2019). Therefore, there is a pressing need to explore natural compounds to control multidrug-resistant bacteria, such as bacteriophages.

Bacteriophages (or phages) are the most abundant biological entities globally. They were first described in the early 1900s, and by now, we know they are widespread in the environment, with estimates of ~10^31 phages present in the biosphere (Twort, 1915; Rohwer and Edwards, 2002). Phages act as natural predators of bacteria and archaea, and exploit host machinery favoring their own replication (Dion et al., 2020). Phages may interact with bacterial or archaeal hosts by transferring genes that might be ecologically relevant, thus favoring the host genetic fitness through horizontal gene transfer (HGT) (Touchon et al., 2017; De Mandal et al., 2021). When associated with their hosts as prophages, phages may introduce auxiliary metabolism genes that potentially enhance host adaptability (Luo et al., 2022). The initial discovery that phages were highly abundant in aquatic samples (Bergh et al., 1989) laid the groundwork for the eventual determination of their pivotal impact on the ecosystem.

The paramount significance of phages arises from the viral shunt phenomenon, wherein organic matter is recycled through the lysis of host cells, driving global-scale biogeochemical cycles (Breitbart et al., 2018). Bacteriophages represent an ecological alternative to the use of antibiotics, with a lower cost–benefit ratio of production, and exhibit high specificity to their hosts, minimizing dysbiosis (Romero-Calle et al., 2019). They have been employed for at least a century in controlling bacterial infections in humans (Rohde et al., 2018), and have recently been advocated for applications in freshwater environments (Naknaen et al., 2021; Ben Saad et al., 2022; Hu et al., 2023).

Phages can be classified into three groups: (1) virulent bacteriophages, which solely undergo the lytic cycle, leading to the lysis of the host cell; (2) temperate phages, which can undergo lysogenic cycles, remaining dormant within the host cell (as prophages) until they are induced to switch to the lytic or chronic cycle; and (3) filamentous phages, which go through a chronic cycle in which viral replication occurs without host cell lysis (Chevallereau et al., 2022; Zhang et al., 2022). Lytic phages are the most desirable due to their cell lysis capability and lower risk of horizontal gene transfer.

Classical studies of phages relied on isolation and culture methods for their identification (Hyman, 2019). Currently, with the advancement of culture-independent methodologies such as metagenomics, databases are increasingly enriched with viral data, enabling a more comprehensive understanding of phages at the taxonomic level and of their potential interactions with their hosts (Santiago-Rodriguez and Hollister, 2023). This shows that bioinformatics tools for mining viral data can be a powerful aid in discovering bacteriophages.

This review discusses the identification of phages in freshwater environments, the primary in silico tools used for phage data exploration, and types of phage applications in freshwater. We also discuss the possible challenges and future possibilities for the field.

2 Identification of phages in freshwater

The metaviromics field (phage metagenomics) essentially is a shotgun metagenomic approach focused on studying the genomes of viral populations from the environment (Hurwitz and Sullivan, 2013; Coutinho et al., 2017; Moon and Cho, 2021), and due to the importance of freshwater bodies as sources of drinking water, recreation, and commerce, more recent studies have dedicated their efforts to freshwater systems (Bruder et al., 2016). Since water chemistry and hydrological factors can contribute to a dynamic environment on a microbial level, likely to be reflected in the indigenous phage populations, the exploration of metagenomic data sampled from freshwater sources from different biomes and places in the world is bound to reveal a plethora of yet unknown and undocumented species of phages (Hayes et al., 2017; Alanazi et al., 2022).

Previous studies have explored how nutrient availability, seasonality, temperature, and human activity influence freshwater viral communities (Bruder et al., 2016). For example, Mohiuddin and Schellhorn (2015) observed that geographic location did not appear to have a major impact on viral abundance and diversity in two freshwater lakes of the lower Great Lakes region, Lake Ontario and Lake Erie, since the virome composition of both lakes was found to be similar. However, temporal variation in taxonomic composition was observed in both lakes between samples taken a year apart.

Phage diversity also appears to be related to the effects of anthropogenic activity on the microbial environment. The study of Lake Matoaka, Virginia, by Green et al. (2015) found viral species richness and diversity to be negatively correlated with the level of human activity at the sampling sites, with the highest levels of diversity and species richness observed in the main body of the lake, the area least affected by human activity. Another study, conducted by Fancello et al. (2013), observed that the most anthropogenically influenced of four perennial ponds in the Mauritanian Sahara presented the lowest viral diversity and a higher abundance of heterotrophic microorganisms and human pathogens.

Freshwater viral metagenomics studies can also assist in tackling significant threats to global health, such as the spread of antibiotic resistance. Not only can antibiotic resistance genes (ARGs) spread across different bacterial populations through horizontal gene transfer mediated by bacteriophages, but bacteriophage-carried ARGs are especially threatening due to their prolonged persistence in the environment, fast replication rates, and ability to infect diverse hosts (Brown-Jaque et al., 2015). Moon et al. (2020) explored ARGs recovered from urban surface water viral metagenome data, revealing novel phage-borne antibiotic resistance genes that were also found in bacterial metagenomes, indicating that they were harbored by actively infecting phages. These results suggest that those environmental bacteriophages could act as reservoirs of unknown ARGs that could be widely disseminated via virus-host interactions, and illustrate the potential of viral metagenomics for the discovery of phages involved in spreading antimicrobial resistance in the environment.

In addition, freshwater metagenomic data can also be used to study viral ecology in the context of other organisms. Chen et al. (2019) revealed a worldwide distribution of distinct phage genotypes that may infect Fonsibacter, one of the most abundant bacterioplankton in freshwater ecosystems, suggesting their substantial role in shaping indigenous microbial communities and a potentially significant influence on biogeochemical cycling.

3 In silico phage mining with bioinformatic tools

Given the advances in sequencing technologies and viral databases, we selected some of the most widely used tools for analyzing viral communities in metaviromic data. A classic virome analysis pipeline includes tools for (i) assembly, (ii) viral sequence prediction, (iii) quality check, (iv) annotation, (v) taxonomic classification, (vi) phage-host prediction, and (vii) viral microdiversity analysis (Table 1), some of which (steps i, iv and v) are also used in general metagenomic studies. These tools are essential to understand the diversity of viruses and their function in the environment, and can be used to identify new uncultivated viral genomes (UViGs) (Green et al., 2015; Moon and Cho, 2021; Naknaen et al., 2021).

Table 1

| Tool name | Metavirome analysis type | Description | Input type | Accessibility (web or standalone) | Citation |
| --- | --- | --- | --- | --- | --- |
| IDBA-UD | Assembly | Assembly of fastq reads | Processed fastq files | Standalone | Peng et al. (2012) |
| Megahit | Assembly | Assembly of fastq reads | Processed fastq files | Standalone | Li et al. (2016) |
| MetaSpades | Assembly | Assembly of fastq reads | Processed fastq files | Standalone | Nurk et al. (2017) |
| MetaViralSpades | Assembly | Assembly of fastq reads | Processed fastq files | Standalone | Antipov et al. (2020) |
| VirSorter | Viral sequence prediction | Predicts viral regions using probabilistic similarity models and reference-based protein homology searches | Contigs in FASTA | Standalone | Roux et al. (2015) |
| Prophet | Viral sequence prediction | Predicts viral regions based on similarity searches against own database | Contigs in FASTA | Standalone | Reis-Cunha et al. (2019) |
| PHASTEST | Viral sequence prediction | Predicts viral regions based on similarity searches against own database | Contigs in Genbank or FASTA | Standalone or Web | Wishart et al. (2023) |
| MetaPhinder | Viral sequence prediction | Identifies phage sequences in assembled contigs by integrating BLAST matches to several phage genomes in a database | Contigs in FASTA | Standalone | Jurtz et al. (2016) |
| VirFinder | Viral sequence prediction | Machine learning method for identification of viral contigs based on k-mer distribution | Contigs in FASTA | Standalone | Ren et al. (2017) |
| DeepVirFinder | Viral sequence prediction | Uses convolutional neural networks to learn viral genomic signatures and predict if a sequence is viral | Contigs in FASTA | Standalone | Ren et al. (2020) |
| PPR-Meta | Viral sequence prediction | Utilizes a neural network structure (CNN) to predict viral sequences | Contigs in FASTA | Standalone | Fang et al. (2019) |
| PhaMers | Viral sequence prediction | Utilizes a machine learning model based on k-mer frequencies to identify viral sequences | Contigs in FASTA | Standalone | Deaton et al. (2019) |
| VirSorter2 | Viral sequence prediction | Predicts viral regions based on HMM alignment to database | Contigs in FASTA | Standalone | Guo et al. (2021) |
| ViralVerify | Viral sequence prediction | Analyses the gene content of a contig through a Naive Bayesian classifier and classifies it as viral/bacterial/uncertain | Contigs in FASTA | Standalone | Antipov et al. (2020) |
| geNomad | Viral sequence prediction | Uses a hybrid approach with gene content and a deep neural network to identify sequences of plasmids and viruses | Contigs in FASTA | Standalone | Camargo et al. (2023) |
| Marvel | Viral sequence prediction | Identifies viral bins based on a Random Forest machine learning approach | Bins in FASTA | Standalone | Amgarten et al. (2018) |
| CheckV | Quality check | Estimates viral completeness by comparing viral sequences to a database of complete viral genomes | Contigs in FASTA | Standalone | Nayfach et al. (2021) |
| ViralComplete | Quality check | Estimates viral completeness using a Naive Bayesian classifier to compute the similarity of a sequence to known viruses from the RefSeq database | Contigs in FASTA | Standalone | Antipov et al. (2020) |
| DRAM-v | Annotation | Protein-similarity-based pipeline specific for viral functional and metabolic profiling | Contigs in FASTA plus an affi table generated by VirSorter2 | Standalone | Shaffer et al. (2020) |
| PHANOTATE | Annotation | A gene calling annotation tool that treats a phage genome as a network of paths, in which ORFs are treated as favorable and overlaps and gaps as less favorable but still possible. These paths are represented as a weighted graph of connections, which is searched for the optimal path. | Contigs in FASTA | Standalone | Mcnair et al. (2019) |
| VIBRANT | Annotation | Hybrid pipeline that uses protein similarity and a machine learning approach to annotate viral sequences | Contigs in FASTA | Standalone | Kieft et al. (2020) |
| viral-EggNog-Mapper | Annotation | Pipeline that uses an orthology assignment approach to annotate eukaryotic or prokaryotic organisms from genome or metagenome samples | Contigs in FASTA | Standalone and Web | Cantalapiedra et al. (2021) |
| Kraken 2 | Read or contig taxonomic classification | Taxonomic classification of microbiome fastq reads or contigs based on k-mer alignment | Fastq or Fasta files | Standalone | Wood et al. (2019) |
| VContact 2 | Contig taxonomic classification | Utilizes whole genome gene-sharing profiles integrating distance-based hierarchical clustering to generate confidence scores for virus classification | FASTA protein file plus a gene-to-genome mapping table | Standalone | Bin Jang et al. (2019) |
| MMSeqs 2 | Read or contig taxonomic classification | A protein-search-based taxonomy assignment tool that uses a weighted method to assign taxonomic labels | Fastq or Fasta files | Standalone | Steinegger and Söding (2017) |
| CAT | Contig taxonomic classification | DIAMOND BLASTP homology for contig taxonomic classification | Contigs in FASTA | Standalone | von Meijenfeldt et al. (2019) |
| PhaGCN | Contig taxonomic classification | Machine learning-based tool that combines DNA sequence features and protein sequence similarity to assign taxonomic labels | Contigs in FASTA | Standalone | Shang et al. (2021) |
| VirusTaxo | Contig taxonomic classification | Taxonomic classification tool that provides a framework to organize the population of viruses from metagenomic data | Contigs in FASTA | Standalone | Raju et al. (2022) |
| PHIST | Phage-host prediction | Infers virus-host relationships based on the number of k-mers shared between their sequences | FASTA file with host sequences and a FASTA file with viral sequences | Standalone | Zielezinski et al. (2022) |
| CrisprOpenDB pipeline | Phage-host prediction | Predicts virus-host relationships by searching for CRISPR spacer matches and uses several criteria to create predictions | Contigs in FASTA | Standalone and Web | Dion et al. (2021) |
| RaFAH | Phage-host prediction | Utilizes a Random Forest model to assign phage-host interactions independently of databases | Contigs in FASTA | Standalone | Coutinho et al. (2021) |
| VirHostMatcher | Phage-host prediction | Assigns virus-host relations based on oligonucleotide frequency similarity | FASTA file with host sequences, FASTA file with viral sequences and host taxonomy table | Standalone | Ahlgren et al. (2017) |
| iPHoP | Phage-host prediction | A pipeline that combines multiple tools and methods for phage-host prediction | Contigs in FASTA | Standalone | Roux et al. (2023) |
| Phyloseq | Diversity analysis | Generates microbial diversity statistics | Virus count table, viral taxonomy table, metadata table | Standalone | McMurdie and Holmes (2013) |
| MicrobiomeAnalyst | Diversity analysis | Generates microbial diversity statistics | Virus count table, viral taxonomy table, metadata table | Standalone and Web | Dhariwal et al. (2017) |
| Animalcules | Diversity analysis | Generates microbial diversity statistics | Virus count table, viral taxonomy table, metadata table | Standalone | Zhao et al. (2021) |
| Microeco | Diversity analysis | Generates microbial diversity statistics | Virus count table, viral taxonomy table, metadata table | Standalone | Liu et al. (2021) |

Tools available for metagenomic analysis of data for viral identification from environmental data.

Roux et al. (2017) identified IDBA-UD (Peng et al., 2012), Megahit (Li et al., 2016), and MetaSpades (Nurk et al., 2017) as the best available options for assembly of viral contigs from short reads. Later, Sutton et al. (2019) evaluated 16 assemblers on simulated, mock, and human gut virome data and identified MetaSpades as the most efficient. However, it was less effective in reconstructing microdiversity, being more useful to study the mutation rates of the virome. Additionally, although it was not included in the previous study because it was published later, MetaViralSpades (Antipov et al., 2020), a variation of MetaSpades (Nurk et al., 2017), outperformed it in an analysis of 18 real virome data sets, with superior contig completeness in 12 cases (Antipov et al., 2020).

After assembly, viral sequence prediction can be applied to separate viral sequences from host sequences in the metagenomic data. There are three main approaches (Andrade-Martínez et al., 2022): tools that use protein homology searches against databases, such as VirSorter (Roux et al., 2015), Prophet (Reis-Cunha et al., 2019), PHASTEST (Wishart et al., 2023), and MetaPhinder (Jurtz et al., 2016); machine learning-based tools that detect viral genomic features without a reference, such as VirFinder (Ren et al., 2017), DeepVirFinder (Ren et al., 2020), PPR-Meta (Fang et al., 2019), and PhaMers (Deaton et al., 2019); and hybrid tools that employ machine learning classification, either reference-based or reference-independent, such as VirSorter2 (Nurk et al., 2017; Guo et al., 2021), ViralVerify (Nurk et al., 2017), geNomad (Camargo et al., 2023), Marvel (Amgarten et al., 2018), and VIBRANT (Kieft et al., 2020), which can identify viral sequences, annotate them, and determine genome quality and completeness (Table 1). Each methodology has its limitations: machine learning tools depend on how up to date the training dataset is, while alignment-based tools are limited by how up to date their databases are and by the difficulty of handling large datasets. The best approach would be a combination of results from tools that utilize different methodologies for phage sequence prediction (Andrade-Martínez et al., 2022).
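One simple way to combine results from tools with different methodologies is a majority vote over the contigs each tool flags as viral. The sketch below is only an illustration under stated assumptions: the tool labels, contig names, and the `min_votes` threshold are hypothetical, and each tool's output is assumed to have already been parsed into a set of contig IDs.

```python
from collections import Counter


def consensus_viral_contigs(predictions, min_votes=2):
    """Return contig IDs flagged as viral by at least `min_votes` tools.

    `predictions` maps a tool label to the set of contig IDs that tool
    called viral (hypothetical input, parsed beforehand from each tool).
    """
    votes = Counter()
    for contigs in predictions.values():
        votes.update(set(contigs))  # each tool votes at most once per contig
    return {contig for contig, n in votes.items() if n >= min_votes}


# Toy calls from one tool of each family (all names are made up).
predictions = {
    "homology_tool": {"contig_1", "contig_2", "contig_5"},
    "ml_tool": {"contig_2", "contig_3", "contig_5"},
    "hybrid_tool": {"contig_2", "contig_4"},
}
consensus = consensus_viral_contigs(predictions, min_votes=2)
print(sorted(consensus))
```

Raising `min_votes` trades sensitivity for specificity; the appropriate threshold depends on the dataset and on which tools are combined.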

Contigs obtained from short-read metagenomic sequencing are normally fragmented and may carry misleading information, making further analysis difficult. To help with this issue, tools such as CheckV (Nayfach et al., 2021), ViralComplete (Antipov et al., 2020), or VIBRANT (Kieft et al., 2020), which estimate the completeness of viral genomes and identify possible host contamination, are essential, but they still need improvement because they depend on the viral database and on the tools used (Green et al., 2015; Sutton et al., 2019). In terms of annotation, some of the best-known tools to predict open reading frames (ORFs) are Prodigal (Hyatt et al., 2010), Glimmer (Delcher et al., 2007) and GeneMarkS (Besemer et al., 2001; Andrade-Martínez et al., 2022), but there are tools more specific to virus annotation, such as VIBRANT (Kieft et al., 2020), viral-Eggnog-mapper (Cantalapiedra et al., 2021), DRAM-v (Shaffer et al., 2020), and PHANOTATE (Mcnair et al., 2019; Table 1). They are suitable for viral annotation and can support manual curation of possible false-positive viral predictions, taking into account characteristics such as the number of viral and cellular gene hits, bitscores, absence of viral hallmark genes, and presence of plasmid genes (Guo et al., 2021).

For taxonomic classification, it is currently a challenge to find tools that can classify viral sequences under the latest ICTV taxonomy framework, given the high variability of viruses, their lack of universally conserved genes, and how much about them remains unknown. Kraken 2 (Wood et al., 2019) is a powerful tool for virus taxonomy and identification, and a study by Ho et al. (2023) reported a high F1 score of 0.86 for the correct detection of sequences from a mock community of characterized viruses. However, its detection is limited by similarity to the reference database used, so it is a good option for identifying known viruses; when the discovery of new viruses is the goal, combining Kraken 2 with other tools is advised (Ho et al., 2023). Among the other taxonomy tools, MMSeqs 2 (Steinegger and Söding, 2017) and CAT (von Meijenfeldt et al., 2019) perform protein homology searches against their own databases; VContact 2 (Bin Jang et al., 2019) clusters viral contigs based on shared genes; PhaGCN (Shang et al., 2021) is a deep learning classifier based on gene-sharing networks; and VirusTaxo (Raju et al., 2022) uses a k-mer enrichment database approach (Table 1). All of these tools have customizable databases or the option to retrain their machine-learning models with the latest ICTV taxonomy, which is essential since the ICTV taxonomy changes frequently (Zhu et al., 2022).
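For readers unfamiliar with the F1 score used in such benchmarks, it is the harmonic mean of precision and recall, computed from the numbers of true positives, false positives, and false negatives. The sketch below shows the arithmetic; the counts are invented for illustration and are not taken from Ho et al. (2023).

```python
def precision_recall_f1(true_positives, false_positives, false_negatives):
    """Compute precision, recall, and F1 from classification counts."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


# Hypothetical mock-community result: 80 sequences correctly assigned,
# 10 wrongly assigned, 20 missed (illustrative numbers only).
precision, recall, f1 = precision_recall_f1(80, 10, 20)
print(f"precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
```

Because F1 penalizes both wrong assignments and missed sequences, a classifier restricted to known references can score well on a mock community of characterized viruses while still missing novel ones in environmental data.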

Considering that freshwater environments are expected to contain a considerable percentage of new uncultivated viral genomes (UViGs), phage-host prediction is necessary when a researcher needs to identify their possible hosts. Current methods mainly include oligonucleotide frequency (ONF) similarity analysis (VirHostMatcher) (Ahlgren et al., 2017), k-mer similarity (PHIST) (Zielezinski et al., 2022), CRISPR spacer alignment (Dion et al., 2021), and machine learning algorithms (RaFAH) (Coutinho et al., 2021). For researchers new to metavirome analysis, it might be helpful to use software that integrates the results of other tools, such as iPHoP (Roux et al., 2023), which combines the results of six tools utilizing different methodologies and summarizes the putative taxonomy of phage hosts in a table.
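The idea behind k-mer similarity for host prediction can be illustrated with a toy example: candidate hosts are ranked by how many exact k-mers they share with the phage sequence. This is a minimal sketch and not the implementation of PHIST or any other tool; the function names, the sequences, and the value of k are assumptions, and reverse complements and statistical significance are ignored for brevity.

```python
def kmer_set(sequence, k):
    """Return the set of all k-mers (length-k substrings) in a sequence."""
    sequence = sequence.upper()
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def rank_hosts_by_shared_kmers(phage_seq, host_seqs, k=8):
    """Rank candidate hosts by the number of distinct k-mers shared with the phage."""
    phage_kmers = kmer_set(phage_seq, k)
    scores = {
        name: len(phage_kmers & kmer_set(seq, k))
        for name, seq in host_seqs.items()
    }
    # Highest number of shared k-mers first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy sequences (made up): host_a carries a region identical to the phage.
phage = "ATGCGTACGTTAGC"
hosts = {
    "host_a": "GGATGCGTACGTTAGCAA",
    "host_b": "TTTTTTTTCCCCCCCCGGGG",
}
ranking = rank_hosts_by_shared_kmers(phage, hosts, k=8)
print(ranking)
```

Real genomes are far longer, so production tools rely on efficient k-mer indexing and on significance estimates rather than raw counts.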

The high volume of data produced by metagenomic studies has stimulated the development of tools that simplify the analysis of metagenomic data and can also be applied to metaviromic datasets. Among them, Phyloseq (McMurdie and Holmes, 2013), MicrobiomeAnalyst (Dhariwal et al., 2017), Animalcules (Zhao et al., 2021), and Microeco (Liu et al., 2021) are some of the best-known integrated R packages available (Wen et al., 2023) and offer a rich set of graphics to support the analysis of environmental viruses and their role through metagenomics.
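The diversity statistics these packages report can be illustrated with two common measures computed from a virus count table: richness (the number of taxa observed) and the Shannon index. The sketch below is plain Python for illustration and does not use the API of any of the R packages named above; the sample names and counts are invented.

```python
import math


def richness(counts):
    """Number of taxa with at least one observation."""
    return sum(1 for c in counts if c > 0)


def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln(p_i)) over observed taxa."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


# Hypothetical count table: one row of viral taxon counts per sample.
count_table = {
    "lake_center": [25, 25, 25, 25],  # evenly distributed community
    "shore_site": [97, 1, 1, 1],      # dominated by a single taxon
}
for sample, counts in count_table.items():
    print(sample, richness(counts), round(shannon_index(counts), 3))
```

Both toy samples have the same richness, but the evenly distributed one has the higher Shannon index, which is why richness and diversity are usually reported together.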

4 Applications of phages in freshwater

Safe drinking water is a limited, high-demand resource that gains more research attention as water resources become scarcer worldwide and as multi-resistant water-borne pathogens and overall pollution grow into an ever bigger threat to society (Mathieu et al., 2019). Approximately one-ninth of the global population reportedly lacks access to safe drinking water (Jassim et al., 2016). Given the capacity of phages to infect bacterial hosts, they have recently been used as novel tools in water pollution control, to monitor and treat fresh and wastewater (Ji et al., 2021).

4.1 Bacteriophages as pollution indicators in water

A few methods have applied phages as indicators to evaluate water quality and monitor pathogenic bacteria in wastewater. Phages immobilized on an electrode surface have been used as biorecognition elements in electrochemical impedance spectroscopy (EIS) to detect bacteria such as E. coli, Staphylococcus aureus, and Pseudomonas aeruginosa (Yue et al., 2017; Zhou et al., 2017; Richter et al., 2018). Phages have also been employed as capture elements in combination with nanoparticles for bacterial pathogen detection (Richter et al., 2018), and as biological tools to assess membrane performance and monitor membrane integrity in water treatment facilities (McMinn et al., 2017; Wu et al., 2017; Dias et al., 2018).

A specific group of bacteriophages named crAssphages has been proposed as a potential universal viral indicator of human feces in water bodies (Farkas et al., 2019; Mafumo et al., 2023). CrAssphages were described by Dutilh et al. (2014) as the most abundant phages in the human gut virome. Further studies showed that crAssphages are highly specific to, and abundant in, human feces (Sabar et al., 2022), are highly prevalent in sewage samples (Stachler et al., 2017), and correlate with the presence of human enteric viruses in water (Jennings et al., 2020). Given these characteristics, crAssphages have been advocated over the currently used fecal indicator bacteria (FIB), which poorly explain viral pathogen dynamics in water and have low host specificity, making it difficult to identify the source of contamination (Ward et al., 2020; Toribio-Avedillo et al., 2021; Mafumo et al., 2023). CrAssphage applicability has been evaluated in several countries (Crank et al., 2020; Ward et al., 2020; Sangkaew et al., 2021; Nam et al., 2022) and shows promising possibilities for human fecal contamination detection in freshwater.

4.2 Bacteriophages in water treatment

Another challenge that greatly affects the operation of wastewater treatment systems is floc formation and sludge bulking caused by filamentous microorganisms that proliferate excessively and form thick, viscous foams (Aracic et al., 2015). The studies conducted by Petrovski et al. (2011a,b) showed how phages that can lyse multiple host bacteria can reduce the stability of foams. Additionally, Liu et al. (2015) performed tests in a simulated aeration tank system using isolated Gordonia phages, achieving a significant reduction in the sludge sedimentation volume. However, all these methods are still experimental, as current research still focuses on evaluating and monitoring the behavior of potential phage candidates in wastewater treatment systems (Reisoglu and Aydin, 2023).

Other lines of research have employed phages as low-cost biological control agents to treat specific pathogenic bacteria in sewage. Studies reported the successful inhibition and lysis of drug-resistant A. baumannii (Lin et al., 2010), waterborne disease-causing Vibrio cholerae (Wei et al., 2011), and dysentery-causing Shigella (Jun et al., 2016) through the combination of different phages in co-culture assays. Other studies address the biological control of cyanobacteria, harmful prokaryotes that often cause water blooms such as green or red tides and produce cyanotoxins, which endanger the surrounding wildlife and farmed aquatic animals, threaten human health, and can cause tremendous economic losses (Jassim and Limoges, 2013). The strategy in some of those studies is to isolate and employ cyanophages that effectively reduce phycobilisome proteins and destroy the thylakoid structure of cyanobacteria (Gao et al., 2012; Yoshida-Takashima et al., 2012).

However, in both cases, some problems still emerge in the practical application of phage-based biological control: the emergence of phage-resistant host mutants, the reduction of cyanophage infectivity caused by sunlight irradiation, and the feasibility of multiple-host approaches are challenges still to be overcome. Nonetheless, phage-based technology also has the advantage of reducing the use of chemical reagents, thus reinforcing the appeal of such strategies and interest in their future development (Mathieu et al., 2019; Ji et al., 2021).

5 Current limitations and perspectives

The study of viral sequences in environmental samples is challenging due to the low representativity or fragmentation of DNA in short sequencing data, the high error rate and the large amount of DNA necessary for long-read sequencing (Warwick-Dugdale et al., 2019). As technology advances, improved read length and sequencing quality have partially addressed this issue. This progress has also opened up the opportunity to implement hybrid approaches for sequencing, combining short and long reads that might allow better environmental virus detection, characterization, and understanding of the microdiversity of virus populations (Warwick-Dugdale et al., 2019; Pratama et al., 2021; Andrade-Martínez et al., 2022).

For the identification of phages, common tools employ distinct methods, such as sequence composition, sequence similarity, and machine learning approaches (Titus Brown and Irber, 2016; Fang et al., 2019; Kieft et al., 2020), but there is no standardization for these techniques. Currently, each method yields slightly different results, and phage identification still relies heavily on trial-and-error use of software packages. It is crucial that a gold standard be established to ensure the robustness of methodologies and techniques, thereby enhancing the replicability and reliability of phage identification.

An alternative for an assembly-free, culture-independent study of phages is the analysis of whole phage genomes using long-read sequencing technologies, such as Oxford Nanopore or PacBio (Warwick-Dugdale et al., 2019; Zaragoza-Solas et al., 2022). The advantages of this approach are that it avoids over-fragmentation of sequencing data and that it can use portable sequencing technologies, allowing the researcher to identify phages from natural sources in situ (Warwick-Dugdale et al., 2019). This makes it possible to study phages directly in their natural environment, identifying them and analyzing the samples in real time, which is a significant and desirable feature for the genomic surveillance field (Lisotto et al., 2021).

Most virus databases are derived largely from uncultivated viral genomes (UViGs), which represent >95% of public databases (Roux et al., 2019), leading to another significant problem: most phage-host interactions are obtained solely from in silico predictions based on metagenomes. This lack of laboratory observations implies the absence of a clear understanding of host-phage dynamics in nature (Coclet and Roux, 2021). On the other hand, in silico approaches avoid intrinsic wet-lab biases, such as the identification of false positive or negative viruses due to contamination and the biases introduced during sample collection, storage, genetic material extraction, purification, and sequencing (Cantalupo and Pipas, 2019). However, the lack of this holistic vision might affect the construction of future databases and the scientific interpretation of related results, so it is vital to keep the current limitations of bioinformatics tools in mind and to apply different combinations of analyses to confirm the identity of phages coming from metagenomic data (Roux et al., 2013, 2019).

Statements

Author contributions

CD: Writing – original draft. DM: Writing – original draft. WN: Writing – original draft. OA: Formal analysis, Supervision, Writing – review & editing. RR: Conceptualization, Formal analysis, Supervision, Writing – review & editing.

Funding

The author(s) declare that financial support was received for this article’s research, authorship, and publication. The authors would like to thank the funding agencies Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES) and National Council for Scientific and Technological Development (CNPq), the support by the CNPq project #312316/2022-4, the Secretary of State for Science, Technology, and Professional and Technological Education (SECTET), the Dean’s Office for Research and Graduate Studies/Federal University of Pará–PROPESP/UFPA (PAPQ), and the partnership SECTEC/UFPA/FADESP for the financial support on this work.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmicb.2024.1390726/full#supplementary-material

References

  • 1

    Ahlgren, N. A., Ren, J., Lu, Y. Y., Fuhrman, J. A., and Sun, F. (2017). Alignment-free $d_2^*$ oligonucleotide frequency dissimilarity measure improves prediction of hosts from metagenomically-derived viral sequences. Nucleic Acids Res. 45, 39–53. doi: 10.1093/nar/gkw1002

  • 2

    Alanazi, F., Nour, I., Hanif, A., Al-Ashkar, I., Aljowaie, R. M., and Eifan, S. (2022). Novel findings in context of molecular diversity and abundance of bacteriophages in wastewater environments of Riyadh, Saudi Arabia. PLoS One 17:e0273343. doi: 10.1371/journal.pone.0273343

  • 3

    Amgarten, D., Braga, L. P. P., da Silva, A. M., and Setubal, J. C. (2018). MARVEL, a tool for prediction of bacteriophage sequences in metagenomic bins. Front. Genet. 9:304. doi: 10.3389/fgene.2018.00304

  • 4

    Andrade-Martínez, J. S., Camelo Valera, L. C., Chica Cárdenas, L. A., Forero-Junco, L., López-Leal, G., Moreno-Gallego, J. L., et al. (2022). Computational tools for the analysis of uncultivated phage genomes. Microbiol. Mol. Biol. Rev. 86:e0000421. doi: 10.1128/mmbr.00004-21

  • 5

    Antipov, D., Raiko, M., Lapidus, A., and Pevzner, P. A. (2020). Metaviral SPAdes: assembly of viruses from metagenomic data. Bioinformatics 36, 4126–4129. doi: 10.1093/bioinformatics/btaa490

  • 6

    Aracic, S., Manna, S., Petrovski, S., Wiltshire, J. L., Mann, G., and Franks, A. E. (2015). Innovative biological approaches for monitoring and improving water quality. Front. Microbiol. 6:826. doi: 10.3389/fmicb.2015.00826

  • 7

    Ben Saad, M., Ben Said, M., Bousselmi, L., and Ghrabi, A. (2022). Use of bacteriophage to inactivate pathogenic bacteria from wastewater. J. Environ. Sci. Health A 57, 111–116. doi: 10.1080/10934529.2022.2036551

  • 8

    Bergh, Ø., Børsheim, K. Y., Bratbak, G., and Heldal, M. (1989). High abundance of viruses found in aquatic environments. Nature 340, 467–468. doi: 10.1038/340467a0

  • 9

    Besemer, J., Lomsadze, A., and Borodovsky, M. (2001). GeneMarkS: a self-training method for prediction of gene starts in microbial genomes. Implications for finding sequence motifs in regulatory regions. Nucleic Acids Res. 29, 2607–2618. doi: 10.1093/nar/29.12.2607

  • 10

    Bin Jang, H., Bolduc, B., Zablocki, O., Kuhn, J. H., Roux, S., Adriaenssens, E. M., et al. (2019). Taxonomic assignment of uncultivated prokaryotic virus genomes is enabled by gene-sharing networks. Nat. Biotechnol. 37, 632–639. doi: 10.1038/s41587-019-0100-8

  • 11

    Breitbart, M., Bonnain, C., Malki, K., and Sawaya, N. A. (2018). Phage puppet masters of the marine microbial realm. Nat. Microbiol. 3, 754–766. doi: 10.1038/s41564-018-0166-y

  • 12

    Brown-Jaque, M., Calero-Cáceres, W., and Muniesa, M. (2015). Transfer of antibiotic-resistance genes via phage-related mobile elements. Plasmid 79, 1–7. doi: 10.1016/j.plasmid.2015.01.001

  • 13

    Bruder, K., Malki, K., Cooper, A., Sible, E., Shapiro, J. W., Watkins, S. C., et al. (2016). Freshwater metaviromics and bacteriophages: a current assessment of the state of the art in relation to bioinformatic challenges. Evol. Bioinforma. 12, 25–33. doi: 10.4137/EBO.S38549

  • 14

    Camargo, A. P., Roux, S., Schulz, F., Babinski, M., Xu, Y., Hu, B., et al. (2023). Identification of mobile genetic elements with geNomad. Nat. Biotechnol. doi: 10.1038/s41587-023-01953-y

  • 15

    Cantalapiedra, C. P., Hernández-Plaza, A., Letunic, I., Bork, P., and Huerta-Cepas, J. (2021). eggNOG-mapper v2: functional annotation, orthology assignments, and domain prediction at the metagenomic scale. Mol. Biol. Evol. 38, 5825–5829. doi: 10.1093/molbev/msab293

  • 16

    Cantalupo, P. G., and Pipas, J. M. (2019). Detecting viral sequences in NGS data. Curr. Opin. Virol. 39, 41–48. doi: 10.1016/j.coviro.2019.07.010

  • 17

    Chen, L. X., Zhao, Y., McMahon, K. D., Mori, J. F., Jessen, G. L., Nelson, T. C., et al. (2019). Wide distribution of phage that infect freshwater SAR11 bacteria. mSystems 4:e00410-19. doi: 10.1128/mSystems.00410-19

  • 18

    Chevallereau, A., Pons, B. J., van Houte, S., and Westra, E. R. (2022). Interactions between bacterial and phage communities in natural environments. Nat. Rev. Microbiol. 20, 49–62. doi: 10.1038/s41579-021-00602-y

  • 19

    Coclet, C., and Roux, S. (2021). Global overview and major challenges of host prediction methods for uncultivated phages. Curr. Opin. Virol. 49, 117–126. doi: 10.1016/j.coviro.2021.05.003

  • 20

    Coutinho, F. H., Silveira, C. B., Gregoracci, G. B., Thompson, C. C., Edwards, R. A., Brussaard, C. P. D., et al. (2017). Marine viruses discovered via metagenomics shed light on viral strategies throughout the oceans. Nat. Commun. 8:15955. doi: 10.1038/ncomms15955

  • 21

    Coutinho, F. H., Zaragoza-Solas, A., López-Pérez, M., Barylski, J., Zielezinski, A., Dutilh, B. E., et al. (2021). RaFAH: host prediction for viruses of Bacteria and Archaea based on protein content. Patterns 2:100274. doi: 10.1016/j.patter.2021.100274

  • 22

    Crank, K., Li, X., North, D., Ferraro, G. B., Iaconelli, M., Mancini, P., et al. (2020). CrAssphage abundance and correlation with molecular viral markers in Italian wastewater. Water Res. 184:116161. doi: 10.1016/j.watres.2020.116161

  • 23

    De Mandal, S., Panda, A. K., Kumar, N. S., Bisht, S. S., and Jin, F. (2021). Metagenomics and microbial ecology. Boca Raton: CRC Press.

  • 24

    Deaton, J., Yu, F. B., and Quake, S. R. (2019). Mini-metagenomics and nucleotide composition aid the identification and host association of novel bacteriophage sequences. Adv. Biosyst. 3:e1900108. doi: 10.1002/adbi.201900108

  • 25

    Delcher, A. L., Bratke, K. A., Powers, E. C., and Salzberg, S. L. (2007). Identifying bacterial genes and endosymbiont DNA with Glimmer. Bioinformatics 23, 673–679. doi: 10.1093/bioinformatics/btm009

  • 26

    Dhariwal, A., Chong, J., Habib, S., King, I. L., Agellon, L. B., and Xia, J. (2017). MicrobiomeAnalyst: a web-based tool for comprehensive statistical, visual and meta-analysis of microbiome data. Nucleic Acids Res. 45, W180–W188. doi: 10.1093/nar/gkx295

  • 27

    Dias, E., Ebdon, J., and Taylor, H. (2018). The application of bacteriophages as novel indicators of viral pathogens in wastewater treatment systems. Water Res. 129, 172–179. doi: 10.1016/j.watres.2017.11.022

  • 28

    Dion, M. B., Oechslin, F., and Moineau, S. (2020). Phage diversity, genomics and phylogeny. Nat. Rev. Microbiol. 18, 125–138. doi: 10.1038/s41579-019-0311-5

  • 29

    Dion, M. B., Plante, P. L., Zufferey, E., Shah, S. A., Corbeil, J., and Moineau, S. (2021). Streamlining CRISPR spacer-based bacterial host predictions to decipher the viral dark matter. Nucleic Acids Res. 49, 3127–3138. doi: 10.1093/nar/gkab133

  • 30

    Dutilh, B. E., Cassman, N., McNair, K., Sanchez, S. E., Silva, G. G. Z., Boling, L., et al. (2014). A highly abundant bacteriophage discovered in the unknown sequences of human faecal metagenomes. Nat. Commun. 5:4498. doi: 10.1038/ncomms5498

  • 31

    Fancello, L., Trape, S., Robert, C., Boyer, M., Popgeorgiev, N., Raoult, D., et al. (2013). Viruses in the desert: a metagenomic survey of viral communities in four perennial ponds of the Mauritanian Sahara. ISME J. 7, 359–369. doi: 10.1038/ismej.2012.101

  • 32

    Fang, Z., Tan, J., Wu, S., Li, M., Xu, C., Xie, Z., et al. (2019). PPR-meta: a tool for identifying phages and plasmids from metagenomic fragments using deep learning. Gigascience 8:giz066. doi: 10.1093/gigascience/giz066

  • 33

    Farkas, K., Adriaenssens, E. M., Walker, D. I., McDonald, J. E., Malham, S. K., and Jones, D. L. (2019). Critical evaluation of CrAssphage as a molecular marker for human-derived wastewater contamination in the aquatic environment. Food Environ. Virol. 11, 113–119. doi: 10.1007/s12560-019-09369-1

  • 34

    Gao, E. B., Gui, J. F., and Zhang, Q. Y. (2012). A novel cyanophage with a cyanobacterial nonbleaching protein A gene in the genome. J. Virol. 86, 236–245. doi: 10.1128/JVI.06282-11

  • 35

    Green, J., Rahman, F., Saxton, M., and Williamson, K. (2015). Metagenomic assessment of viral diversity in Lake Matoaka, a temperate, eutrophic freshwater lake in southeastern Virginia, USA. Aquat. Microb. Ecol. 75, 117–128. doi: 10.3354/ame01752

  • 36

    Guo, J., Bolduc, B., Zayed, A. A., Varsani, A., Dominguez-Huerta, G., Delmont, T. O., et al. (2021). VirSorter2: a multi-classifier, expert-guided approach to detect diverse DNA and RNA viruses. Microbiome 9:37. doi: 10.1186/s40168-020-00990-y

  • 37

    Guo, J., Vik, D., Pratama, A. A., Roux, S., and Sullivan, M. (2021). Viral sequence identification SOP with VirSorter2. protocols.io. Available at: https://www.protocols.io/view/viral-sequence-identification-sop-with-virsorter2-5qpvoyqebg4o/v3 (Accessed Jan 5, 2024).

  • 38

    Hayes, S., Mahony, J., Nauta, A., and van Sinderen, D. (2017). Metagenomic approaches to assess bacteriophages in various environmental niches. Viruses 9:127. doi: 10.3390/v9060127

  • 39

    Ho, S. F. S., Wheeler, N. E., Millard, A. D., and van Schaik, W. (2023). Gauge your phage: benchmarking of bacteriophage identification tools in metagenomic sequencing data. Microbiome 11:84. doi: 10.1186/s40168-023-01533-x

  • 40

    Hu, M., Xing, B., Yang, M., Han, R., Pan, H., Guo, H., et al. (2023). Characterization of a novel genus of jumbo phages and their application in wastewater treatment. iScience 26:106947. doi: 10.1016/j.isci.2023.106947

  • 41

    Hurwitz, B. L., and Sullivan, M. B. (2013). The Pacific Ocean Virome (POV): a marine viral metagenomic dataset and associated protein clusters for quantitative viral ecology. PLoS One 8:e57355. doi: 10.1371/journal.pone.0057355

  • 42

    Hyatt, D., Chen, G. L., LoCascio, P. F., Land, M. L., Larimer, F. W., and Hauser, L. J. (2010). Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics 11:119. doi: 10.1186/1471-2105-11-119

  • 43

    Hyman, P. (2019). Phages for phage therapy: isolation, characterization, and host range breadth. Pharmaceuticals 12:35. doi: 10.3390/ph12010035

  • 44

    Jassim, S. A. A., and Limoges, R. G. (2013). Impact of external forces on cyanophage–host interactions in aquatic ecosystems. World J. Microbiol. Biotechnol. 29, 1751–1762. doi: 10.1007/s11274-013-1358-5

  • 45

    Jassim, S. A. A., Limoges, R. G., and El-Cheikh, H. (2016). Bacteriophage biocontrol in wastewater treatment. World J. Microbiol. Biotechnol. 32:70. doi: 10.1007/s11274-016-2028-1

  • 46

    Jennings, W. C., Gálvez-Arango, E., Prieto, A. L., and Boehm, A. B. (2020). CrAssphage for fecal source tracking in Chile: covariation with norovirus, HF183, and bacterial indicators. Water Res. X 9:100071. doi: 10.1016/j.wroa.2020.100071

  • 47

    Ji, M., Liu, Z., Sun, K., Li, Z., Fan, X., and Li, Q. (2021). Bacteriophages in water pollution control: advantages and limitations. Front. Environ. Sci. Eng. 15:84. doi: 10.1007/s11783-020-1378-y

  • 48

    Jun, J. W., Giri, S. S., Kim, H. J., Yun, S. K., Chi, C., Chai, J. Y., et al. (2016). Bacteriophage application to control the contaminated water with Shigella. Sci. Rep. 6:22636. doi: 10.1038/srep22636

  • 49

    Jurtz, V. I., Villarroel, J., Lund, O., Voldby Larsen, M., and Nielsen, M. (2016). MetaPhinder—identifying bacteriophage sequences in metagenomic data sets. PLoS One 11:e0163111. doi: 10.1371/journal.pone.0163111

  • 50

    Kieft, K., Zhou, Z., and Anantharaman, K. (2020). VIBRANT: automated recovery, annotation and curation of microbial viruses, and evaluation of viral community function from genomic sequences. Microbiome 8:90. doi: 10.1186/s40168-020-00867-0

  • 51

    Li, D., Luo, R., Liu, C. M., Leung, C. M., Ting, H. F., Sadakane, K., et al. (2016). MEGAHIT v1.0: a fast and scalable metagenome assembler driven by advanced methodologies and community practices. Methods 102, 3–11. doi: 10.1016/j.ymeth.2016.02.020

  • 52

    Lin, N. T., Chiou, P. Y., Chang, K. C., Chen, L. K., and Lai, M. J. (2010). Isolation and characterization of ϕAB2: a novel bacteriophage of Acinetobacter baumannii. Res. Microbiol. 161, 308–314. doi: 10.1016/j.resmic.2010.03.007

  • 53

    Lisotto, P., Raangs, E. C., Couto, N., Rosema, S., Lokate, M., Zhou, X., et al. (2021). Long-read sequencing-based in silico phage typing of vancomycin-resistant Enterococcus faecium. BMC Genomics 22:758. doi: 10.1186/s12864-021-08080-5

  • 54

    Liu, C., Cui, Y., Li, X., and Yao, M. (2021). Microeco: an R package for data mining in microbial community ecology. FEMS Microbiol. Ecol. 97:fiaa255. doi: 10.1093/femsec/fiaa255

  • 55

    Liu, M., Gill, J. J., Young, R., and Summer, E. J. (2015). Bacteriophages of wastewater foaming-associated filamentous Gordonia reduce host levels in raw activated sludge. Sci. Rep. 5:13754. doi: 10.1038/srep13754

  • 56

    Luo, X. Q., Wang, P., Li, J. L., Ahmad, M., Duan, L., Yin, L. Z., et al. (2022). Viral community-wide auxiliary metabolic genes differ by lifestyles, habitats, and hosts. Microbiome 10:190. doi: 10.1186/s40168-022-01384-y

  • 57

    Mafumo, N., Bezuidt, O. K. I., le Roux, W., and Makhalanyane, T. P. (2023). CrAssphage may be viable markers of contamination in pristine and contaminated river water. mSystems 8:e0128222. doi: 10.1128/msystems.01282-22

  • 58

    Mathieu, J., Yu, P., Zuo, P., Da Silva, M. L. B., and Alvarez, P. J. J. (2019). Going viral: emerging opportunities for phage-based bacterial control in water treatment and reuse. Acc. Chem. Res. 52, 849–857. doi: 10.1021/acs.accounts.8b00576

  • 59

    McMinn, B. R., Ashbolt, N. J., and Korajkic, A. (2017). Bacteriophages as indicators of faecal pollution and enteric virus removal. Lett. Appl. Microbiol. 65, 11–26. doi: 10.1111/lam.12736

  • 60

    McMurdie, P. J., and Holmes, S. (2013). Phyloseq: an R package for reproducible interactive analysis and graphics of microbiome census data. PLoS One 8:e61217. doi: 10.1371/journal.pone.0061217

  • 61

    McNair, K., Zhou, C., Dinsdale, E. A., Souza, B., and Edwards, R. A. (2019). PHANOTATE: a novel approach to gene identification in phage genomes. Bioinformatics 35, 4537–4542. doi: 10.1093/bioinformatics/btz265

  • 62

    Mohiuddin, M., and Schellhorn, H. E. (2015). Spatial and temporal dynamics of virus occurrence in two freshwater lakes captured through metagenomic analysis. Front. Microbiol. 6:960. doi: 10.3389/fmicb.2015.00960

  • 63

    Moon, K., and Cho, J. C. (2021). Metaviromics coupled with phage-host identification to open the viral ‘black box’. J. Microbiol. 59, 311–323. doi: 10.1007/s12275-021-1016-9

  • 64

    Moon, K., Jeon, J. H., Kang, I., Park, K. S., Lee, K., Cha, C. J., et al. (2020). Freshwater viral metagenome reveals novel and functional phage-borne antibiotic resistance genes. Microbiome 8:75. doi: 10.1186/s40168-020-00863-4

  • 65

    Naknaen, A., Suttinun, O., Surachat, K., Khan, E., and Pomwised, R. (2021). A novel jumbo phage PhiMa05 inhibits harmful Microcystis sp. Front. Microbiol. 12:660351. doi: 10.3389/fmicb.2021.660351

  • 66

    Nam, S. J., Hu, W. S., and Koo, O. K. (2022). Evaluation of crAssphage as a human-specific microbial source-tracking marker in the Republic of Korea. Environ. Monit. Assess. 194:367. doi: 10.1007/s10661-022-09918-5

  • 67

    Nayfach, S., Camargo, A. P., Schulz, F., Eloe-Fadrosh, E., Roux, S., and Kyrpides, N. C. (2021). CheckV assesses the quality and completeness of metagenome-assembled viral genomes. Nat. Biotechnol. 39, 578–585. doi: 10.1038/s41587-020-00774-7

  • 68

    Nurk, S., Meleshko, D., Korobeynikov, A., and Pevzner, P. A. (2017). metaSPAdes: a new versatile metagenomic assembler. Genome Res. 27, 824–834. doi: 10.1101/gr.213959.116

  • 69

    Peng, Y., Leung, H. C. M., Yiu, S. M., and Chin, F. Y. L. (2012). IDBA-UD: a de novo assembler for single-cell and metagenomic sequencing data with highly uneven depth. Bioinformatics 28, 1420–1428. doi: 10.1093/bioinformatics/bts174

  • 70

    Petrovski, S., Seviour, R. J., and Tillett, D. (2011a). Prevention of Gordonia and Nocardia stabilized foam formation by using bacteriophage GTE7. Appl. Environ. Microbiol. 77, 7864–7867. doi: 10.1128/AEM.05692-11

  • 71

    Petrovski, S., Seviour, R. J., and Tillett, D. (2011b). Characterization of the genome of the polyvalent lytic bacteriophage GTE2, which has potential for biocontrol of Gordonia-, Rhodococcus-, and Nocardia-stabilized foams in activated sludge plants. Appl. Environ. Microbiol. 77, 3923–3929. doi: 10.1128/AEM.00025-11

  • 72

    Pratama, A. A., Bolduc, B., Zayed, A. A., Zhong, Z. P., Guo, J., Vik, D. R., et al. (2021). Expanding standards in viromics: in silico evaluation of dsDNA viral genome identification, classification, and auxiliary metabolic gene curation. PeerJ 9:e11447. doi: 10.7717/peerj.11447

  • 73

    Raju, R. S., Al Nahid, A., Chondrow Dev, P., and Islam, R. (2022). VirusTaxo: taxonomic classification of viruses from the genome sequence using k-mer enrichment. Genomics 114:110414. doi: 10.1016/j.ygeno.2022.110414

  • 74

    Reddy, S., Kaur, K., Barathe, P., Shriram, V., Govarthanan, M., and Kumar, V. (2022). Antimicrobial resistance in urban river ecosystems. Microbiol. Res. 263:127135. doi: 10.1016/j.micres.2022.127135

  • 75

    Reis-Cunha, J. L., Bartholomeu, D. C., Manson, A. L., Earl, A. M., and Cerqueira, G. C. (2019). ProphET, prophage estimation tool: a stand-alone prophage sequence prediction tool with self-updating reference database. PLoS One 14:e0223364. doi: 10.1371/journal.pone.0223364

  • 76

    Reisoglu, Ş., and Aydin, S. (2023). Bacteriophages as a promising approach for the biocontrol of antibiotic resistant pathogens and the reconstruction of microbial interaction networks in wastewater treatment systems: a review. Sci. Total Environ. 890:164291. doi: 10.1016/j.scitotenv.2023.164291

  • 77

    Ren, J., Ahlgren, N. A., Lu, Y. Y., Fuhrman, J. A., and Sun, F. (2017). VirFinder: a novel k-mer based tool for identifying viral sequences from assembled metagenomic data. Microbiome 5:69. doi: 10.1186/s40168-017-0283-5

  • 78

    Ren, J., Song, K., Deng, C., Ahlgren, N. A., Fuhrman, J. A., Li, Y., et al. (2020). Identifying viruses from metagenomic data using deep learning. Quant. Biol. 8, 64–77. doi: 10.1007/s40484-019-0187-4

  • 79

    Richter, Ł., Janczuk-Richter, M., Niedziółka-Jönsson, J., Paczesny, J., and Hołyst, R. (2018). Recent advances in bacteriophage-based methods for bacteria detection. Drug Discov. Today 23, 448–455. doi: 10.1016/j.drudis.2017.11.007

  • 80

    Rohde, C., Resch, G., Pirnay, J. P., Blasdel, B., Debarbieux, L., Gelman, D., et al. (2018). Expert opinion on three phage therapy related topics: bacterial phage resistance, phage training and prophages in bacterial production strains. Viruses 10:178. doi: 10.3390/v10040178

  • 81

    Rohwer, F., and Edwards, R. (2002). The phage proteomic tree: a genome-based taxonomy for phage. J. Bacteriol. 184, 4529–4535. doi: 10.1128/JB.184.16.4529-4535.2002

  • 82

    Romero-Calle, D., Guimarães Benevides, R., Góes-Neto, A., and Billington, C. (2019). Bacteriophages as alternatives to antibiotics in clinical care. Antibiotics 8:138. doi: 10.3390/antibiotics8030138

  • 83

    Roux, S., Adriaenssens, E. M., Dutilh, B. E., Koonin, E. V., Kropinski, A. M., Krupovic, M., et al. (2019). Minimum information about an uncultivated virus genome (MIUViG). Nat. Biotechnol. 37, 29–37. doi: 10.1038/nbt.4306

  • 84

    Roux, S., Camargo, A. P., Coutinho, F. H., Dabdoub, S. M., Dutilh, B. E., Nayfach, S., et al. (2023). iPHoP: an integrated machine learning framework to maximize host prediction for metagenome-derived viruses of archaea and bacteria. PLoS Biol. 21:e3002083. doi: 10.1371/journal.pbio.3002083

  • 85

    Roux, S., Emerson, J. B., Eloe-Fadrosh, E. A., and Sullivan, M. B. (2017). Benchmarking viromics: an in silico evaluation of metagenome-enabled estimates of viral community composition and diversity. PeerJ 5:e3817. doi: 10.7717/peerj.3817

  • 86

    Roux, S., Enault, F., Hurwitz, B. L., and Sullivan, M. B. (2015). VirSorter: mining viral signal from microbial genomic data. PeerJ 3:e985. doi: 10.7717/peerj.985

  • 87

    Roux, S., Krupovic, M., Debroas, D., Forterre, P., and Enault, F. (2013). Assessment of viral community functional potential from viral metagenomes may be hampered by contamination with cellular sequences. Open Biol. 3:130160. doi: 10.1098/rsob.130160

  • 88

    Sabar, M. A., Honda, R., and Haramoto, E. (2022). CrAssphage as an indicator of human-fecal contamination in water environment and virus reduction in wastewater treatment. Water Res. 221:118827. doi: 10.1016/j.watres.2022.118827

  • 89

    Sangkaew, W., Kongprajug, A., Chyerochana, N., Ahmed, W., Rattanakul, S., Denpetkul, T., et al. (2021). Performance of viral and bacterial genetic markers for sewage pollution tracking in tropical Thailand. Water Res. 190:116706. doi: 10.1016/j.watres.2020.116706

  • 90

    Santiago-Rodriguez, T. M., and Hollister, E. B. (2023). Viral metagenomics as a tool to track sources of fecal contamination: a one health approach. Viruses 15:236. doi: 10.3390/v15010236

  • 91

    Shaffer, M., Borton, M. A., McGivern, B. B., Zayed, A. A., La Rosa, S. L., Solden, L. M., et al. (2020). DRAM for distilling microbial metabolism to automate the curation of microbiome function. Nucleic Acids Res. 48, 8883–8900. doi: 10.1093/nar/gkaa621

  • 92

    Shang, J., Jiang, J., and Sun, Y. (2021). Bacteriophage classification for assembled contigs using graph convolutional network. Bioinformatics 37, i25–i33. doi: 10.1093/bioinformatics/btab293

  • 93

    Stachler, E., Kelty, C., Sivaganesan, M., Li, X., Bibby, K., and Shanks, O. C. (2017). Quantitative CrAssphage PCR assays for human fecal pollution measurement. Environ. Sci. Technol. 51, 9146–9154. doi: 10.1021/acs.est.7b02703

  • 94

    Steinegger, M., and Söding, J. (2017). MMseqs2 enables sensitive protein sequence searching for the analysis of massive data sets. Nat. Biotechnol. 35, 1026–1028. doi: 10.1038/nbt.3988

  • 95

    Sutton, T. D. S., Clooney, A. G., Ryan, F. J., Ross, R. P., and Hill, C. (2019). Choice of assembly software has a critical impact on virome characterisation. Microbiome 7:12. doi: 10.1186/s40168-019-0626-5

  • 96

    Titus Brown, C., and Irber, L. (2016). Sourmash: a library for MinHash sketching of DNA. J. Open Source Softw. 1:27. doi: 10.21105/joss.00027

  • 97

    Toribio-Avedillo, D., Blanch, A. R., Muniesa, M., and Rodríguez-Rubio, L. (2021). Bacteriophages as fecal pollution indicators. Viruses 13:1089. doi: 10.3390/v13061089

  • 98

    Touchon, M., Moura de Sousa, J. A., and Rocha, E. P. (2017). Embracing the enemy: the diversification of microbial gene repertoires by phage-mediated horizontal gene transfer. Curr. Opin. Microbiol. 38, 66–73. doi: 10.1016/j.mib.2017.04.010

  • 99

    Twort, F. W. (1915). An investigation on the nature of ultra-microscopic viruses. Lancet 186, 1241–1243. doi: 10.1016/S0140-6736(01)20383-3

  • 100

    von Meijenfeldt, F. A. B., Arkhipova, K., Cambuy, D. D., Coutinho, F. H., and Dutilh, B. E. (2019). Robust taxonomic classification of uncharted microbial sequences and bins with CAT and BAT. Genome Biol. 20:217. doi: 10.1186/s13059-019-1817-x

  • 101

    Ward, L. M., Ghaju Shrestha, R., Tandukar, S., Sherchand, J. B., Haramoto, E., and Sherchan, S. P. (2020). Evaluation of CrAssphage marker for tracking fecal contamination in river water in Nepal. Water Air Soil Pollut. 231:282. doi: 10.1007/s11270-020-04648-1

  • 102

    Warwick-Dugdale, J., Solonenko, N., Moore, K., Chittick, L., Gregory, A. C., Allen, M. J., et al. (2019). Long-read viral metagenomics captures abundant and microdiverse viral populations and their niche-defining genomic islands. PeerJ 7:e6800. doi: 10.7717/peerj.6800

  • 103

    Wei, Y., Kirby, A., and Levin, B. R. (2011). The population and evolutionary dynamics of Vibrio cholerae and its bacteriophage: conditions for maintaining phage-limited communities. Am. Nat. 178, 715–725. doi: 10.1086/662677

  • 104

    Wen, T., Niu, G., Chen, T., Shen, Q., Yuan, J., and Liu, Y. X. (2023). The best practice for microbiome analysis using R. Protein Cell 14, 713–725. doi: 10.1093/procel/pwad024

  • 105

    Wishart, D. S., Han, S., Saha, S., Oler, E., Peters, H., Grant, J. R., et al. (2023). PHASTEST: faster than PHASTER, better than PHAST. Nucleic Acids Res. 51, W443–W450. doi: 10.1093/nar/gkad382

  • 106

    Wood, D. E., Lu, J., and Langmead, B. (2019). Improved metagenomic analysis with Kraken 2. Genome Biol. 20:257. doi: 10.1186/s13059-019-1891-0

  • 107

    Wu, B., Wang, R., and Fane, A. G. (2017). The roles of bacteriophages in membrane-based water and wastewater treatment processes: a review. Water Res. 110, 120–132. doi: 10.1016/j.watres.2016.12.004

  • 108

    Yoshida-Takashima, Y., Yoshida, M., Ogata, H., Nagasaki, K., Hiroishi, S., and Yoshida, T. (2012). Cyanophage infection in the bloom-forming Cyanobacteria Microcystis aeruginosa in surface freshwater. Microbes Environ. 27, 350–355. doi: 10.1264/jsme2.ME12037

  • 109

    Yue, H., He, Y., Fan, E., Wang, L., Lu, S., and Fu, Z. (2017). Label-free electrochemiluminescent biosensor for rapid and sensitive detection of Pseudomonas aeruginosa using phage as highly specific recognition agent. Biosens. Bioelectron. 94, 429–432. doi: 10.1016/j.bios.2017.03.033

  • 110

    Zaragoza-Solas, A., Haro-Moreno, J. M., Rodriguez-Valera, F., and López-Pérez, M. (2022). Long-read metagenomics improves the recovery of viral diversity from complex natural marine samples. mSystems 7:e0019222. doi: 10.1128/msystems.00192-22

  • 111

    Zhang, M., Zhang, T., Yu, M., Chen, Y. L., and Jin, M. (2022). The life cycle transitions of temperate phages: regulating factors and potential ecological implications. Viruses 14:1904. doi: 10.3390/v14122818

  • 112

    Zhao, Y., Federico, A., Faits, T., Manimaran, S., Segrè, D., Monti, S., et al. (2021). Animalcules: interactive microbiome analytics and visualization in R. Microbiome 9:76. doi: 10.1186/s40168-021-01013-0

  • 113

    Zhou, Y., Marar, A., Kner, P., and Ramasamy, R. P. (2017). Charge-directed immobilization of bacteriophage on nanostructured electrode for whole-cell electrochemical biosensors. Anal. Chem. 89, 5734–5741. doi: 10.1021/acs.analchem.6b03751

  • 114

    Zhu, Y., Shang, J., Peng, C., and Sun, Y. (2022). Phage family classification under Caudoviricetes: a review of current tools using the latest ICTV classification framework. Front. Microbiol. 13:1032186. doi: 10.3389/fmicb.2022.1032186

  • 115

    Zielezinski, A., Deorowicz, S., and Gudyś, A. (2022). PHIST: fast and accurate prediction of prokaryotic hosts from metagenomic viral sequences. Bioinformatics 38, 1447–1449. doi: 10.1093/bioinformatics/btab837

Summary

Keywords

freshwater, phage, bacteriophage, cyanophage, virome

Citation

Dantas CWD, Martins DT, Nogueira WG, Alegria OVC and Ramos RTJ (2024) Tools and methodology to in silico phage discovery in freshwater environments. Front. Microbiol. 15:1390726. doi: 10.3389/fmicb.2024.1390726

Received

23 February 2024

Accepted

16 May 2024

Published

31 May 2024

Volume

15 - 2024

Edited by

Marcin Łoś, University of Gdansk, Poland

Reviewed by

Przemyslaw Decewicz, University of Warsaw, Poland

Copyright

*Correspondence: Rommel Thiago Jucá Ramos,

†These authors have contributed equally to this work and share first authorship
