Recovery of High Quality Metagenome-Assembled Genomes From Full-Scale Activated Sludge Microbial Communities in a Tropical Climate Using Longitudinal Metagenome Sampling

The analysis of metagenome data based on the recovery of draft genomes (so called metagenome-assembled genomes, or MAG) has assumed an increasingly central role in microbiome research in recent years. Microbial communities underpinning the operation of wastewater treatment plants are particularly challenging targets for MAG analysis due to their high ecological complexity, and remain important, albeit understudied, microbial communities that play ssa key role in mediating interactions between human and natural ecosystems. Here we consider strategies for recovery of MAG sequence from time series metagenome surveys of full-scale activated sludge microbial communities. We generate MAG catalogs from this set of data using several different strategies, including the use of multiple individual sample assemblies, two variations on multi-sample co-assembly and a recently published MAG recovery workflow using deep learning. We obtain a total of just under 9,100 draft genomes, which collapse to around 3,100 non-redundant genomic clusters. We examine the strengths and weaknesses of these approaches in relation to MAG yield and quality, showing that co-assembly may offer advantages over single-sample assembly in the case of metagenome data obtained from closely sampled longitudinal study designs. Around 1,000 MAGs were candidates for being considered high quality, based on single-copy marker gene occurrence statistics, however only 58 MAG formally meet the MIMAG criteria for being high quality draft genomes. These findings carry broader broader implications for performing genome-resolved metagenomics on highly complex communities, the design and implementation of genome recoverability strategies, MAG decontamination and the search for better binning methodology.

The analysis of metagenome data based on the recovery of draft genomes (so called metagenome-assembled genomes, or MAG) has assumed an increasingly central role in microbiome research in recent years. Microbial communities underpinning the operation of wastewater treatment plants are particularly challenging targets for MAG analysis due to their high ecological complexity, and remain important, albeit understudied, microbial communities that play ssa key role in mediating interactions between human and natural ecosystems. Here we consider strategies for recovery of MAG sequence from time series metagenome surveys of full-scale activated sludge microbial communities. We generate MAG catalogs from this set of data using several different strategies, including the use of multiple individual sample assemblies, two variations on multi-sample co-assembly and a recently published MAG recovery workflow using deep learning. We obtain a total of just under 9,100 draft genomes, which collapse to around 3,100 non-redundant genomic clusters. We examine the strengths and weaknesses of these approaches in relation to MAG yield and quality, showing that co-assembly may offer advantages over single-sample assembly in the case of metagenome data obtained from closely sampled longitudinal study designs. Around 1,000 MAGs were candidates for being considered high quality, based on single-copy marker gene occurrence statistics, however only 58 MAG formally meet the MIMAG criteria for being high quality draft genomes. These findings carry broader broader implications for performing genome-resolved metagenomics on highly complex communities, the design and implementation of genome recoverability strategies, MAG decontamination and the search for better binning methodology.

INTRODUCTION
Over the course of the last half decade the use of genomeresolved metagenome analysis has become a common approach for dealing with whole community metagenome data collected from microbiomes and complex microbial communities (Quince et al., 2017b). Starting with deeply sequenced genomic DNA, metagenome assembly is performed in order to reconstruct short fragments of the underlying member genomes, which are then analyzed further using data clustering procedures (genome binning Sangwan et al., 2016) with the objective of recovering draft genomes of the member species, referred to as metagenome-assembled genomes (MAG). This approach, now readily deployable due to the availability of near-automated bioinformatics workflows (Uritskiy et al., 2018;Kieser et al., 2020), has been successfully used on a great variety of microbial communities (Hu et al., 2016;Parks et al., 2017;Tully et al., 2018;Nayfach et al., 2019;Pasolli et al., 2019;Stewart et al., 2019;Almeida et al., 2021;Singleton et al., 2021) and has resulted in recovery of draft genomes for many new species that would have most likely remained uncharacterised due to a lack of knowledge of their required culture conditions (Parks et al., 2017).
Despite impressive accomplishments, the MAG approach still harbors many challenges and limitations. By nature, short read metagenome assemblies remain highly fractionated, resulting from the limited ability of short read sequencing to accurately capture complex repeat regions (Chen et al., 2020) and the difficulties encountered in reconstructing sequence from closely related strains or sub-species (Quince et al., 2017aBertrand et al., 2019;Vicedomini et al., 2021). In practice a draft genome obtained from these methods would contain at best, tens and, more typically, hundreds, of distinct contigs, and so there are inherent difficulties in accurately determining the degree of genome completeness and the extent of contamination from non-cognate genomes (Chen et al., 2020), and in identifying the presence of horizontally transferred sequence (Douglas and Langille, 2019). Another limitation relates to impact of the ecogenomic complexity of the community under study, both in terms of genomic diversity, particularly at sub-species or strain level, but also in terms of overall community richness and evenness (Quince et al., 2017b). When applied to microbial communities of high complexity, a typical MAG analysis will return many draft genomes of low quality, as defined by currently accepted criteria (Bowers et al., 2017).
Some of these challenges may be addressed using emerging methods, such as long-read sequencing (Arumugam et al., 2021;Singleton et al., 2021), synthetic long-read methods (Bishara et al., 2018) and adaptations of chromosome conformation capture methods (DeMaere and Darling, 2019;Bickhart et al., Abbreviations: AOB, ammonium oxidizing bacteria; BAM, binary alignment map (files); DNA, deoxyribonucleic acid; EBPR, enhanced biological phosphorus removal; GAO, glycogen accumulating organism; HQ, high quality (MAG); LQ, low quality (MAG); MAG, metagenome assembled genome; MIMAG, Minimum Information about a Metagenome-Assembled Genome; MQ, medium quality (MAG); NOB, nitrite oxidizing bacteria; PAO, polyphosphate-accumulating organisms; pHQ, putative high quality (MAG); UC, unclassified (MAG); VAMB, variational autoencoders for metagenomic binning. 2022). However, all of these new techniques are themselves complex and contain their own limitations, and since the vast majority of non-amplicon metagenome data has been collected using Illumina shotgun sequencing, there remains a clear need to develop more refined methods to recover genomes from short read metagenome assemblies.
Complex microbial communities associated with full-scale wastewater treatment plants (activated sludge) are particularly challenging targets for MAG-based analyses due to high species richness, high species evenness and extent of genetic diversity (Law et al., 2016;Pérez et al., 2019;Yang et al., 2020;Ye et al., 2020;Singleton et al., 2021). Recent comparative analyses undertaken with amplicon sequencing surveys suggest that these activated sludge communities are more complex than the host-associated microbiomes, including the human fecal microbiomes, by an order of magnitude (Wu et al., 2019). To date, several MAG-based analyses of activated sludge communities have been reported, varying in sequencing depth, raw sequence and availability of recovered genome (MAG) sequence, including one recently published study that employed long-read metagenomics (Singleton et al., 2021). In this paper, we consider strategies for recovery of MAG sequence from time series metagenome surveys of full-scale activated sludge microbial communities. We generate MAG catalogs from this set of data using several different strategies, including the use of multiple individual sample assemblies, two variations on multisample co-assembly and a recently published MAG recovery workflow using deep learning (Nissen et al., 2021). We examine the strengths and weaknesses of these approaches in relation to MAG yield and quality, and present a catalog of nonredundant draft genomes comprised of at least putatively high quality under the MIMAG criteria. All raw data and high quality MAG sequence have been made available via NCBI (BioProject Accession PRJNA731554), and key data products, including metagenome assemblies and the complete set of recovered MAG sequence data, are being made publicly available on Zenodo (doi: 10.5281/zenodo.5215738).

Summary of Data Obtained and Overall Study Design
As part of a long-term sampling project surveying the microbial ecology of wastewater treatment in tropical climates, we sampled activated sludge from aerobic-stage tanks in a full-scale wastewater treatment plant in Singapore, known to perform enhanced biological phosphorus removal (EBPR) and previously studied by us in Law et al. (2016), obtaining 24 samples over approximately a 10 month period. The median sampling interval was 7 days (mean 13 days, with range 7-56 days). At each sampling event, we obtained samples for DNA extraction from the aerobic treatment tank (including a panel of coassayed physico-chemical measurements), and performed whole community shotgun metagenome sequencing on all samples. In total, we obtained 1.5 billion reads with a mean of 62.6 M reads per sample (range: 45.7-101.4 M; Supplementary Table 1). From these data we constructed catalogs of metagenome-assembled genomes (MAG) using several approaches as described below.
In our primary analysis, we performed both individual sample assembly of data from each of the 24 samples and co-assembly of the same ensemble of data (see Section 5.2), in order to formally compare the results of each of these two major approaches to MAG-based analysis. Metagenome assembly was performed using metaSPAdes (Nurk et al., 2017) and genome binning was performed using MetaBAT2 (Kang et al., 2019) in both cases. From the co-assembly, we generated two sets of MAGs, one using coverage profiles generated across all 24 samples and the other generated using the entire read set treated as a single meta-sample (see Section 5.2), which we refer to as multi-BAM and single-BAM co-assembly binning, respectively. As a secondary analysis, we performed metagenome binning using a recently published deep learning workflow called VAMB (Nissen et al., 2021), which is described later in the article.

General Features of MetaBAT2-Based MAG Recovery
A total of 7,138 MAGS were recovered from the three types of assembly-binning workflows. Between 94 and 273 MAGs (mean 143) were obtained from each individual sample assembly, with a total of 3,429 MAGs being generated from all 24 individual assemblies ( Table 1 and Supplementary Table 2). Approximately 10% and 27% of individual sample assembly MAGs were candidates for being high (pHQ) and medium quality (MQ) under the MIMAG criteria (Bowers et al., 2017) (see Section 5.8). The single-BAM and multi-BAM co-assembly binning workflows returned 1,997 and 1,712 MAGs, respectively ( Table 1). The proportions of pHQ-and MQ-MAGs obtained from co-assemblies were higher compared to those observed from the ensemble of individual sample assemblies (Table 1), with 14.3% and 17.7% being classifiable as pHQ-MAGs in the single-BAM and multi-BAM co-assembly binning, respectively, and approximately 30% of MAG from each type of co-assembly binning workflow, holding MQ status.
The proportion of reads mapped to co-assemblies was higher (mean 92%; n=2) than the proportion observed to map to individual sample assemblies (mean 67%, n=24) (Supplementary Table 2). At the level of recovered MAG sequence, on average 27.5% (range: 21.1-37.3%) of reads were mappable to MAG recovered from single sample assemblies, whereas for MAGs recovered from single-BAM or multi-BAM co-assembly, this increased to 67.4 and 69.3%, respectively.
As expected, estimated MAG genome quality demonstrated a strong association with relative abundance-expressed as a normalized coverage measure that permits comparisons across workflows whose variable number of input sequence reads would bias estimation (see Section 5.2)-with the proportion of pHQ MAGs being highest in the top 10%-ile of the normalized coverage, and decreasing in a roughly uniform manner thereafter (Figure 1). A similar trend was observed for MQ-MAGs, and the proportion of poor quality MAGs expanded in the bottom 50% of the normalized coverage distribution.
Given the expected high degree of genomic redundancy among the complete set of 7,138 MAGs generated from the three assembly-binning workflows employed, the entire set was de-replicated and grouped into non-redundant genome clusters [secondary clusters as defined by the dRep workflow (Olm et al., 2017); see Section 5.9]. In total 2,912 non-redundant clusters where obtained, comprised of between 1 and 26 MAGs (median 2; mean 2.45) (Supplementary Table 3). Of these 2,912 secondary clusters, 382 (13.1%) contained at least one MAG that was pHQ, and 690 (23.7%) contained MAGs that were MQ at best, with the remainder containing MAGs of either low quality (LQ; n =1,576; 54.1%) or else unclassifiable (UC; n = 264; 9.1%).
To provide some insight into the extent to which the recovered MAGs are representative of the underling community composition, we screened all assembled contigs for the presence of 16S SSU-rRNA genes using BAsic Rapid Ribosomal RNA Predictor (Barrnap version 0.9) [ref] and then dereplicated these sequences using CD-hit [ref] (see Section 5), under the simplifying assumption that the total number of non-redundant could be considered an rough approximation for the lower limit of overall community complexity. The number of non-redundant 16S sequences was observed to be 3,199 (from 3,489 detected in total) from the co-assembly and 5,381 from all 24 single sample assemblies (from 14,591 detected in total). Thus, we based on the set of the 2,912 secondary clusters of MAGs defined above, the assembly-binning approach is probably capturing, at most, no more than 55% (29,12/5,381) of the recoverable taxonomic content of the community.

Comparative Analysis of Binning Strategies Using MetaBAT2
To gain further insight into the effectiveness and interrelationship of each type of genome recovery workflow, the set of 2,912 non-redundant clusters were further categorized according to the the types of workflow which had contributed at least one genome to a given non-redundant cluster FIGURE 1 | Relationship between genome quality and relative abundance within the MetaBAT2-derived MAG catalogs obtained in this study. Relative abundance is inferred using a normalised measure of coverage, permitting comparison of mean MAG coverage among the three different assembly-binning workflows (see Table 1). Within each decile of the normalised coverage distribution, the numbers of MAG meeting each of the four MIMAG-derived quality levels is shown. Table 4 and Figure 2A). Of these 2,912 secondary clusters, 346 (11.9%) contained genomes contributed from both individual sample assemblies and both types of co-assembly binning procedure, and 1,070 (36.8%) contained MAGs recovered from both types of co-assembly but not from any individual sample assembly (Figure 2A). Relatively few MAGs were observed arising from individual sample assembly and from either, but not both, types of co-assembly (23 and 14 secondary clusters, respectively, against single-BAM and multi-BAM; multi-BAM; Figure 2A). In contrast, we observed substantial numbers of secondary clusters that were only comprised of MAGs obtained from within one of the three workflows (Figure 2A).

(Supplementary
We then examined how these associations were patterned by genome quality and relative abundance, using a composite quality statistic as defined in the dRep pipeline and a normalized measure of MAG coverage that adjusted for differences in coverage that are present across the three types of workflows ( Figures 2B-H). Each secondary clusters was represented by the best quality MAG observed in that cluster, as defined by the maximum dRep quality score within the highest MAG quality category from that cluster.
We observed that the 346 secondary clusters comprised of MAGs recovered from all three workflows had the highest overall coverage and over half of these secondary clusters contained at least one pHQ genome (189/346 or 54.6%). In the larger set of 1070 secondary clusters that arose from both types of coassembly workflow, 185 (17.3%) and 483 (45.1%) of these held at least one genome of pHQ and MQ level, respectively. These secondary clusters were also distributed across a lower coverage range than the previous category ( Figure 2C), consistent with the expectation that co-assembly procedures can recover genomes of less common taxa. Of the remaining set of 1,496 secondary clusters from the remaining five categories there were only 8 (0.54%) which held candidates for being pHQ-MAGs, with being pHQ-MAGs, with the remainder holding unremarkable or frankly poor quality (Figures 2D-H).
We then undertook several secondary analyses to examine whether co-assembly or individual sample assembly showed any inherent biases in genome quality (Figure 3). Firstly, for secondary clusters that contained MAGs from all three workflows (Figures 2A,B) we examined the proportion of pHQ-MAG in secondary cluster that came from either type of co-assembly or from an individual assembly, but observed no clear pattern in relation to the origin of pHQ-MAGS ( Figure 3A). Secondly, we compared completeness and contamination statistics within a subset of 48 secondary clusters that contained at least one pHQ genome sourced from co-assembly and at least one pHQ genome FIGURE 2 | Inter-relationships between genome quality, relative abundance and recovery workflow type for 7.138 MAGs recovered from the use of MetaBAT2. (A) secondary cluster membership by workflow 7,138 MAG categorised into 2,912 secondary clusters; the remaining panels show the highest observed genome quality with a secondary cluster (vertical axis) against normalised coverage measure (horizontal axis) for secondary clusters (B) containing MAGs from all three workflows (n=346); (C) recovered from both single-BAM and multi-BAM co-assembly workflows (n=1070); (D) recovered from both individual assemblies and multi-BAM co-assembly (n=14); (E) recovered from both individual assemblies and single-BAM co-assembly (n = 23); (F-H) recovered solely from multi-BAM co-assembly (n = 282), single-BAM co-assembly (n = 557) and individual assemblies (n = 620), respectively. Secondary clusters containing pHQ, MQ, and LQ MAGs are coloured in red, blue, and grey, respectively.
from an individual assembly. Removing all genomes that did not attain pHQ status, on average this subset of secondary clusters contained 1.8 pHQ-MAGs (range: 1-2) sourced from the coassembly workflows and 6.3 pHQ-MAGs (range: 1-22) arising from the individual assembly workflow. We calculated median completeness and median contamination within each secondary cluster, conditioned on workflow type, observing a bias toward higher completeness (Figures 3B,D) and a lower contamination (Figures 3C,D) in co-assembled genomes relative to genomes obtained from individual assemblies.
Collectively, these data suggest that if we focus attention on recovered genomes that are plausibly of high quality, then these results indicate, in the communities studied here, that co-assembly conveys advantages in regards to MAG yield.

Decontamination of Recovered Draft Genomes
To improve the number of high quality MAGs produced from the workflows above, we applied RefineM (Parks et al., 2017) to all MAGs from the three assembly procedures that possessed completeness of more than 90% (1,307 MAGs, contributed from 550 secondary clusters) regardless of their contamination and strain heterogeneity levels, as calculated by CheckM (Parks et al., 2015), considering these suitable candidates for refinement to high quality.
The results of the decontamination analysis are summarized in Table 2A. Of the 1,307 MAGs, 929 (71.1%) were classified as pHQ prior to decontamination, and of these, 855 (92.0%) retained the same quality level after application of RefineM. Of the remaining 74 pHQ-MAGs, the majority 94.6% were converted to MQ, with only four being reduced to LQ status. Of the 127 MAGs originally classified as MQ, 35.4% (45/127) attained pHQ status following application of RefineM. In the set of 251 MAGs that held UC status before decontamination, only 7.2% and 16.7% improved their quality to pHQ and MQ, respectively, suggesting that most MAGs that hold contamination above 10% are likely to be of highly flawed construction.

Genomes Recovered Using a Deep Variational Autoencoder Workflow
As a secondary, complementary analysis to the canonical approach taken above, we performed genome recovery using FIGURE 3 | Examination of genome quality in MAGs recovered among different MetaBAT2-derived workflows. Each data point references a secondary cluster of highly related genomes obtained from the dRep workflow. (A) proportion of MAGs within a secondary cluster that attain pHQ quality status from co-assembly workflows (x-axis) and individual assemblies (y-axis); (B-D) comparison of genome quality measures from within 48 secondary clusters in which at least one pHQ MAG is observed from a co-assembly workflow and from an individual assembly; (B) median completeness (C p ) observed from co-assemblies (x-axis) and individual assemblies (y-axis); (C) median contamination (C n ) observed from co-assemblies (x-axis) and individual assemblies (y-axis); (D) relationship between the difference in median completeness observed in co-assemblies with respect to individual assemblies (y-axis) and difference in median contamination (co-assemblies with respect to individual assemblies) (x-axis).
a recently described workflow called VAMB that utilizes deep variational autoencoders (Nissen et al., 2021). Using data from the 24 individual-sample assemblies, VAMB generated 1,941 MAGs of minimum total sequence length of 200 kbp (to match that used by the default MetaBAT2-based workflows).
Of the recovered draft genomes, 8.0, 24.5, 66.6, and 0.9% were classified as pHQ, MQ, LQ and UC, respectively ( Table 1 and Supplementary Table 6). The pHQ-MAGs from VAMB were strongly associated with those detected by the MetaBAT2 workflows, with only 1 and 5 secondary clusters containing pHQ-MAGs ( Figure 4A) and MQ-MAGs ( Figure  4B), respectively, that were not recovered by any other workflow. While a substantive number of secondary clusters containing LQ-MAGs were recovered by VAMB (Table 1 and Figure 4C), interestingly, the number of secondary clusters containing UC-MAGs was two orders of magnitude lower than the number observed in the MetaBAT2 workflows ( Figure 4D). This suggests that the VAMB methodology may likely provide superior control of gross contamination, although possibly at the expense of recovery of more complete, higher quality MAGs.
As above, we applied RefineM workflow to the 175 MAGs with completeness above 90% (contributed from 36 distinct secondary clusters), which were primarily comprised of pHQ-MAGs (n = 156, 89.14%). After application of the RefineM workflow, 59% of pHQ-MAGs retained their quality status, and 38% were reduced to MQ-status. The numbers of MAGs in remaining categories was low ( Table 2B). The average number of contigs removed by RefineM was 53 (range: 0-495), and completeness, contamination and strain heterogeneity were reduced on average by 4.6%, 1.4% and 1.3 units, respectively (Supplementary Table 6).

Catalog of High Quality Genomes From Tropical Climate Activated Sludge
The entire set of MAGs recovered from all four sources were combined into a single set of 9,079 bins (7,138 bins Across the 382 secondary clusters, only 14 (3.7%) were comprised of MAGs annotated to species level and 155 (40.6%) were annotated to genus level, highlighting that over half the recovered pHQ MAGs were likely to be previously uncharacterised. Species-level annotations were observed for the polyphosphate accumulating organisms (PAO) Candidatus Accumulibacter SK-02 (n = 2 MAGs)  and the cyanobacteria Obscuribacter phosphatis (Soo et al., 2014;Stokholm-Bjerregaard et al., 2017) (n = 2 MAGs), and the glycogen accumulating organism, Candidatus Competibacter (McIlroy et al., 2014) (n = 1 MAG). Interestingly, we recovered a single MAG from Romboutsia timonensis, a member of the human gut microbiome (Ricaboni et al., 2016), and to our knowledge not previously identified in activated sludge communities, and genomes of the methaneoxidizing bacteria Methylosarcina fibrata (Hamilton et al., 2015) (n=2 MAG). Genomes from the denitrifier Hyphomicrobium denitrificans (Martineau et al., 2014) were recovered (n=2 MAG), along with genomes from two species within the UBA2359 lineage within order Chitinophagales (GTDB), namely Sphingobacteriales bacterium TSM_CSS and Sphingobacteriales bacterium TSM_CSM and genomes from recently identified novel lineages in phyla Bacteroidetes, Chloroflexi and Chlorobi (see Supplementary Table 8 for full details of species-level identifications).
The ammonia-oxidizing bacteria (AOB), Nitrosomonas When we applied the more stringent form of the MIMAG criteria for high quality genomes (Bowers et al., 2017), that is, those with at least 18 tRNAs and the presence of a complete set of rRNA genes, only 58 HQ-MAGs were identified. In addition, 6 more high quality MAGs were recovered from RefineM pipeline, resulting in a total 64 high quality MAGs submitted to NCBI.

Comparison to Other MAG Catalogs Recovered From Activated Sludge
We systematically compared our MAG catalog to several others that have been recently obtained (Ye et al., 2020;Singleton et al., 2021), using genome de-replication (see Section 5.9) and same criteria recently used in a comparative analysis of MAG catalogs from multiple cow rumen microbiomes (Watson, 2021). Collectively, this analysis defined a total of 6,328 secondary clusters, containing on average 1.9 MAGs (median 1.0; range 1-54 MAG). We examined the membership of these secondary clusters in relation to catalog of origin (Figure 6). Only 7 secondary clusters contained genomes from all three source catalogs. A larger proportion of related genomes (n = 314) was observed between our catalog and that of Ye et al. (2020), than between our study and the catalog of Singleton et al. (2021), which may reflect the more diverse geographies and mixture of operational regimes incorporated in the former study. We highlight however, that our analysis is retrospective and thus should be interpreted with caution.

DISCUSSION
In this paper we undertake a comprehensive genome-resolved metagenome survey of an activated sludge microbial community from a full-scale, tropical climate wastewater treatment plant, based on a time-series survey design. We obtain a total of just under 9,100 draft genomes, which collapse to around 3,100 non-redundant genomic clusters (defined under a stringent degree of relatedness), and we estimate the latter to capture, ar most, just over 50% of the overall community composition. Around 1000 MAGs were candidates for being considered high quality, based on single-copy marker statistics (referred to pHQ in our analysis) but 58 MAGs formally meet the MIMAG criteria for being high quality draft genomes. In building these MAG catalogs, we undertake a systematic comparison of MAG recovery strategies, based on the use of individual-sample assemblies and two variations on the use of co-assemblies (using the combination of metaSPAdes for performing assemblies and MetaBAT2 for genome recovery). Additionally, we compared these results to those obtained from the use of a recently released deep learning variational autoencoder called VAMB (Nissen et al., 2021), which appears to convey some advantages in relation to control of MAG contamination. As discussed below, these findings carry broader implications for conducting genome-resolved metagenomics on highly complex communities. The genomes recovered at pHQ level in this study represented 11 phyla, captured at a relative frequency of above 1%, with just under half being members of Bacteriodota and Proteobacteria, and represent the most comprehensive catalog obtained from tropical climate activated sludge communities, building on our previous efforts (Law et al., 2016;Arumugam et al., 2019Arumugam et al., , 2021Qiu et al., 2020). Wastewater microbial communities from FIGURE 6 | Relationships between pHQ-MAGs recovered in this study and MAGs from extant activated sludge catalogs. Counts reference secondary clusters categorised by the presence of MAGs originating from one of the three MAG catalogs.
tropical climates are understudied relative to their temperate climate counterparts, as are the bioprocesses that they support. Given the urgent need to understand the impact of climate change at microbial scales of life (Cavicchioli et al., 2019), such communities will become an increasingly important target of study, given their role as mediators of the interface between human and natural ecosystems (McLellan et al., 2015). In the present case, we obtained pHQ or confirmed HQ MAGs for expected taxa conveying key functionality to activated sludge bioprocesses including the AOB Nitrosomonas, NOB Nitrospira, the PAO Candidatus Accumulibacter and the GAO Candidatus Competibacter. Notable unexpected findings included, but were not limited to, the cyanobacterial PAO species Obscuribacter phosphatis and Romboutsia timonensis, a microbe previously identified in the human gut and plausibly an immigrant species from that source.
The large proportion of recovered genomes that hold unremarkable quality is expected, given the recognized challenges of performing these analysis on highly complex microbial communities (Pasolli et al., 2019) and the known complexity of full-scale activated sludge microbial communities, which are estimated to be more complex than the human gut microbiome by around an order of magnitude (Wu et al., 2019). The complexity of the community in regards to taxonomic novelty is also seen in the fact that around 60% of the recovered pHQ hold no assignment at species or genus level, and by the relatively low degree of recapitulation of genomes from other activated sludge catalogs. Consistent with the recognized limitations of MAG analyses conducted from short-read sequence data, the recovered genomes are unlikely to be resolved to strain level, and the size and complexity of the dataset limited the use of a recent genome-bin workflow for strain deconvolution  (data not shown). Nonetheless, the kind of densely sampled longitudinal data collected here is ideally suited for developing such strain-aware genome recovery methods.
As part of this analysis, we have undertaken a comprehensive comparison of individual sample assembly and co-assembly approaches for genome recovery, which has been relatively unexplored in the literature to date. Current thinking on MAG analysis suggests that assembling data from individual samples will aid the recovery of higher quality, relatively abundant genomes, while co-assembly will assist in the recovery of lower abundance genomes with the trade-off of artifacts associated with multi-sample analysis (Pasolli et al., 2019;Hofmeyr et al., 2020), including cross-sample chimeras (Chen et al., 2020), split-bins (Arumugam et al., 2021) and increased probability of recovering pan-genomic level (Chen et al., 2020), although this will no doubt be dependent on, the nature of the co-assembled samples (longitudinal vs. cross-sectional; Pasolli et al., 2019), sample replication number, genetic diversity, community complexity and, of course, sequencing depth. In their comparative analysis on co-assembly and individual assembly of infant and maternal gut microbiomes, Pasolli et al. (2019) found little difference in number or quality of recovered genomes from either method, which included an analysis of both longitudinal and crosssectional sampling designs, concluding that application to longer time-series would likely result in higher MAG yields. The findings of the present study are consistent with that view, with substantially higher numbers of pHQ-level MAGs being recovered from co-assembly procedures. We find some clear indications that co-assembly is advantageous in regards to genome quality, and, at least in the subset of MAGs that are recovered at pHQ level by both approaches, there is clear evidence that co-assembly will provide cognate MAGs with higher completeness and lower contamination statistics, as defined by single copy marker gene analysis: the extent to which this is generalisable to other settings is unclear.
Unexpectedly, we find the two specific modes of co-assembly are each capable of high MAG yields suggesting that greater depth per se, as implemented in the single-BAM approach, will recover almost as many pHQ MAGs (285 vs. 303; Table 1) as the canonical differential coverage approach (multi-BAM). We would caution however, that the performance of the single-BAM co-assembly workflow may be a consequence of the high degree of relatedness between the longitudinally studied samples used in this study. In a setting where this is not the case, the rate of occurrence of artifactual and/or chimeric MAGs formation could be potentially far higher, and we would not recommend its routine application for this reason. Additionally, the computational overheads of co-assembly can be substantial, as seen in the present case, and may be untenable or impractical in some settings. Obviously this would also influence the choice of metagenome assembler, for example, MEGAHIT may be a more suitable choice of assembler than metaSPAdes for datasets at, or above, the scale of data employed here. Interestingly, as applied to this dataset, the deep-learning based VAMB workflow recovered pHQ MAGs that largely recapitulated those from the MetaBAT2 workflows. Collectively, these findings reinforce the view that MAG recovery is highly context-specific in relation to the community under study (Vollmers et al., 2017).
There remains an urgent need for methods to identify noncognate contigs in fractionated assemblies, with the impact of contamination on gene-level becoming more widely recognized (Arkhipova, 2020), and one recently published analysis suggests that up to 15-30% of publicly-available MAGs classified at pHQ level will harbor chimeric content (Orakov et al., 2021). In the present study, we have examined removal of possible contamination using the RefineM workflow (Parks et al., 2017). Our results shed light on the strengths and weaknesses of the different recovery workflows we employed. From the MetaBAT2 workflows, there was a high degree of robustness in the case of recovered genomes that were classified at pHQ level, with over 90% retaining their pHQ status upon the application of RefineM. In the case of draft genomes that held high levels of contamination upon a backbone of high completeness, most also remained within the same genome quality category following de-contamination, suggesting that these recovered sequence constructions are fundamentally flawed. In the case of the bins recovered from VAMB, while around one third of pHQ changed quality level to MQ, there was an underrepresentation of complete genomes initially showing high degrees of contamination, suggesting that VAMB may be quite robust to the formation of chimeras. Whether this is a general property, or a consequence of the high redundant nature of time-series data, is a subject for further study.
Collectively, our results reinforce the ongoing need for analysis procedures suitable for recovering high quality MAGs from metagenome data, also highlighted by recent calls for more careful manual curation of recovered genomes (Chen et al., 2020;Lui et al., 2021) and the use of complementary sequencing, including long read (Arumugam et al., 2021;Singleton et al., 2021), synthetic long read (Bishara et al., 2018) and chromosome confirmation capture methods (DeMaere and Darling, 2019; Bickhart et al., 2022). Another relevant development is the direct use of assembly graphs in MAG recovery, including for the recovery of strain level sequence Mallawaarachchi et al., 2020;Quince et al., 2021). Further attention could also be placed on the use of alternative feature representations for contig sequence and/or coverage data: most methods developed to date have used Euclidean space (of various dimensionality, ranging from two to several hundred), but other representations may hold substantive advantages, for example hyperspherical or hyperbolic embeddings (Ding and Regev, 2021) or related manifold learning methods e.g., as implicit in the use of VAMB (Nissen et al., 2021).

CONCLUSIONS
Using a time-series metagenome survey of activated sludge microbial communities, we evaluate several extant strategies for obtaining MAG catalogs, showing that co-assembly offers clear advantages over single-sample assembly. We obtain a total of just under 9,100 draft genomes, which collapse to around 3,100 non-redundant genomic clusters. One thousand MAGs were candidates for being considered high quality, based on single-copy marker gene occurrence statistics, however only 58 MAG formally meet the MIMAG criteria for being high quality draft genome. More broadly, our findings have a number of broader implications in regards performing genome-resolved metagenomics on highly complex communities, the design and implementation of genome recoverability strategies, MAG decontamination and the search for better binning methodology.

Metagenome Extraction and Sequencing
The field sampling methodology, sample handling, DNA extraction and sequencing methods have been previously described by us Law et al. (2016). At a full-scale operational wastewater treatment plant in Singapore, treating mostly waste of domestic origin, we sampled the aerobic stage of an activated sludge tank known to perform enhanced biological phosphate removal (EBPR). At each sampling event, we obtained multiple samples for DNA extraction from the aerobic treatment tank and collected a panel of relevant physico-chemical measurements (data not analyzed in this paper). Samples were snap frozen in a liquid nitrogen dry shipper immediately upon retrieval from the tank, and transported to the laboratory for subsequent genomic DNA extraction and sequencing on Illumina HiSeq2500 using a read length of 251 bp (paired end) (see Law et al., 2016 for details of all gDNA extraction, library preparation and sequencing protocols).

Genome-Resolved Metagenome Analysis
Unless otherwise stated data analysis was performed in the R Statistical Computing Environment (version 4.0.5) (R Core Team, 2021).

Genome Recovery From Individual Sample Assemblies
From the processed read data, we initially performed individual sample assemblies using SPAdes (Nurk et al., 2017) in -meta mode with maximum k-mer value of 127, and performed metagenome binning using MetaBAT2 version 2.12.1 (default settings) to obtain an initial set of MAGs from each sample. Coverage for each contigs were extracted from SPAdes kmer coverage, converted to log scale, and averaged per bin. To compare estimates of MAG coverage between samples, we normalized MAG coverages by centering using the per-sample mean per-MAG coverage, scaling by the per-sample standard deviation of coverage and then placing back on a positive scale by subtracting the smallest normalized coverage value across the entire set of MAGs.

Genome Recovery From Co-assemblies
Processed read data from the 24 samples were co-assembled with SPAdes-3.13.0 (Nurk et al., 2017) (default parameters except -meta -m 2,900 -k 21, 33, 55, 77, 99, 127 -t 50). Binning was performed on contigs over 2,500 bp in length with MetaBAT2 (Kang et al., 2019) (version 2.12.1 with default parameters except -d -t 40 -m 2,500 -v), employing two different approaches, namely: 1) using contigs from the co-assembly and 24 sorted. bam files made by aligning reads from each of the 24 datasets to the contigs from coassembly, referred to as the multi-BAM co-assembly and 2) using contigs from the co-assembly and a single sorted bam file made by aligning all reads from the 24 datasets to the contigs from co-assembly (referred to as a single-BAM co-assembly).

Genome Recovery Using Deep Variational Autoencoder
The recently published metagenome binner VAMB was employed on the 24 individual sample assemblies, following the described procedure in Nissen et al. (2021). Briefly, all assembled contigs with minimum length of 2.500 bp were compiled into a FASTA catalog (-m 2,500 -nozip). Processed read data from each of the 24 samples were mapped to this catalog using Bowtie2 (version 2.3.4.3) (Langmead and Salzberg, 2012) and Samtools (version 1.9) (Li et al., 2009), with read depth being calculated using the MetaBAT2 script jgi_summarize_bam_contig_depths; default settings). We then ran VAMB (version 3.0.2) on the catalog and read depth data using default parameters except for minimum total sequence length set at 200 kb (-o C as the sample separator and -minfasta 200,000).

Identification of 16S SSU-rRNA Genes From Metagenome Assemblies
All contigs from the each of the 24 single assemblies and those from the coassembly contigs were analyzed for the presence of the 16S SSU-rRNA gene. Contigs was indexed with Samtools faidx (default settings) (Li et al., 2009) and analyzed with Barrnap (https://github.com/tseemann/barrnap; version 0.9), both running default settings. Annotation as full length or partial sequence was made using the -lencutoff flag, which was set to 0.8. All detected16S SSU-rRNA sequence were then annotated against SILVA version 138.1 (Quast et al., 2013) with the SINA (version 1.7.1) (Pruesse et al., 2012) with a minimum similarity of 0.95. Sets of 16S SSU-rRNA sequences were clustered using CD-HIT (Fu et al., 2012) using default options except for parameters below: sequence identity cutoff (-s) of 0.95 and minimal length similarity (fraction) of 0.6.

Genome Quality Estimates
Genome quality estimation of all all bins obtained from all four different pipelines (individual sample assemblies, single-BAM coassembly, multi-BAM co-assembly and VAMB) was performed by running the CheckM (version 1.0.13) (Parks et al., 2015) lineage_wf workflow using default parameters (except -t 20 -x fa or -t 20 -x fna for VAMB bins). The output was then tabulated with the CheckM qa command using 20 threads (-t 20). MAG quality was then classified using the MIMAG criteria (Bowers et al., 2017) with modifications as follows: 1) MAGs with CheckM completeness (C p ) and CheckM contamination (C n ) values >90 and <5, respectively, were classified as candidates for being high quality (pHQ) genomes bins; 2) MAGs with C p ≥ 50 and C n < 10 were categorized as being of putatively medium quality (MQ); 3) MAGs with C p < 50 and C n < 10 were classified as candidate low quality (LQ) and 4) MAGs that did not fall into any of the above three categories were unclassified (UC). The N50 value (N 50 ) for each MAG was calculated using QUAST version 5.0.0 (Gurevich et al., 2013) with flags -mgm -rna-finding -min-contig 1 -max-ref-number 0. for each MAG, we computed an overall (univariate) quality statistic, Q d as defined by within the dRep workflow (Olm et al., 2017), defined as Q d = C p − 5C n + C n S h 100 + 0.5 log N 50 . MAGs defined as pHQ under the MIMAG criteria were further screened for the presence of tRNAs (minimum of 18) and a complete rRNA operon (defined as the presence of at least one copy of each of the 5, 16, and 23S SSU-rRNA genes, irrespective of whether they were harbored on a single contig or not), and if present were denoted as high quality (HQ) MAGs.

Genome De-replication Procedures
We identified putative sets of cognate genomes using the dRep (version 2.2.3) (Olm et al., 2017) compare workflow executed with default settings with 20 threads (-p 20). Four dereplication analyses were performed; 1) dereplication of the complete set of MAGs; 2) de-replication of the set of 9079 MAGs combining those identified in 1) with the additional MAGs recovered from using VAMB (Supplementary Table 7); 3) de-replication of the set of 142 HQ MAGs from our analyses combined with the set of 3139 MAGs available from previously published MAG analyses of activated sludge communities (see below) and 4) The entire set of MAGs from 2) combined with the 3139 MAGs from references in dRep compare workflow with the same parameters (-p 80 -S_algorithm fastANI -multiround_primary_clustering -sa 0.95 -nc 0.3) as used in a comparable recent de-replication analysis of rumen MAG catalogs (Watson, 2021).

MAG Refinement
We performed decontamination of MAG sequences using RefineM (version 0.0.24) (Parks et al., 2017). Briefly, tetranucleotide signature and coverage profiles for contigs were calculated using scaffold_stats workflow. Contigs with divergent genomic properties were identified with outliers (default settings) and removed using the filter_bins workflow. Genes were predicted using the call_genes workflow and annotated with DIAMOND (Buchfink et al., 2015) against the gtdb_r95_protein_db.2020-07-30.dmnd and gtdb_r95_taxonomy.2020-07-30.tsv databases, within the taxon_profile workflow. Contigs with divergent taxonomic assignments were then identified with taxon_filter and removed with filter_bins. After decontamination, genome quality was reanalysed using CheckM, as described above, and bins reclassified if indicated.

Publicly Available MAG Catalogs
We obtained the following MAG sequence data from the following published studies: 1) a set of 1083 MAGs based on long read metagenome data obtained from 23 wastewater treatment plants in Denmark (Singleton et al., 2021); 2) a set of 2045 MAGs recovered from meta-analysis of Ye et al. (2020), which includes WWTP samples from several locations in China (data collected by the authors of Ye et al., 2020), Singapore (data from Law et al., 2016), Denmark (data from Munck et al., 2015), USA (data from Chu et al., 2018), Argentina (data from Ibarbalz et al., 2016), Slovenia (data from McIlroy et al., 2016) and Switzerland (data from Ju et al., 2019); 3) one MAG sequence available from NCBI submitted from the time-series metagenome survey of a full-scale activated sludge community in Argentina (Buenos Aires) (Pérez et al., 2019); and 4) ten MAG sequences available from NCBI from the metagenome survey of three conventional WWTPs in Taiwan inoculated with exogenous anammox pellets (Yang et al., 2020). We re-estimated the genome quality of all MAGs using the CheckM based approach described above.

Data Visualization
We constructed unrooted phylograms from MAG sequence data using GTDB-Tk (version 0.3.2) based on bacterial single-copy gene sets (bac120_ms gene sets) and imported the .tree file into R using the read.tree function from ggtree package (version 2.4.2) (Yu et al., 2017) and subsequently rendered using the ggtree function. Venn diagrams were constructed using the R package VennDiagram (version 1.6.0) (Chen and Boutros, 2011).

DATA AVAILABILITY STATEMENT
The raw data and high quality MAG sequences analysed in this study are available NCBI (BioProject Accession PRJNA731554), and key data products, including metagenome assemblies and the complete set of recovered MAG sequence data, have been made publicly available on Zenodo (DOI 10.5281/zenodo.5215738). Data results relating to the composition of secondary cluster analyses from each comparative analysis are available from Supplementary Data Files 3, 4, 7, and 8.

AUTHOR CONTRIBUTIONS
The study was designed by YL, SW, and RW. YL and RW designed field sampling procedures, which was conducted by LL and led by YL. LL and TN performed gDNA extractions. DD-M and SS advised on gDNA extraction procedures and protocols and generated short read sequencing data. MH, KA, and RW designed and performed data analysis. The manuscript was primarily written by RW and MH with specific sections contributed by other authors. All authors contributed to data interpretation.

ACKNOWLEDGMENTS
We thank colleagues from the Public Utilities Board (Republic of Singapore) for access to facilities and biomass which permitted this work to be undertaken.

SUPPLEMENTARY MATERIAL
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmicb.

2022.869135/full#supplementary-material
Supplementary Table 1 | Summary of sequencing read data from each of the 24 samples analysed in this study.
Supplementary Table 2 | Assembly and binning statistics of each of the 24 individual assemblies and the two co-assemblies (constructed with SPAdes and MetaBAT2).
Supplementary Table 3 | Summary statistics for the 2,912 secondary (non-redundant) clusters from MAGs recovered by 24 individual assemblies and two co-assemblies (constructed with SPAdes and MetaBAT2).
Supplementary Table 4 | Summary data for all 7,138 MAGs recovered from 24 individual assemblies and two co-assemblies (constructed with SPAdes and MetaBAT2).
Supplementary Table 5 | MAGs from 24 individual assemblies and two co-assemblies that possessed completeness of more than 90%.