Genomic Insights of Multidrug-Resistant Escherichia coli From Wastewater Sources and Their Association With Clinical Pathogens in South Africa

There is limited information on the comparative genomic diversity of antibiotic-resistant Escherichia coli from wastewater. We sought to characterize environmental E. coli isolates belonging to various pathotypes obtained from a wastewater treatment plant (WWTP) and its receiving waters using whole-genome sequencing (WGS) and an array of bioinformatics tools to elucidate the resistomes, virulomes, mobilomes, clonality, and phylogenies. Twelve multidrug-resistant (MDR) diarrheagenic E. coli isolates obtained from the final effluent of a WWTP and from the receiving river upstream and downstream of the WWTP were sequenced on an Illumina MiSeq machine. The multilocus sequence typing (MLST) analysis revealed multiple sequence types (STs), the most common of which were ST69 (n = 4) and ST10 (n = 2), followed by singletons belonging to ST372, ST101, ST569, ST218, and ST200. One isolate was assigned to a novel ST, ST11351. A total of 66.7% of the isolates were positive for β-lactamase genes, with 58.3% harboring the blaTEM-1B gene and a single isolate harboring the blaCTX-M-14 and blaCTX-M-55 extended-spectrum β-lactamase (ESBL) genes. One isolate was positive for the mcr-9 mobilized colistin resistance gene. Most antibiotic resistance genes (ARGs) were associated with mobile genetic support: class 1 integrons (In22, In54, In191, and In369), insertion sequences (ISs), and/or transposons (Tn402 or Tn21). A total of 31 virulence genes were identified across the study isolates, including those responsible for adhesion (lpfA, iha, and aggR), immunity (air, gad, and iss), and toxins (senB, vat, astA, and sat). The virulence genes were mostly associated with IS (IS1, IS3, IS91, IS66, IS630, and IS481) or prophages. Co-resistance to heavy metals/biocides and antibiotics was evident in several isolates. The phylogenomic analysis with South African E. coli isolates from different sources (animals, birds, and humans) revealed that isolates from this study mostly clustered with clinical isolates. Phylogenetics linked with metadata revealed that isolates did not cluster according to source but according to ST. The occurrence of pathogenic and MDR isolates in the WWTP effluent and the associated river is a public health concern.


INTRODUCTION
The role of the environment in the spread of antibiotic resistance is an evolving issue (1). Wastewater treatment plants (WWTPs) have received a lot of attention because of the central role they play in reducing pollutant loads that include antibiotic-resistant bacteria (ARB), antibiotic resistance genes (ARGs), virulence genes, and their associated mobile genetic elements to acceptable limits before the discharge of treated effluent into receiving water bodies.
With inadequately maintained sanitation infrastructure, low- and middle-income countries (LMICs) and emerging economies like South Africa face challenges with the release of untreated or poorly treated effluent into the environment, which may be a driver for the dissemination of antibiotic resistance in these settings (2). Constant monitoring of WWTPs for the release of multidrug-resistant (MDR) bacteria into receiving waters via their effluents is important as it indicates what is disseminated to the environment.
The WWTP investigated in this study is the largest in Pietermaritzburg, the provincial capital of KwaZulu-Natal in South Africa. Runoff from this WWTP is released into the Msunduzi River, a tributary that ultimately discharges into the Umgeni River (3). Upstream of the WWTP, the Msunduzi River receives runoff from rural communities, agricultural areas, urban municipalities (including several hospitals and community health centers), and numerous informal settlements along the river (4). The surface water is a key water source for domestic, agricultural, and recreational purposes to inhabitants of the several informal settlements along its banks. The river water has previously been considered to be polluted with fecal matter and unsuitable for anthropogenic activities (4).
Diarrheagenic Escherichia coli pathotypes are a public health concern (5). Pathogenic MDR E. coli that affect humans and animals have been reported in the water environment (6)(7)(8). However, studies that employ sequencing technologies to investigate environmental E. coli or any other bacteria are rare in Africa, including in South Africa. Consequently, there is little information regarding environmental isolates and their association with other isolates from clinical and agricultural sources. We sought to compare the genomics of MDR environmental E. coli isolates belonging to various pathotypes obtained from a WWTP and its receiving waters, using whole-genome sequencing (WGS) and bioinformatics tools to examine their lineages, resistomes, virulomes, mobilomes, clonality, and phylogenies and to determine associations/correlations with clinical, animal, and environmental isolates.

Ethical Consideration
Ethical approval was received from the Biomedical Research Ethics Committee (Reference: BCA444/16) of the University of KwaZulu-Natal. Permission to collect water samples was sought and granted by Umgeni Water, which owns and operates the investigated WWTP.

Study Site and Sample Description
A longitudinal antibiotic resistance surveillance study was undertaken in the uMgungundlovu District, one of 11 districts in the coastal province of KwaZulu-Natal, South Africa. Water samples were collected fortnightly for 7 months from May to November 2018 at the largest urban WWTP in the district. Manual grab water samples were collected in sterile 500-ml containers according to Kalkhajeh et al. (9), upstream (

Bacterial Identification
A total of 580 E. coli isolates were putatively identified during enumeration using the Colilert®-18 Quanti-Tray® 2000 system, followed by phenotypic confirmation on eosin methylene blue (EMB) agar. Briefly, before analysis, bottles containing the water samples were thoroughly mixed and then serially diluted using 10-fold dilutions. Samples from upstream and downstream river water as well as final effluent were diluted 1 ml in 100 ml (0.01 dilution) using sterile water. The influent samples were diluted 0.05 ml in 100 ml (0.0005 dilution) using sterile water. The 100 ml of each diluted sample was then analyzed using the Colilert®-18 Quanti-Tray® 2000 System (IDEXX Laboratories (Pty) Ltd., Johannesburg, South Africa). E. coli was obtained from positive Quanti-Trays, subcultured on EMB (Merck, Darmstadt, Germany), and incubated at 37°C for 18-24 h. At least 10 distinct colonies representing each sampling site were randomly selected from the EMB and further subcultured onto the same medium to obtain pure colonies. Molecular confirmation of the selected E. coli isolates was accomplished using real-time PCR targeting the uidA (β-D-glucuronidase) gene, as was the delineation of E. coli into various diarrheagenic pathotypes [i.e., enterohemorrhagic E. coli (EHEC), enteropathogenic E. coli (EPEC), enteroaggregative E. coli (EAEC), enterotoxigenic E. coli (ETEC), and enteroinvasive E. coli (EIEC)]. All reactions included a no-template control consisting of the reaction mixture. The real-time PCR protocol was done according to Mbanga et al. (10). The primers, virulence genes, and reference strains used to determine pathotypes are shown in Supplementary Table 1. The WGS study sample consisted of a subset of 12 MDR diarrheagenic isolates obtained from the upstream, downstream, and effluent sites over the study period. The selection of isolates was based on their antibiograms and pathotypes.
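The dilution arithmetic above can be checked with a short calculation. The following is a minimal sketch, not part of the study protocol: the function names and the example count of 50 are hypothetical, and the back-calculation simply divides a count read from the diluted 100-ml sample by the dilution factor.

```python
def dilution_factor(sample_ml: float, final_ml: float = 100.0) -> float:
    """Fraction of the original sample present in the diluted volume."""
    return sample_ml / final_ml


def undiluted_count(count_per_100ml_diluted: float, sample_ml: float,
                    final_ml: float = 100.0) -> float:
    """Scale a count read from the diluted sample back to the original water."""
    return count_per_100ml_diluted / dilution_factor(sample_ml, final_ml)


# Dilutions stated in the text: 1 ml in 100 ml and 0.05 ml in 100 ml.
river_factor = dilution_factor(1.0)
influent_factor = dilution_factor(0.05)

# Hypothetical reading of 50 per 100 ml from a diluted river sample.
river_original = undiluted_count(50.0, 1.0)

print(river_factor, influent_factor, river_original)
```

The two factors reproduce the 0.01 and 0.0005 dilutions given in the text; any count read from a tray would need the same correction before sites are compared.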

Whole-Genome Sequencing and Bioinformatic Analysis
The genomic DNA was extracted from the E. coli isolates using the GenElute Bacterial Genomic DNA Kit (Sigma Aldrich, St. Louis, MO, USA) following the instructions of the manufacturer before quantification using the 260/280 nm absorbance ratio on a Nanodrop 8000 (Thermo Fisher Scientific, Waltham, MA, USA). Library preparation was done using the Nextera XT DNA Library Preparation Kit (Illumina, San Diego, CA, USA) followed by WGS using an Illumina MiSeq machine (Illumina, USA). Quality trimming of raw reads was done using Sickle v1.33 (https://github.com/najoshi/sickle). The reads were then assembled de novo using the SPAdes v3.6.2 assembler (https://cab.spbu.ru/software/spades/). All contiguous sequences were subsequently submitted to GenBank and assigned accession numbers (Supplementary Table 2) under BioProject PRJNA609073.
Mutations conferring resistance to fluoroquinolones were determined from the assembled genomes using BLASTN (https://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE_TYPE=BlastSearch). Briefly, the DNA gyrase (gyrA and gyrB) and DNA topoisomerase IV (parC and parE) genes of the reference strain E. coli ATCC 25922 (Accession number: CP009072) were aligned with the genomes of this study using BLASTN. The mutations in the genomes of the isolates of this study were manually curated and tabulated.
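The comparison against a reference can be illustrated with a small script. This is a minimal sketch under stated assumptions: the study used BLASTN alignments followed by manual curation, whereas the hypothetical helper below takes two already aligned protein sequences of equal length and lists substitutions in the usual reference-position-variant notation. The sequences are toy strings, not real GyrA or ParC sequences.

```python
def list_substitutions(ref_aligned: str, query_aligned: str) -> list[str]:
    """Report substitutions between two aligned protein sequences.

    Positions are numbered along the reference, so gap characters ('-')
    in the reference do not advance the position counter.
    """
    if len(ref_aligned) != len(query_aligned):
        raise ValueError("aligned sequences must have the same length")

    substitutions = []
    ref_position = 0
    for ref_residue, query_residue in zip(ref_aligned, query_aligned):
        if ref_residue == "-":
            continue  # insertion in the query; no reference coordinate
        ref_position += 1
        if query_residue == "-":
            continue  # deletion in the query; not a substitution
        if ref_residue != query_residue:
            substitutions.append(f"{ref_residue}{ref_position}{query_residue}")
    return substitutions


# Toy example only: one residue differs at reference position 6.
toy_reference = "MSDLAREITP"
toy_query = "MSDLAKEITP"
found = list_substitutions(toy_reference, toy_query)
print(found)
```

A script of this kind only flags candidate differences; deciding whether a substitution is a known resistance determinant would still require curation against the literature, as was done manually in the study.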

Phylogenomic Analyses of the E. coli Isolates (n = 12) and Isolates From South Africa
Whole-genome sequences of all isolates were uploaded and analyzed on the CSI Phylogeny 1.4 pipeline (https://cge.cbs.dtu.dk/services/CSIPhylogeny/). CSI Phylogeny recognizes, screens, and validates the location of single-nucleotide polymorphisms (SNPs) before deducing a phylogeny founded on the concatenated alignment of the high-quality SNPs. Selection of SNPs was based on the default parameters of CSI Phylogeny, which included a minimum distance of 10 bp between each SNP, a minimum depth of 10% of the average depth, a mapping quality above 25, and an SNP quality above 30; all insertions and deletions (INDELs) were excluded. The Morganella morganii subsp. morganii KT genome (Accession number: CP004345.1) served as the outgroup to root the tree, enabling the easy configuration of the phylogenetic distance between the isolates on the branches. The phylogeny was visualized with annotations for isolate information and in-silico typing (ST) metadata using Phandango (https://jameshadfield.github.io/phandango/#/main) to provide insights into the generated tree.
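The SNP selection criteria listed above can be expressed as a simple filter. The sketch below is illustrative only and is not the CSI Phylogeny implementation: the `Snp` record and `filter_snps` function are hypothetical, and the 10-bp spacing rule is implemented under the assumption that a SNP is dropped when it lies closer than 10 bp to the previously retained SNP.

```python
from dataclasses import dataclass


@dataclass
class Snp:
    position: int           # coordinate on the reference genome
    depth: float            # read depth at the site
    mapping_quality: float
    snp_quality: float


def filter_snps(snps, average_depth, min_distance=10, min_relative_depth=0.10,
                min_mapping_quality=25, min_snp_quality=30):
    """Keep SNPs meeting the depth, quality, and spacing thresholds in the text."""
    passing = [
        s for s in sorted(snps, key=lambda s: s.position)
        if s.depth >= min_relative_depth * average_depth
        and s.mapping_quality > min_mapping_quality
        and s.snp_quality > min_snp_quality
    ]
    kept = []
    for snp in passing:
        # Assumed pruning rule: enforce spacing relative to the last kept SNP.
        if not kept or snp.position - kept[-1].position >= min_distance:
            kept.append(snp)
    return kept


# Toy calls (invented values) against an average depth of 50x.
toy_calls = [
    Snp(100, 48, 60, 200),  # passes every threshold
    Snp(105, 50, 60, 200),  # only 5 bp from the previous kept SNP
    Snp(200, 45, 20, 200),  # mapping quality not above 25
    Snp(300, 2, 60, 200),   # depth below 10% of the average
    Snp(400, 52, 60, 90),   # passes every threshold
]
kept_positions = [s.position for s in filter_snps(toy_calls, average_depth=50)]
print(kept_positions)
```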
Additionally, whole-genome sequences of E. coli isolates from South Africa curated at the PATRIC website (https://www.patricbrc.org/) were downloaded and used alongside the isolates of this study for the whole-genome phylogeny analysis to ensure a current epidemiological and evolutionary analysis (Dataset 1). The generated phylogenetic trees were visualized, annotated, and edited using iTOL (https://itol.embl.de/) and Figtree (http://tree.bio.ed.ac.uk/software/figtree/). Isolates of the same host (human, animal, or bird) or from the environment were highlighted with the same color.

Isolate Characteristics
The 12 E. coli isolates investigated in this study were obtained from the WWTP and its associated waters. Seven isolates were from the downstream site, four were from the upstream site, and one isolate was obtained from the final effluent. The isolates belonged to the diarrheagenic group of E. coli; seven were EAEC, three were EIEC, with one EHEC, and one EPEC (Supplementary Table 2).

Antibiotic Susceptibility
The 12 isolates had varying phenotypic resistance patterns, with most being resistant to AMP (83.3%), SXT (75%), and TET (66.7%) (Table 1). Some isolates had the same resistance profiles but were isolated at different times (months) from different sampling points. The resistance profiles AMP-TET-NAL-SXT, TET-NAL-CIP-SXT, and FOX-AMP-AMC-TET-LEX were common to isolates obtained from the downstream and upstream sites of the WWTP. Two isolates, one from downstream and one from the final effluent, had the resistance profile AMP-AMC-SXT (Table 1). The remaining four isolates had unique resistance profiles (AMP-TET-AZM-SXT, AMP-TET-SXT, FOX-AMP-AMC, and AMP-AMC-LEX-CTX-CAZ-CRO-FEP-SXT) but were all resistant to AMP.

Genome Characteristics
The genomic characteristics of the E. coli sequences are presented in Supplementary Table 2. The total assembled genome size ranged from 4.7 to 6.1 Mb, and the GC content ranged from 50.4 to 51.2%. The N50, L50, and total number of contigs are also shown in Supplementary Table 2.
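For readers unfamiliar with the assembly statistics named above, the following minimal sketch shows how N50, L50, and GC content are conventionally computed from a set of contigs. The function names and the toy contig lengths and sequences are hypothetical and are not taken from Supplementary Table 2.

```python
def n50_l50(contig_lengths):
    """Return (N50, L50): the length of the contig at which the running total
    of the longest contigs first reaches half the assembly size, and the
    number of contigs needed to reach it."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("no contigs supplied")
    half_total = sum(lengths) / 2
    running_total = 0
    for count, length in enumerate(lengths, start=1):
        running_total += length
        if running_total >= half_total:
            return length, count


def gc_percent(sequences):
    """Percentage of G and C bases across all supplied sequences."""
    total_bases = sum(len(seq) for seq in sequences)
    gc_bases = sum(seq.upper().count("G") + seq.upper().count("C")
                   for seq in sequences)
    return 100.0 * gc_bases / total_bases


# Toy assembly: five contigs totalling 300 bp.
toy_n50, toy_l50 = n50_l50([100, 80, 60, 40, 20])
toy_gc = gc_percent(["GGCC", "ATAT"])
print(toy_n50, toy_l50, toy_gc)
```

In the toy assembly, the two longest contigs (100 + 80 bp) are the first to cover half of the 300-bp total, so N50 is 80 and L50 is 2.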
The ARGs were mostly co-carried on class 1 integrons or associated with insertion sequences and/or transposons (Table 3). The blaTEM-1B gene was commonly associated with a recombinase, and IS91 was the most common insertion sequence. IS91 was also associated with aminoglycoside, trimethoprim, and sulfonamide resistance genes. The insertion sequences IS5 and IS6 were also found associated with the blaCTX-M-55 and mcr-9 genes, respectively. Tn3 transposons occurred either independently or with class 1 integrons (Table 3). The resistance genes and MGEs in the E. coli isolates were closely related (98-100% similarity) to target sequences in the GenBank database. Most hits were for plasmids, with the most common being the E. coli EcPF40 plasmid p1 (CP054215.1). The rich diversity of ISs and transposons attests to the plasticity of the bacterial genomes and horizontal gene transfer (HGT) of ARGs within and between different isolates.
The co-carriage of heavy metal (mercury and chromate) resistance genes, disinfectant (quaternary ammonium compound) resistance genes, and ARGs was evident in several isolates. The mercury resistance operon was found associated with a transposase, the tetracycline resistance transcriptional repressor tetR(A), and the tetracycline resistance gene tet(A) in two isolates from the upstream (U40) and downstream (D18) sites of the WWTP (Table 3). The class 1 integrons in isolates D47 (downstream site) and E13 (effluent site) had PadR (a transcriptional regulator) and chrA (chromate transport protein) downstream of and adjacent to the integrons (Table 3). The synteny of heavy metal, disinfectant, and antibiotic resistance genes in these isolates consisted of chrA (chromate resistance), the qacEΔ1 gene (a disinfectant resistance gene), and the class 1 integron ARG cassette (Table 3).
Phylogenetics linked with metadata revealed that isolates did not cluster according to source but according to ST (Figure 2). There was a clear association between the presence of the sulfonamide sul2 and the aminoglycoside aph-genes, and the sul1 gene and trimethoprim dfrA genes. The ESBL positive isolates had more resistance genes than the ESBL negative isolates; the number of resistance genes was not linked to the pathotype or clonality. Isolate D69 (ST218) was the only ESBL-negative EAEC isolate and had only one macrolide resistance gene (mdfA), while the other ESBL-positive EAEC isolates had more (Figure 2).
A total of 45 IS families were detected across the isolates (Supplementary Table 3). There was a great diversity of IS families, with only two occurring more than once.
A total of 19 intact prophages were found across all the investigated isolates (Supplementary Table 3). The Entero_mEp460 and Shigel_sfII were the most common prophages occurring in four different isolates each. The Entero_PsP3 (n = 3), Salmon_Fels_2 (n = 3), and Entero_fAA91_ss (n = 2) also occurred in several isolates. None of the prophages carried ARGs; however, some prophages carried virulence genes ( Table 4). The abundance of ISs and prophages in environmental E. coli isolates is evidence of a very flexible genome that is constantly gaining and losing genetic elements through mobilizable regions of the genome.

Virulome and Serotypes
A total of 31 virulence genes were identified across all isolates (Table 1). Isolates obtained from downstream of the WWTP had the most virulence genes, including D77 (22 virulence genes), followed by D96 (10) and D69 (9) (Supplementary Figure 1). All isolates had at least one virulence gene, with isolates U69 and U88 (from the upstream site) having only one virulence gene each. The most common virulence genes were those encoding immunity, gad (11 isolates), iss (eight isolates), and air (four isolates), and adhesion, lpfA (seven isolates) and eilA (five isolates) (Supplementary Figure 1).
The virulence genes were mostly associated with several insertion sequences, including IS1, IS3, IS91, IS66, IS630, and IS481, suggesting that insertion sequences play a prominent role in transferring virulence genes in environmental isolates ( Table 4). The vacuolating autotransporter protein (vat) gene encoding a cytotoxin was mostly found with a transposase and the insertion sequence IS1. The senB gene, which encodes an enterotoxin, was mostly associated with IS91. The insertion sequences IS3 (aafA, B, C, D, capU) and IS66 (nfaE, iha, pet) were associated with different virulence genes. The increased serum survival (iss) gene was bracketed by several prophage genes, including the RzoD (outer membrane lipoprotein), RrrD (lysozyme), and EssD (lysis protein), implying that it is carried on a prophage ( Table 4). Most of the virulence genes and their associated MGEs were similar (98-100%) to target sequences in GenBank, with the most hits being for chromosomal sequences. This indicates that E. coli virulence genes may mostly be carried on chromosomes.
The somatic (O) and flagellar (H) antigens were used for serotyping environmental E. coli isolates where nine different O antigens and 11 different H types were identified across all isolates. No O type was detected for isolate D18, which only had the H31 antigen (Supplementary Table 2). The complexity and diversity of the virulome coupled with the range of identified capsule types are worrying as they are associated with virulence. Environmental isolates have a rich repertoire of virulence genes mobilized mostly by ISs contributing to the dynamic milieu of resistance, virulence, and MGEs in the water environment.

Sequence Types and Phylogenomic Relationships
The MLST analysis revealed that the E. coli isolates belonged to multiple STs. The most common ST was ST69 (n = 4), followed by ST10 (n = 2); the remaining isolates had unique STs (ST372, ST101, ST569, ST218, and ST200) (Table 1). Isolate D106 was assigned a novel ST, ST11351. The phylogenetic analysis combined with metadata revealed that isolates of the same ST clustered together [e.g., isolates from downstream (D47 and D96) and upstream (U40 and U117) sites that all belonged to ST69 (Figure 1)]. However, it was interesting to note that, within this cluster, the isolates also grouped according to the isolation site, with the downstream isolates (D47 and D96) forming their own subclade. Some single STs from the downstream and effluent sites also clustered together, including D18 (ST372) and E13 (ST569), and also D64 (ST101) and D77 (ST200) (Figure 1). Compared with South African E. coli isolates from different sources (animals, birds, and humans), the isolates from this study mostly clustered with clinical isolates (Figure 2). Isolates U40 (ST69), D96 (ST69), U117 (ST69), and D106 (ST11351) clustered together and were closely related to a clinical isolate (ST648) obtained from a blood sample in Pretoria Hospital. Isolates E13 (ST569) and D18 (ST372) clustered together and with other clinical isolates (ST998) obtained from urine samples from hospital patients in Pretoria. D77 (ST200) and U69 (ST10) also clustered with clinical isolates obtained from hospital patients in the Western Cape and Pretoria, respectively, albeit in different clades. Isolate D64 (ST101) was closely related to an isolate from a wild bird obtained from Durban (Figure 2). The remaining three isolates, D69 (ST218), U88 (ST10), and D47 (ST69), were more closely related to each other, did not cluster with any isolates from animals, birds, or humans, and may be considered a unique aquatic lineage.
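The ST distribution described above can be tallied directly from the isolate-to-ST assignments given in the text. The sketch below is illustrative only: the dictionary is transcribed from this section, and the reading of the isolate name prefixes (D, downstream; U, upstream; E, effluent) is an assumption based on how the isolates are described in the Results.

```python
from collections import Counter

# Isolate-to-ST assignments as stated in this section.
ISOLATE_ST = {
    "D47": "ST69", "D96": "ST69", "U40": "ST69", "U117": "ST69",
    "U69": "ST10", "U88": "ST10",
    "D18": "ST372", "D64": "ST101", "E13": "ST569",
    "D69": "ST218", "D77": "ST200", "D106": "ST11351",
}

# Assumed meaning of the name prefixes.
SITE_BY_PREFIX = {"D": "downstream", "U": "upstream", "E": "effluent"}

st_counts = Counter(ISOLATE_ST.values())
site_counts = Counter(SITE_BY_PREFIX[name[0]] for name in ISOLATE_ST)
singleton_sts = sorted(st for st, n in st_counts.items() if n == 1)

print(st_counts.most_common(2))
print(site_counts)
print(singleton_sts)
```

The tallies agree with the counts reported in the text: four ST69 and two ST10 isolates, six singleton STs including the novel ST11351, and seven downstream, four upstream, and one effluent isolate.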

DISCUSSION
Genomic insights reported in this study revealed the complexity and diversity of lineages, resistome, mobilome, and virulome of MDR E. coli found in wastewater and river water in KwaZulu-Natal, South Africa, intimating that the aquatic environment contains a fluid and dynamic milieu of ARB and ARGs. The ARGs were mostly carried on plasmids, transposable elements, and integrons, and fewer were associated with IS. The virulence genes were mostly associated with IS, which are probably central in their rearrangement and transfer. The co-occurrence of heavy metal resistance genes, disinfectant resistance genes, and ARGs in bacterial isolates is a cause for concern as it may lead to co-selection of ARB. An assortment of ARGs and MGEs was detected among and within the sampled sites (Table 1). The variation in the ARGs and associated MGEs may reflect numerous, distinct horizontal transfer events among environmental isolates. The occurrence of ARGs, most notably the ESBL, tetracycline, sulfonamide, and macrolide genes, was not dependent on the sample source or clonal type. This contrasts with a study done in the USA that investigated ESBL- and Klebsiella pneumoniae carbapenemase (KPC)-producing E. coli from municipal wastewater, surface water, and a WWTP. In that study, WGS of E. coli isolates revealed an association between the sample source and the presence of specific ESBL genes (e.g., blaTEM was unique to municipal wastewater isolates, whereas blaCTX-M was unique to WWTP raw influent isolates) (13). Most ARGs and associated MGEs were carried on plasmids (Table 3), signifying that plasmids play a central part in the resistome of environmental E. coli isolates. A few ARGs, including those encoding resistance to tetracycline [tet(A), tet(B), tet(C)], sulfonamides (sul1), and trimethoprim (dfrA7) (carried on a class 1 integron), were found on chromosomes. However, the integrons and transposons were largely associated with ARGs on plasmids, similar to findings in other studies (14). Che et al. (14) used WGS to investigate the ARGs in total DNA extracted from water samples from three WWTPs in Hong Kong and reported that ARGs carried on plasmids were dominant in the resistome of the WWTPs.
In this study, the investigated ARGs were mainly bracketed by transposons, insertion sequences, and class 1 integrons. A novel isolate, D106 (ST11351), had the blaCTX-M-14 and blaCTX-M-55 genes (Table 1); such genes have been reported previously (16) and have also been reported in several studies on clinical isolates in South Africa (17)(18)(19). However, there are no reports on the occurrence of blaCTX-M-55 in environmental E. coli in South Africa. The blaTEM-1B gene was found in the same genetic context, IS91:blaTEM-1B:recombinase, in isolates from the upstream (U40), effluent (E13), and downstream sites (D18 and D77) (Table 4) and had high sequence similarity to E. coli EcPF40 plasmid p1 (CP054215.1). The blaTEM genes are often plasmid mediated and are the leading cause of AMP resistance in Gram-negative bacteria (20). IS91 can mobilize adjacent sequences through a one-ended transposition process, and the association with p1 plasmids points to a plasmid-mediated circulation of these genes in the water environment (21,22).
An interesting finding in this study was the occurrence of the plasmid-borne mcr-9 gene in isolate D96 (Table 1), which also had ESBL, macrolide, and tetracycline resistance genes. Phenotypic resistance to colistin was then determined using the colistin MIC, and the isolate was found to be susceptible (<4 mg/L). Phenotypic susceptibility to colistin in isolates carrying the mcr-9 gene was also reported in a study conducted in the USA, in which all 100 mcr-9-positive Salmonella enterica and E. coli isolates from the National Antimicrobial Resistance Monitoring System (NARMS), which samples retail meat, were susceptible to colistin, suggesting that the mcr-9 gene may not be associated with colistin resistance (23). In this study, the mcr-9 gene was found adjacent to a cupin fold metalloprotein gene of unknown function (WbuC) in the genetic context mcr-9:WbuC:IS6 (IS26). Similar genetic contexts have been reported in Enterobacter hormaechei and Salmonella Typhimurium isolates from studies undertaken in the USA and China (24)(25)(26). This points to a stable mcr-9 locus, whose transfer is mediated by insertion sequences. The mcr-9 gene was first reported in a clinical Salmonella Typhimurium isolate in the USA in May 2019 (24). Isolates harboring mcr-9 were subsequently identified in 21 countries covering six continents (Europe, Asia, North America, Oceania, South America, and Africa) (27). An mcr-9-harboring Enterobacter hormaechei isolate obtained from the sputum of a patient in Cairo, Egypt, is to date the only mcr-9-positive isolate reported in Africa (28). Thus, this is the second report of an mcr-9-positive isolate in Africa and the first in Southern Africa.

FIGURE 2 | Circular phylogenomic tree with color annotations depicting the relationship between E. coli isolates from this study and South African isolates from diverse sources in the one-health continuum. The strains from this study (environmental source, colored in purple) were largely related to strains from human sources (colored in green).

The co-occurrence of aminoglycoside resistance genes [aph(6)-Id: aph(3")-Ib] with sulfonamide and trimethoprim or tetracycline genes revealed the presence of resistance islands located on regions with high similarity to plasmids deposited in GenBank. This implies that the transmission of these resistance genes is plasmid mediated ( Table 3). The genetic context, aph(6)-Id: aph(3")-Ib:sul2, has been found complete or incomplete within plasmids, integrative conjugative elements (ICE), and chromosomes in both Gram-negative and Gram-positive organisms (29).
Most TET-resistant isolates harbored the tet(A) gene and were phenotypically resistant ( Table 1). Only one isolate had tet(B), and another isolate had tet(A) and tet(M), and both were phenotypically resistant to TET ( Table 1). In this study, the tet(A) gene was consistently found within a resistance operon adjacent to a transcriptional repressor gene tetR(A) and a transposase ( Table 3). The genetic context transposase:tetR(A):tet(A) had high similarity to plasmid sequences deposited in GenBank, especially E. coli strain CFS3292 plasmid pCFS3292-1 (CP026936.2). The tet(A) family has been constantly associated with conjugative plasmids, which mediate transfer (30). Isolate U117 had tet(B) and tet(C) flanked by an insertion sequence IS4 (SVa5), which may be important in mobilizing these resistance genes.
The genes encoding resistance to trimethoprim/sulfonamides, sul1, sul2, and dfrA gene cassettes, were detected in 8 (66.7%) isolates. The sul2 genes were consistently co-carried with the aminoglycoside resistance aph-genes [aph(6)-Id: aph(3")-Ib], with sul1 being co-carried with the dfrA and qacEΔ1 genes (Table 1). There was concordance in the phenotypic and genotypic results in 7 (58.3%) isolates, with one isolate being phenotypically susceptible to SXT but possessing genotypic resistance traits (Table 1). A total of 8 (66.7%) isolates harbored class 1 integrons with an array of gene cassettes (Table 1). The class 1 integrons are directly linked with the Tn3 transposon family (Tn21 or Tn1697), mainly because of their inability to self-transfer; thus, they rely on conjugative plasmids and transposons for their horizontal or vertical transmission (31). Two isolates (E13 and D77) had class 1 integrons that were carried by the Tn21 transposon (Table 1). U40 was, however, unique in that its class 1 integron was flanked by different transposons on either side (namely, Tn21 and Tn402). Tn402 (Tn5090) may carry class 1 integrons or mercury resistance integrons (MerR) and is characterized by the TniABQR genes (32). TniA codes for a putative transposase, TniB is a nucleoside triphosphate (NTP) binding protein, TniR is a resolvase or integrase, and TniQ is required for transposition (32). The class 1 integrons in isolates D47 and E13 had PadR (a transcriptional regulator) and chrA (chromate transport protein) downstream of and adjacent to the integrons (Table 3). The chrA gene is a heavy metal resistance gene (HMRG) that encodes resistance to chromate and is usually found on plasmids or chromosomes of bacteria (33). The qacEΔ1 gene is a disinfectant resistance gene (DRG) that encodes resistance to quaternary ammonium compound disinfectants (34).
A mercury resistance operon was associated with tet(A) resistance genes and transposons in two isolates from upstream (U40) and downstream (D18) of the WWTP ( Table 3). The co-occurrence of HMRGs, DRGs, and ARGs was recently demonstrated in E. coli strains obtained from rivers, streams, and lakes in Brazil (8). The coexistence of HMRGs, DRGs, and ARGs in the studied integrons is important as disinfectants and heavy metals can co-select for ARGs (35). Altogether, these results revealed that ARGs carried on plasmids predominate the investigated water resistome; however, IS, transposable elements, and integrons accentuate the mobility of the plasmid-encoded ARGs and HMRGs.
A huge diversity of virulence genes often associated with pathogenic E. coli was found in the genomes of isolates in this study ( Table 1). Similar virulence factors have been identified in environmental E. coli isolates obtained from surface water and WWTPs in previous studies (8,36). Most virulence genes were associated with the insertion sequences, suggesting that these are important in the mobilization of the bacterial virulome ( Table 4). The iss gene is responsible for increased serum survival and mediates against phagocytosis enabling the evasion of the immune system (37). The genetic environment of the iss gene consisted of bacteriophage genes, implying that it is carried on a prophage in E. coli isolates ( Table 4). The iss gene is thought to have evolved from a λ phage gene called bor, which integrated into the genomes of different E. coli pathotypes (37). Virulence genes are frequently clustered together on the bacterial chromosome in pathogenicity islands (PAIs). In Gram-negative bacteria, the PAIs tend to contain insertion sequences that promote reorganizations and transfer of virulence genes (38). Several virulence genes, including senB (IS91), vat (IS1), and iha (IS66), were associated with IS ( Table 4). The virulome of the environmental isolates investigated in this study revealed a diverse assemblage of virulence genes that are mobilizable and not clone specific.
Phylogenomic analyses revealed that the environmental samples in this study clustered mainly with clinical isolates, mostly from hospital patients (Figure 2). Six EAEC and two EIEC isolates were closely related to clinical isolates, implying that they originated from clinical sources. The spread of EIEC and EAEC is frequently associated with food sources or polluted water (8,39,40) as was the case in this study. An EPEC isolate (D64) was closely related to an isolate from a wild bird (Figure 2). Typical EPEC isolates are rarely isolated from animals as humans are the major natural reservoir; however, atypical EPEC occurs in healthy and sickly animals and humans (5). The EPEC isolate from this study probably originated from an animal host.
This study focused on a small subset of MDR and diarrheagenic E. coli; thus, its findings may not be generalized for all E. coli pathotypes. Similar studies employing a larger sample size and covering greater geographical area and diversity of E. coli should be conducted. However, this study adds to the knowledge that pathogenic E. coli can survive and be disseminated in the water environment, which is a public health concern.

CONCLUSIONS
The occurrence of pathogenic and MDR isolates in the WWTP effluent and the associated river is a public health concern. E. coli isolates have a wealth of ARGs and virulence genes that have been mobilized on diverse MGEs as evident from the different permutations and combinations of ARGs, virulence genes, and MGEs in E. coli STs and pathotypes from the different water sources. The findings of this study may not be typical of all WWTPs and river systems in South Africa or beyond but form a basis of the need for surveillance systems that employ high-throughput technologies like WGS to gain genomic insights into the environmental dimensions of AMR. Surveillance of ARB in wastewater and associated surface waters could serve as a proxy for local antibiotic resistance and how this changes over time.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found at: https://www.ncbi.nlm.nih.gov/genbank/, PRJNA609073.

AUTHOR CONTRIBUTIONS
SYE, ALKA, and JM co-conceptualized the study. JM, ALKA, and DGA performed the experiments. JM, ALKA, DGA, MA, and AI analyzed the data. JM wrote the paper. SYE, ALKA, and DGA supervised. SYE was involved in funding acquisition. All the authors undertook critical revision of the manuscript and also reviewed, edited, and approved the final manuscript.