Genomic Epidemiology and Antimicrobial Susceptibility Profile of Enterotoxigenic Escherichia coli From Outpatients With Diarrhea in Shenzhen, China, 2015–2020

Enterotoxigenic Escherichia coli (ETEC) is the leading cause of severe diarrhea in children and the most common cause of diarrhea in travelers. However, most ETEC infections in Shenzhen, China were from indigenous adults. In this study, we characterized 106 ETEC isolates from indigenous outpatients with diarrhea (77% were adults aged >20 years) in Shenzhen between 2015 and 2020 by whole-genome sequencing and antimicrobial susceptibility testing. Shenzhen ETEC isolates showed remarkably high diversity: they belonged to four E. coli phylogroups (A: 71%, B1: 13%, E: 10%, and D: 6%) and 15 ETEC lineages, with L11 (25%, O159:H34/O159:H43, ST218/ST3153), the novel L2/4 (21%, O6:H16, ST48), and L4 (15%, O25:H16, ST1491) being the major lineages. Heat-stable toxin (ST) was most prevalent (76%; STh: 60%, STp: 16%), followed by heat-labile toxin (LT, 17%) and ST + LT (7%). One or more colonization factors (CFs) were identified in 68 (64%) isolates, the most common being CS21 (48%) and CS6 (34%). Antimicrobial resistance mutation/gene profiles of the genomes were concordant with the phenotype testing results of 52 representative isolates, which revealed high resistance rates to nalidixic acid (71%), ampicillin (69%), and ampicillin/sulbactam (46%), and demonstrated that the novel L2/4 was a multidrug-resistant lineage. This study provides, for the first time, insight into the genomic epidemiology and antimicrobial susceptibility profile of ETEC infections in indigenous adults, which improves our understanding of ETEC epidemiology and has implications for vaccine development and for future surveillance and prevention of ETEC infections.


INTRODUCTION
Enterotoxigenic Escherichia coli (ETEC) is the leading cause of severe diarrhea, particularly among children younger than five years in developing countries, and is also the most common cause of diarrhea in travelers to ETEC-endemic areas, accounting for more than 200 million diarrheal cases and 50,000 deaths annually (Khalil et al., 2018; Fleckenstein and Kuhlmann, 2019). ETEC is defined by production of heat-stable toxin (ST) and/or heat-labile toxin (LT), and ST includes two subtypes, human ST (STh) and porcine ST (STp). STh is the most prevalent enterotoxin associated with human diarrhea, while STp was originally isolated from a porcine source and is more prevalent in isolates from animals (So et al., 1976, 1978). ST and LT are genetically highly diverse, and multiple variants have been identified (Joffré et al., 2015, 2016). In addition to enterotoxins, most ETEC isolates express one or more plasmid-encoded colonization factors (CFs), which are fimbrial or afimbrial surface structures that enable adherence to the intestinal epithelium. At least 27 known or putative CFs have been identified to date, including a novel CF, CS30, identified by genomic analysis, and the most prevalent CFs are colonization factor antigen I (CFA/I) and coli surface antigens 1-6 (CS1-CS6) (Gaastra and Svennerholm, 1996; Nada et al., 2011; von Mentzer et al., 2017; Vidal et al., 2019). Besides the classic virulence factors of enterotoxins and CFs, multiple non-classic virulence factors have been identified in recent years, including a cytoplasmic protein (LeoA), three adhesins (Tia, TibA, and EtpA), an extracytoplasmic protein (CexE), a hemolysin (ClyA), two mucinases (EatA and YghJ), three iron acquisition systems (Irp1, Irp2, and FyuA), and an enteroaggregative heat-stable toxin 1 (EAST1) (Turner et al., 2006; Pilonieta et al., 2007; Del Canto et al., 2011; Gonzales et al., 2013; Luo et al., 2014; Sjöling et al., 2015).
Enterotoxigenic Escherichia coli is genetically highly diverse, and more than 100 serotypes have been identified in clinical isolates (Wolf, 1997; Isidean et al., 2011). Multi-locus sequence type (MLST)-based studies showed that ETEC isolates can be found across five E. coli phylogroups, A, B1, B2, D, and E, with A and B1 being more prevalent (Turner et al., 2006; Sahl et al., 2011). Whole genome sequencing (WGS) of a large collection of representative global ETEC isolates further identified 21 robust lineages (L1-L21) characterized by distinct enterotoxin and CF profiles, which belonged to phylogroups A (12 lineages), B1 (7 lineages), C (1 lineage), and E (1 lineage) (von Mentzer et al., 2014; Denamur et al., 2021). Despite the diversity, a clear association between lineage and enterotoxin, CF, serotype, and plasmid content was identified. For example, the closely related lineages L1 and L2 encoded CS1 + CS3 and CS2 + CS3 (with/without CS21), respectively, but shared a common enterotoxin profile (STh + LT) and O antigen (O6) (von Mentzer et al., 2014). In addition, WGS has been used to characterize the epidemiology of ETEC infections in Bangladesh and Chile, and identified remarkable diversity of locally circulating phylogroups and lineages. However, the dominant pathogenic phylogroups were distinct, with more Bangladeshi isolates belonging to phylogroup B1 while most Chilean isolates were from phylogroup A (Sahl et al., 2017; Rasko et al., 2019).
With the widespread use of antimicrobial agents, antimicrobial resistance (AMR) has emerged in ETEC isolates from both children and travelers with diarrhea (Qadri et al., 2005). AMR to commonly used agents such as nalidixic acid (NAL), ampicillin (AMP), tetracycline (TET), and sulfonamides has been frequently detected in ETEC isolates in Peru (Rivera et al., 2010), Bangladesh (Begum et al., 2016), South Korea (Oh et al., 2014), and China (Li et al., 2017), and the emergence of extended-spectrum β-lactamase (ESBL)-producing ETEC poses a new challenge to clinical treatment and public health (Margulieux et al., 2018; Guiral et al., 2019). Moreover, high-level resistance and multidrug resistance (MDR) have developed in ETEC isolates, which might be related to heavy clinical use of antimicrobial agents (Begum et al., 2016; Li et al., 2017). Due to the increase in AMR in many areas over time, azithromycin and fluoroquinolones have been used as the first-line drugs for ETEC infections (Xiang et al., 2020). However, azithromycin-resistant ETEC has also emerged in multiple countries (Abraham et al., 2014; Begum et al., 2016; Xiang et al., 2020), and is highly prevalent in Shanghai, China (Xiang et al., 2020), highlighting the necessity of ongoing surveillance of AMR, especially to first-line drugs.
While ETEC mainly causes diarrhea among children and travelers, most ETEC infections in Shenzhen, a populous developed city in southern China, were from indigenous adults, and the genomic epidemiology of Shenzhen ETEC isolates remains unclear. In this study, we sequenced the whole genomes of 106 ETEC isolates from indigenous outpatients with diarrhea in Shenzhen between 2015 and 2020, and compared them to a global collection of representative E. coli and ETEC genomes (von Mentzer et al., 2014;Horesh et al., 2021), to characterize the genomic diversity and virulence factors of ETEC in Shenzhen. Moreover, we integrated WGS-based in silico AMR mutation/gene detection and antimicrobial susceptibility testing to characterize the AMR profiles of Shenzhen ETEC isolates.

Strain Collection
ETEC strains were isolated from stool samples of outpatients with diarrhea in 16 sentinel hospitals in Shenzhen, China during routine foodborne disease surveillance between 2015 and 2020 as previously described (Li et al., 2017). Stool samples were enriched or inoculated on selective medium to isolate common foodborne pathogens, including Salmonella, Shigella, Vibrio cholerae, Vibrio parahaemolyticus, Staphylococcus aureus, E. coli O157:H7, ETEC, enteropathogenic E. coli, enteroinvasive E. coli, enterohemorrhagic E. coli, Bacillus cereus, group A Streptococcus, and Listeria monocytogenes. For ETEC, stool samples were further inoculated on CHROMagar ECC plates and incubated at 37 °C overnight. Then, three colonies were randomly selected and inoculated on triple sugar iron agar for incubation, and were identified by screening for ST and LT genes using a modified molecular beacon-based multiplex real-time PCR assay. An isolate was defined as ETEC if it was positive for either the ST or the LT gene, and a total of 106 ETEC isolates were included in this study.

Whole Genome Sequencing and Genome Dataset
Genomic DNA was extracted using the QIAamp DNA Mini Kit (QIAGEN, Hilden, Germany) according to the manufacturer's instructions. Paired-end libraries with a mean insert size of 350 bp were prepared for sequencing on the Illumina NovaSeq 6000 platform. The average read length was 150 bp, and ∼1.2 Gb of clean data were generated for each isolate on average. Short-read sequencing data of Shenzhen isolates have been deposited in the NCBI Sequence Read Archive under the BioProject PRJNA739477, and the accession numbers are listed in Supplementary Table 1. A total of 177 genomes were analyzed in this study, including 106 newly sequenced genomes of Shenzhen isolates and 71 publicly available representative genomes. These representative genomes were from the largest 50 E. coli lineages (Horesh et al., 2021) and 21 global ETEC lineages (von Mentzer et al., 2014), representing the phylogenetic diversity of E. coli and ETEC.
The assembled genomes were annotated using Prokka (Seemann, 2014) with default settings, and the gene annotation results (GFF3 files) were used in Panaroo (Tonkin-Hill et al., 2020) to identify the pan-genome and generate the matrix of accessory gene presence/absence. Genes present in ≥99% of isolates were defined as core genes, and all other genes were defined as accessory genes. EggNOG-mapper (Huerta-Cepas et al., 2019) was used to annotate the Clusters of Orthologous Groups (COGs) classifications of accessory genes.
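The core/accessory split described above can be illustrated with a minimal Python sketch. It is not the Panaroo implementation; the function name `split_core_accessory`, the toy gene names, and the in-memory dictionary layout are assumptions made for illustration, and only the ≥99% threshold comes from the text.

```python
from typing import Dict, List, Tuple


def split_core_accessory(
    presence: Dict[str, List[int]], core_threshold: float = 0.99
) -> Tuple[List[str], List[str]]:
    """Split genes into core and accessory sets.

    presence maps a gene name to a list of 0/1 calls, one per isolate
    (hypothetical layout; a real presence/absence matrix would be parsed
    from the pan-genome tool's output).
    """
    core: List[str] = []
    accessory: List[str] = []
    for gene, calls in presence.items():
        # Fraction of isolates that carry the gene.
        frequency = sum(calls) / len(calls)
        if frequency >= core_threshold:
            core.append(gene)
        else:
            accessory.append(gene)
    return core, accessory


# Toy matrix with four isolates (illustrative values only).
toy_presence = {
    "geneA": [1, 1, 1, 1],
    "geneB": [1, 1, 1, 0],
    "geneC": [0, 1, 0, 0],
}
core_genes, accessory_genes = split_core_accessory(toy_presence)
```

With only four isolates, a gene must be present in all of them to reach the 99% threshold, so the threshold only starts to tolerate missing calls in collections of 100 genomes or more.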

Serotyping and Multi-Locus Sequence Typing
In silico serotyping was performed using ECTyper based on assembled genomes. MLST sequence type (MLST-ST) was obtained by scanning the sequences of seven house-keeping genes (adk, fumC, gyrB, icd, mdh, purA, and recA) against PubMLST typing schemes using mlst.
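The logic of assigning a sequence type from the seven house-keeping loci can be sketched as a profile lookup. This is a conceptual illustration only, not how the mlst tool is implemented: the scheme table, allele numbers, and sequence type labels below are hypothetical placeholders, and a real scheme would be downloaded from PubMLST.

```python
from typing import Dict, Optional, Tuple

# The seven loci named in the text, in a fixed order.
LOCI = ("adk", "fumC", "gyrB", "icd", "mdh", "purA", "recA")

# Hypothetical scheme table: allele-number profile -> sequence type label.
TOY_SCHEME: Dict[Tuple[int, ...], str] = {
    (1, 1, 1, 1, 1, 1, 1): "ST-toy1",
    (1, 2, 1, 1, 3, 1, 1): "ST-toy2",
}


def call_sequence_type(
    alleles: Dict[str, int], scheme: Dict[Tuple[int, ...], str]
) -> Optional[str]:
    """Return the sequence type for an allele profile, or None.

    None is returned when a locus is missing or when the combination of
    alleles is not in the scheme (a potentially novel sequence type).
    """
    if any(locus not in alleles for locus in LOCI):
        return None
    profile = tuple(alleles[locus] for locus in LOCI)
    return scheme.get(profile)


toy_isolate = {"adk": 1, "fumC": 2, "gyrB": 1, "icd": 1,
               "mdh": 3, "purA": 1, "recA": 1}
toy_st = call_sequence_type(toy_isolate, TOY_SCHEME)
```

In practice the allele numbers themselves come from matching each locus sequence against the scheme's allele database, which is the step the typing software performs on the assembled genome.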
The ST and LT nucleotide sequences of Shenzhen isolates were extracted from assembled genomes using BLASTN and were aligned against the sequences of previously reported ST and LT variants (Joffré et al., 2015, 2016; von Mentzer et al., 2021) using MAFFT v7.480 (Katoh and Standley, 2013). Multiple sequence alignments were used to construct the maximum likelihood phylogenetic trees, and the ST and LT variants of Shenzhen isolates were determined based on the phylogenetic trees (Supplementary Figure 1).
PlasmidFinder v2.1 (Carattoli et al., 2014) was used to detect and determine the incompatibility groups of plasmids based on assembled genomes with default settings.

Single-Nucleotide Polymorphism Calling and Phylogenetic Analyses
Core-genome (regions present in >99% of strains) single-nucleotide polymorphisms (SNPs) were identified using the Snippy v4.6.0 pipeline, with E. coli K12 (accession number: NC_000913) as the reference genome. Briefly, for each isolate, clean sequencing reads were mapped to the reference genome using BWA-MEM (Li and Durbin, 2009), and SNPs were called using SAMtools and FreeBayes (Garrison and Marth, 2012). Repetitive regions of the reference genome were identified using Tandem Repeats Finder (TRF) (Benson, 1999) and by self-alignment using BLASTN as previously described (Yang et al., 2019). SNPs located in repetitive regions were removed prior to phylogenetic analysis. Maximum likelihood trees were constructed separately from the non-repetitive core-genome SNPs and from the alignments of ST and LT nucleotide sequences using IQ-TREE v2.0.3 (Minh et al., 2020) with the automatically detected best-fitting substitution model.
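The removal of SNPs that fall in repetitive regions can be sketched as an interval filter. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the function names, the 1-based inclusive coordinate convention, and the toy coordinates are all hypothetical.

```python
from bisect import bisect_right
from typing import List, Tuple

Interval = Tuple[int, int]  # 1-based, inclusive (start, end); assumed convention


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or directly adjacent intervals."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def filter_snps(positions: List[int], repeats: List[Interval]) -> List[int]:
    """Keep only SNP positions that lie outside every repeat interval."""
    merged = merge_intervals(repeats)
    starts = [start for start, _ in merged]
    kept: List[int] = []
    for pos in positions:
        # Index of the last merged interval starting at or before pos.
        i = bisect_right(starts, pos) - 1
        if i >= 0 and merged[i][0] <= pos <= merged[i][1]:
            continue  # SNP lies inside a repetitive region
        kept.append(pos)
    return kept


# Toy repeat regions and SNP positions (illustrative values only).
toy_repeats = [(100, 200), (150, 300), (1000, 1100)]
toy_snps = [50, 100, 250, 300, 301, 999, 1050, 2000]
kept_snps = filter_snps(toy_snps, toy_repeats)
```

Merging the repeat intervals first means that regions reported by both repeat-detection approaches are handled once, and the binary search keeps the filter fast for a reference genome with many repeat annotations.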
Phylogroup A was dominant throughout the 6-year sampling period, while the fractions of the other phylogroups varied. At the end of sampling in 2020, only two phylogroups, A and B1, persisted (Figure 2A). The fractions of different ETEC lineages also varied. Lineage L11 was dominant before 2016; since 2017, multiple lineages have coexisted and no obvious dominant lineage was identified (Figure 2B).

Plasmid Incompatibility Groups
In silico plasmid typing analysis identified replicons belonging to eight incompatibility (Inc) groups in the 106 Shenzhen isolates, with one to four Inc groups present in each isolate (Table 1 and Supplementary Table 1). The replicon IncFII was the most prevalent and was detected in all Shenzhen isolates, followed by IncFIB (67%, 71/106), IncI1 (31%, 33/106), and IncB/O/K/Z (20%, 21/106); the other replicons (IncFIA, IncI2, IncN, and IncY) were each detected in only one isolate. There was an association between the replicon IncFIB and the virulence gene CS21: 72% (51/71) of IncFIB-positive isolates also encoded CS21. Only one type of replicon profile was identified in eight lineages, and the other seven lineages each contained two to four types of replicon profiles (

Pan-Genome Analysis
We performed pan-genome analysis of 127 ETEC genomes of Shenzhen isolates (n = 106) and globally representative isolates (n = 21) to identify genes unique to or significantly associated with Shenzhen isolates. A total of 11,813 pan-genes were identified in these 127 genomes, including 3,307 core genes (present in ≥99% of isolates) and 8,506 accessory genes. There were no genes unique to Shenzhen or non-Shenzhen isolates; however, we identified significant differences (p < 0.01, Fisher's exact tests) in the frequencies of 589 accessory genes between Shenzhen and non-Shenzhen isolates (Supplementary Table 2). There were 178 accessory genes significantly enriched in Shenzhen isolates, of which two were the known virulence genes STh and etpA, and the other genes encoded hypothetical proteins with unknown functions (46%) or were associated with replication and repair (COG category L; 37%) and cell wall/membrane/envelope biogenesis (COG category M; 19%). For example, STh was present in 66% of Shenzhen isolates and 29% of non-Shenzhen isolates; the gene group_4545, encoding a hypothetical protein, was present in 91% of Shenzhen isolates but in only 19% of non-Shenzhen isolates. In addition, there were 411 accessory genes significantly enriched in non-Shenzhen isolates, including the virulence gene LT (Supplementary Table 2).
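The per-gene comparison of frequencies between the two groups can be illustrated with a two-sided Fisher's exact test written in plain Python. This is a sketch, not the analysis code used in the study. The function name is hypothetical, and the counts in the example (70 of 106 and 6 of 21) are back-calculated from the reported 66% and 29% for STh, so they are approximations rather than values taken from the supplementary data.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are the two groups; columns are gene present / gene absent.
    The p-value sums the hypergeometric probabilities of all tables with
    the same margins that are no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2

    def prob(x: int) -> float:
        # Probability of x 'present' calls in row 1 given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    tolerance = 1 + 1e-7  # guards against floating-point ties
    p_value = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * tolerance
    )
    return min(1.0, p_value)


# Approximate counts for STh (assumed): present/absent in Shenzhen
# isolates, then present/absent in non-Shenzhen isolates.
p_sth = fisher_exact_two_sided(70, 36, 6, 15)
```

Because several thousand accessory genes are tested in such an analysis, a stringent threshold such as the p < 0.01 used in the text, or an explicit multiple-testing correction, helps limit false positives.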
We further attempted to identify genes unique to two novel lineages L2/4 and L-N1. A total of 25 L2/4 lineage unique genes were identified (Supplementary Table 3), of which 76% encoded hypothetical proteins and 20% were associated with replication and repair (COG category L). There were 105 genes unique to L-N1 lineage (Supplementary Table 3), of which 72% encoded hypothetical proteins, 10% were associated with cell motility (COG category N) and intracellular trafficking and secretion (COG category U), and 5% were associated with transcription (COG category K). None of these L2/4 and L-N1 lineage unique genes were known ETEC virulence genes.

Antimicrobial Susceptibility
To characterize the AMR profiles of Shenzhen isolates, we first scanned the genomes to identify AMR-related mutations and genes. A total of 21 AMR mutations/genes were detected (Figure 3), which were involved in resistance to five classes of antimicrobial agents: fluoroquinolones (two mutations and two genes), beta-lactams (seven genes), tetracyclines (two genes), macrolides (one gene), and folate pathway antagonists (seven genes).
Unlike the close association of lineage with serotype, MLST-ST, and virulence factors, there was no obvious association between lineages and most AMR mutations/genes, with multiple combinations of AMR mutations/genes being identified among isolates of a single lineage (Figure 3). Notably, most isolates of the two major lineages L11 and L2/4 had the gyrA p.S83L mutation, and all L2/4 lineage isolates also carried the blaTEM-1B and mph(A) genes.

DISCUSSION
Despite its high prevalence in children and travelers, ETEC also leads to substantial infections in adults in endemic areas (Lamberti et al., 2014). Moreover, it is intriguing that most ETEC infections in Shenzhen (Li et al., 2017), as well as in several other developed cities and provinces in China (Qu et al., 2014; Pan et al., 2015), such as Beijing and Shanghai, were from indigenous adults. However, the epidemiology of ETEC infections in indigenous adults is currently poorly understood. Compared to traditional molecular subtyping methods, WGS not only provides the ultimate resolution for reconstructing the relationships between isolates, but can also be used for secondary in-depth analysis of virulence factors and AMR genes, and it has become one of the most powerful methods to characterize the epidemiology of pathogens (Didelot et al., 2012; Allard et al., 2016; Ronholm et al., 2016). We reconstructed the WGS-based population structure of Shenzhen isolates in the context of global ETEC and E. coli lineages, and linked the genomic diversity to traditional subtyping results based on serotyping and MLST-ST. In addition, we characterized the virulence factors, AMR mutation/gene profiles, and phenotype profiles of Shenzhen isolates. To our knowledge, this is the first study on the genomic epidemiology and AMR profile of ETEC infections in indigenous adults.
Shenzhen ETEC isolates showed remarkable diversity and fell into four E. coli phylogroups, with the majority in phylogroup A (71%). This phylogroup distribution differs from that of global (von Mentzer et al., 2014) and Bangladeshi (Sahl et al., 2017) isolates [high fraction (>40%) of B1 isolates], and is similar to that of Chilean isolates (phylogroup A: 81%) (Rasko et al., 2019). More specifically, three major lineages, L11 (25%), L2/4 (21%), and L4 (15%), persisted in Shenzhen throughout the 6-year sampling period, together accounting for 60% of cases. L11 has been detected in children aged <5 years and adult travelers from multiple countries, but at a low fraction (4%) in the global dataset (von Mentzer et al., 2014). When serotype and MLST-ST are taken into consideration, L11 (O159:H34, ST218) isolates were only identified in local diarrheal cases in Korea (Chung et al., 2019) and China (Pan et al., 2015; Li et al., 2017), and the virulence factor profiles (STh+, without known CS) of isolates from the two countries were identical, indicating the possibility of recent transmission. L2/4 was a previously undefined lineage located between L2 and L4 in the phylogenetic tree. However, it had a different virulence factor profile (STh: 21/22, STp: 1/22, CS21: 22/22) from L2 (global dataset: STh + LT, CS2 + CS3 ± CS21) or L4 (global dataset: STh/LT, CS6 ± CS21/CS6 + CS8/CS21). The serotype and MLST-ST of L2/4 were O6:H16 and ST48, respectively, and ETEC isolates with this subtyping combination were previously identified from pigs in Denmark (García et al., 2020) and from human cases in China (Li et al., 2017). However, no CF was reported in the Danish isolates, whereas all Shenzhen isolates carried CS21.
The L4 lineage was also detected at a low fraction (6%) in the global dataset (von Mentzer et al., 2014), and the subtyping combination of Shenzhen isolates, i.e., L4 (O25:H16, ST1491), was only identified in isolates from a few travelers returning to the UK (Boxall et al., 2020) and from local diarrheal cases in China (Pan et al., 2015; Li et al., 2017). Taken together, these results revealed that the major circulating ETEC lineages in Shenzhen, as well as in several other developed regions of China, were distinctive and are not the commonly detected global or endemic lineages.
FIGURE 3 | Antimicrobial resistance related mutation and gene distribution in Shenzhen isolates. The maximum likelihood tree of 106 Shenzhen isolates is shown on the left, and the background colors indicate the E. coli phylogroups as in Figure 1. The distribution of representative isolates for phenotype testing (red bars), extended-spectrum β-lactamase (ESBL)-producing isolates (blue triangles), multidrug-resistant (MDR) isolates (blue stars), and the presence (gray bars) or absence (white bars) of antimicrobial resistance related mutations and genes are shown on the right.
In addition to subtyping, the virulence factor profile of Shenzhen and other Chinese ETEC isolates was also distinctive. ST-positive ETEC was dominant (76%) in Shenzhen throughout the sampling period, and 80% of circulating lineages were ST-positive (53%) or contained ST-positive isolates (27%), whereas the fractions of LT-positive and ST + LT-positive isolates and lineages were lower. CS21 was the most prevalent CF (48%) in Shenzhen isolates, followed by CS6 (34%). A similar virulence factor profile was also observed in ETEC isolates from diarrheal cases in Shanghai, China (ST: 74%, CS21: 63%, CS6: 41%) (Pan et al., 2015). In contrast, a systematic review showed that globally, ST-positive, LT-positive, and ST + LT-positive ETEC accounted for 50, 25, and 27% of non-travel human infections, respectively, and the most prevalent CF was CFA/I, followed by CS21 and CS6 (Isidean et al., 2011). Virulence factor profiles varied considerably across regions and populations. High ST prevalence was only observed among travelers in East Asia/Pacific (75%) and non-travelers in Europe/Central Asia (77%); however, the most prevalent CF in East Asia/Pacific was CFA/I and CS21 was rarely detected (CF prevalence in Europe/Central Asia is unavailable) (Isidean et al., 2011). Interestingly, CS21 was the most prevalent CF in ETEC-endemic regions including Latin America/Caribbean and the Middle East/North Africa, and among global travelers, with a frequency of ∼22% (Isidean et al., 2011). Notably, the most commonly detected CF globally, CFA/I, was not identified in Shenzhen isolates. Moreover, a recent Global Enteric Multicenter Study (GEMS) report showed that the prevalence of ST, LT, and ST + LT among ETEC isolates from children aged <5 years with moderate-to-severe diarrhea in Africa and Asia was 36, 32, and 32%, respectively; the prevalence in the matched controls was 21, 47, and 33%, respectively.
The most commonly detected CFs were CFA/IV (CS6 alone or with CS4 or CS5), CS5 + CS6, and CFA/I. Across all sampling sites of the GEMS study, ST-positive ETEC isolates were not dominant (<50%), and CS21 was rarely detected (<2%). More specifically, the most prevalent ST variants in Shenzhen isolates were STa3/4 and STa5, which was consistent with global isolates (Joffré et al., 2016). However, the most prevalent LT variant in Shenzhen isolates was LT17. By contrast, the most prevalent LT variant in global isolates was LT1 (41%), while LT17 was rarely detected (2.1%) (Joffré et al., 2015). The unique ETEC epidemiology in Shenzhen and China, i.e., most infections occurring in indigenous adults, might have several explanations. First, the pathogenicity of Chinese ETEC isolates may be higher in adults than in children, given their distinctive lineage and virulence factor profiles. This hypothesis is supported by an observation in Guatemala (Torres et al., 2015), where ST-positive ETEC infections were significantly more prevalent in adult travelers than in indigenous children, suggesting higher pathogenicity of ST-positive ETEC in adults; most Chinese ETEC isolates were ST-positive. Moreover, a recent study showed that an ST enterotoxin variant, STa5, was associated with ETEC infections in adults, suggesting that specific types of ETEC may have higher pathogenicity in adults (Joffré et al., 2016). However, the prevalence of the STa5 variant in Shenzhen isolates was only 15%, indicating the existence of other known or unknown virulence genes associated with ETEC infections in adults. By pan-genome analysis, we identified multiple accessory genes significantly enriched in Shenzhen isolates, including the known virulence genes STh and etpA, which provide candidate targets for further studies. Second, differences between child and adult populations, such as eating habits and immunity, may also be associated with the unique ETEC epidemiology.
For example, a study showed that dietary calcium can improve human resistance to ETEC infection (Bovee-Oudenhoven et al., 2003), and children usually tend to consume more calcium than adults. In addition, it has been reported that breastfeeding may protect infants against severe ETEC infection (Clemens et al., 1997). Furthermore, eating out is a major risk factor for foodborne disease (Liu et al., 2018), and adults usually eat out more frequently than children. Finally, the immunity provided by previous ETEC infection or vaccination may wane with age, leading to more ETEC infections in adults.
Characterizing the antibiotic susceptibility profile of ETEC helps guide clinical treatment. In this study, we investigated the AMR mutations/genes and phenotypes of Shenzhen ETEC isolates by integrating WGS-based analysis and antimicrobial susceptibility testing. The AMR mutation/gene profile was generally concordant with the phenotype testing results of 52 representative isolates, which revealed high resistance rates to NAL (71%), AMP (69%), and ampicillin/sulbactam (AMS; 46%). In recent years, high resistance of ETEC to these commonly used agents was also reported in multiple other countries including Peru (Rivera et al., 2010), Bangladesh (Begum et al., 2016), and South Korea (Oh et al., 2014), and in other cities of China (Pan et al., 2015). Due to the increase in AMR, newer antimicrobial agents such as azithromycin have been used as first-line agents for the treatment of ETEC infection. Azithromycin is a broad-spectrum macrolide antimicrobial agent active against several bacterial species, and is very effective for the treatment of Enterobacteriaceae infections (Gomes et al., 2019). However, azithromycin-resistant ETEC isolates from diarrheal patients have recently been reported in several countries at a moderate frequency (10-30%) (Begum et al., 2016; Guiral et al., 2019), and in Shanghai, China at a very high frequency (87%) (Xiang et al., 2020). Macrolide inactivation mediated by the macrolide phosphotransferase gene mph(A) was the most common mechanism of azithromycin resistance (Gomes et al., 2019; Xiang et al., 2020). We found that 38% of representative Shenzhen ETEC isolates were resistant to azithromycin, and most of these isolates carried the mph(A) gene. Moreover, we showed that the prevalence of ESBL-producing ETEC isolates from outpatients was 31% in Shenzhen, China, which is similar to that in diarrheal patients in Nepal post-2013 (30%) (Margulieux et al., 2018) and in travelers to Southeast Asia/India (43%) (Guiral et al., 2019).
In addition, nearly all isolates of the major circulating lineage L2/4 carried the gyrA p.S83L mutation and the blaTEM-1B and mph(A) genes, and AMR phenotype testing showed that all representative L2/4 isolates were resistant to AMP, AMS, and azithromycin (AZI), and most (11/12) of them were resistant to NAL, suggesting that L2/4 was an MDR lineage. The identification of azithromycin-resistant and ESBL-producing ETEC and of a major MDR lineage, L2/4, in Shenzhen highlights the importance of ongoing AMR surveillance.
In summary, during routine foodborne disease surveillance, we found that the epidemiology of ETEC infections in Shenzhen, China is distinctive, with most infections occurring in indigenous adults. By integrating WGS and antimicrobial susceptibility testing, we characterized the temporal dynamics of population structure and virulence factors, and the AMR mutation/gene and phenotype profiles of Shenzhen ETEC isolates over 6 years. Shenzhen ETEC isolates showed remarkably high diversity: they belonged to four E. coli phylogroups and 15 ETEC lineages, and the major virulence factors were the enterotoxin ST and the CFs CS21 and CS6. Intriguingly, the major circulating lineages in Shenzhen and their virulence factor profiles were distinctive and differed from the commonly detected global or endemic ETEC lineages. Furthermore, we showed that the AMR mutation/gene profiles of the genomes were concordant with the phenotype testing results, revealed that Shenzhen isolates had high resistance rates to several commonly used antimicrobial agents, and identified an MDR lineage. To our knowledge, our study provides, for the first time, insight into the genomic epidemiology and antimicrobial susceptibility profile of ETEC infections in indigenous adults, which will not only enhance our understanding of ETEC epidemiology, but also have implications for vaccine development and for future surveillance and prevention of ETEC infections.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary Material.