Enteroaggregative Escherichia coli as etiological agent of endemic diarrhea in Spain: A prospective multicenter prevalence study with molecular characterization of isolates

Background Enteroaggregative Escherichia coli (EAEC) is increasingly associated with domestically acquired diarrheal episodes in high-income countries, particularly among children. However, its specific role in endemic diarrhea in this setting remains under-recognized and information on molecular characteristics of such EAEC strains is limited. We aimed to investigate the occurrence of EAEC in patients with non-travel related diarrhea in Spain and molecularly characterize EAEC strains associated with illness acquired in this high-income setting. Methods In a prospective multicenter study, stool samples from diarrheal patients with no history of recent travel abroad (n = 1,769) were collected and processed for detection of EAEC and other diarrheagenic E. coli (DEC) pathotypes by PCR. An additional case–control study was conducted among children ≤5 years old. Whole-genome sequences (WGS) of the resulting EAEC isolates were obtained. Results Detection of DEC in the study population. DEC was detected in 23.2% of patients aged from 0 to 102 years, with EAEC being one of the most prevalent pathotypes (7.8%) and found in significantly more patients ≤5 years old (9.8% vs. 3.4%, p < 0.001). Although not statistically significant, EAEC was more frequent in cases than in controls. WGS-derived characterization of EAEC isolates. Sequence type (ST) 34, ST200, ST40, and ST10 were the predominant STs. O126:H27, O111:H21, and O92:H33 were the predominant serogenotypes. Evidence of a known variant of aggregative adherence fimbriae (AAF) was found in 89.2% of isolates, with AAF/V being the most frequent. Ten percent of isolates were additionally classified as presumptive extraintestinal pathogenic E. coli (ExPEC), uropathogenic E. coli (UPEC), or both, and belonged to clonal lineages that could be specifically associated with extraintestinal infections. 
Conclusion EAEC was the only bacterial enteric pathogen detected in a significant proportion of cases of endemic diarrhea in Spain, especially in children ≤5 years old. In particular, O126:H27-ST200, O111:H21-ST40, and O92:H33-ST34 were the most important subtypes, with all of them infecting both patients and asymptomatic individuals. Apart from this role as an enteric pathogen, a subset of these domestically acquired EAEC strains revealed an additional urinary/systemic pathogenic potential.


Introduction
Diarrheal disease is a significant cause of hospitalization and economic losses due to sick leave in developed countries (Guarino et al., 2012;Ridderstedt et al., 2018). The etiological agents include a wide range of bacteria, viruses, and parasites. Among bacterial pathogens, strains of Escherichia coli that cause diarrhea in humans are known collectively as diarrheagenic E. coli (DEC) and traditionally classified into individual pathotypes, with Shiga toxin-producing E. coli (STEC), enteroaggregative E. coli (EAEC), enteropathogenic E. coli (EPEC), enterotoxigenic E. coli (ETEC), and enteroinvasive E. coli (EIEC) being the most important ones (Kaper et al., 2004).
Specifically, EAEC strains colonize the intestinal mucosa via the aggregative adherence fimbriae (AAF), which include five variants (designated I-V) (Nataro et al., 1993;Czeczulin et al., 1997;Bernier et al., 2002;Boisen et al., 2008;Jønsson et al., 2015) and are transcriptionally regulated by the AggR activator (Elias et al., 1999;Boisen et al., 2008). AggR promotes the expression of both chromosomal and plasmid-encoded EAEC virulence factors and is therefore considered the central regulator of virulence functions in EAEC (Harrington et al., 2006). Two examples of AggR-regulated genes commonly found in EAEC are aatA, encoding a component of the dispersin transport system, and aaiC, encoding a type VI secretion system (Harrington et al., 2006;Petro et al., 2020). Strains harboring the AggR regulon or its components have been termed typical EAEC (Nataro, 2003), and many studies have strongly associated them with diarrhea (Wilson et al., 2001;Pabst et al., 2003;Cohen et al., 2005;Denno et al., 2012). These strains were the focus of our work and therefore, from now on, the term EAEC will specifically refer to "typical EAEC." Strains showing an aggregative-adherence pattern but not carrying AggR-regulated genes are termed atypical EAEC (Nataro, 2003), and they are considered of uncertain pathogenicity (Tokuda et al., 2010;Boisen et al., 2020), despite having been isolated from food-borne outbreaks of gastrointestinal illness (Itoh et al., 1997). Additionally, EAEC strains often harbor a variable number of serine protease autotransporters of the Enterobacteriaceae (SPATEs) (Boisen et al., 2009).
Although often underdiagnosed, EAEC is frequently detected in both symptomatic and asymptomatic children in developing countries (Rogawski et al., 2017;Manhique-Coutinho et al., 2022) and considered one of the leading causes of travelers' diarrhea (Estrada-Garcia and Navarro-Garcia, 2012). Additionally, there is increasing evidence that EAEC is also associated with domestically acquired diarrheal episodes in high-income countries, particularly among children (Pabst et al., 2003;Shazberg et al., 2003;Cohen et al., 2005;Tobias et al., 2015). However, the specific role of EAEC in endemic diarrhea in industrialized countries remains underrecognized. In Spain, as EAEC infections are not notifiable and no surveillance has been conducted to date, the actual burden of disease is unknown, apart from several studies dealing with the etiology of travelers' diarrhea (Vargas et al., 1998;Palmeiro et al., 2012). To better understand the importance of EAEC as etiological agent of endemic diarrhea in Spain we undertook a prospective study to investigate its occurrence in 1,769 patients with non-travel related diarrhea. Additionally, we investigated the clinical significance of EAEC infections especially among children ≤5 years old, by comparing EAEC prevalence in children with diarrhea (n = 256) and in healthy controls (n = 133). Furthermore, we performed whole-genome sequencing (WGS) on the resulting isolates (n = 120) to determine the molecular characteristics of EAEC strains associated with illness acquired in this setting.

Study design and sample collection
A prospective multicenter study was performed from June 2015 to December 2016 in collaboration with five public tertiary hospitals located in the provinces of Madrid (central Spain), Navarra (northern Spain), Cádiz (southern Spain), Valladolid (central-western Spain), and León (north-western Spain). The collaborating laboratories were asked to submit unduplicated fresh stools from patients of any age with diarrhea and no history of recent travel abroad testing negative to other bacterial enteric pathogens after microbiologic examination. Our case definition included patients with diarrhea, either acute (≥3 liquid or semiliquid stools in 24 h, or at least one with presence of mucus, blood, or pus for up to 2 weeks) or chronic (>4 weeks duration with decreased consistency and increased stool frequency). Cases were recruited from the emergency department, inpatient, and outpatient clinics. The samples were collected according to availability and submitted to the National Center for Microbiology (NCM, Majadahonda, Spain) once a week, up to a minimum of 350 samples per hospital. Before being shipped to NCM, all stools were routinely tested at their respective home laboratory for Salmonella spp., Shigella spp., Campylobacter spp., Frontiers in Microbiology 03 frontiersin.org Yersinia enterocolitica, and Aeromonas spp. (but not for DEC) using conventional microbiological methods. We conducted an additional case-control study among children ≤5 years old living in one of the provinces included in the prospective prevalence study (Madrid). The study population consisted of all the children ≤5 years old with diarrhea for whom occurrence of DEC had been previously investigated over the prospective prevalence study from June 2015 to February 2016 (cases) (n = 256), and a group of randomly selected children ≤5 years old with no history of diarrhea or use of antibiotics for at least 14 days and no history of recent travel abroad (controls) (n = 133). 
These unpaired control subjects were recruited from June 2016 to February 2017 during primary care pediatric consultations in a health center belonging to the same health care district as the hospital that had provided the cases.

Ethical statement
Since the prospective multicenter study was approved as a part of the routine diagnostic practice, neither specific approval of the respective hospital ethics committees nor informed consent from patients was needed. As for control subjects, ethical approval and permission for the study was obtained from the health care district management (Comisión Central de Investigación, Gerencia Asistencial de Atención Primaria, Servicio Madrileño de Salud; date: May 11, 2016/Reference: 03/2016) and written informed consent was obtained from parents/legal guardians.

Microbiological analysis
Upon receipt, a stool-impregnated cotton swab was inoculated into 5 mL of buffered peptone water (BPW, Oxoid, Basingstoke, United Kingdom) and incubated overnight at 37°C. After this non-selective enrichment step, the BPW culture was subcultured on both MacConkey agar (MAC, Becton Dickinson, Franklin Lakes, NJ, United States) and tryptic soy agar (TSA, Becton Dickinson) and incubated overnight at 37°C. A loopful of bacterial growth taken from the first streaking area of the TSA plate was suspended in 0.5 mL of sterile distilled water, boiled for 5 min to release the DNA, and centrifuged at 10,000 rpm for 5 min.
The supernatant was used directly as a template in eight in-house conventional PCR assays for the specific amplification of genes defining each DEC pathotype (Table 1), using DreamTaq DNA Polymerase (Thermo Fisher Scientific, Waltham, MA, United States), according to the manufacturer's instructions. An additional gapA-specific PCR was run concurrently with the diagnostic PCR assays to ensure that all samples contained sufficient bacterial DNA and that no PCR inhibition occurred (Table 1). Thermal cycler conditions consisted of 25 cycles of denaturation at 94°C for 30 s, annealing at 56°C for 40 s, and extension at 72°C for 1 min. Our diagnostic criterion for EAEC infection was the presence of the aatA gene, considering its historical specificity (Pabst et al., 2003;Beczkiewicz et al., 2019). According to this criterion, the study focused only on typical EAEC. The diagnostic criteria for other DEC infections were as follows: for STEC, the presence of stx1, and/or stx2, and/or stx2f, and possible additional gene eae, with STEC primers targeting the specific subtypes stx1a, stx1c, stx1d, stx2a, stx2b, stx2c, stx2d, stx2e, stx2f, and stx2g; for EPEC, the presence of eae and possible additional gene bfpA, with the absence of bfpA indicating atypical EPEC (aEPEC); for ETEC, the presence of eltA and/or estA; for EIEC, the presence of ipaH.
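The gene-based diagnostic criteria above can be summarized as a small decision rule. The following Python sketch is illustrative only and is not the software used in the study: the function name `classify_dec` is hypothetical, and the precedence applied when both stx and eae are present (the sample is called STEC rather than EPEC) is our reading of "possible additional gene eae" and should be treated as an assumption.

```python
def classify_dec(pcr_positive_genes):
    """Return the set of DEC pathotypes implied by PCR-positive marker genes.

    Hypothetical helper mirroring the stated criteria:
      EAEC: aatA
      STEC: stx1 and/or stx2 and/or stx2f (eae may also be present)
      EPEC: eae with bfpA; atypical EPEC (aEPEC): eae without bfpA
      ETEC: eltA and/or estA
      EIEC: ipaH
    """
    genes = set(pcr_positive_genes)
    pathotypes = set()

    if "aatA" in genes:
        pathotypes.add("EAEC")

    if genes & {"stx1", "stx2", "stx2f"}:
        # Assumption: stx-positive samples are STEC even if eae is present.
        pathotypes.add("STEC")
    elif "eae" in genes:
        pathotypes.add("EPEC" if "bfpA" in genes else "aEPEC")

    if genes & {"eltA", "estA"}:
        pathotypes.add("ETEC")

    if "ipaH" in genes:
        pathotypes.add("EIEC")

    return pathotypes


# A sample positive for aatA and eltA counts as a co-detection of two pathotypes.
print(classify_dec(["aatA", "eltA"]))
```

Returning a set rather than a single label keeps co-detections visible, which matters when other DEC pathotypes are found together with EAEC.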
When a culture tested EAEC-positive, up to 20 individual E. coli-like colonies obtained from MAC plates were tested by PCR to obtain the isolate, which was further confirmed biochemically as E. coli with the API 20E system (bioMérieux, Marcy l'Etoile, France).

Whole-genome sequencing
Genomic DNA was purified from the EAEC isolates using the QIAamp DNA Mini Kit (QIAGEN, Hilden, Germany) according to the manufacturer's instructions. A DNA library was generated using the Nextera XT DNA Sample Preparation Kit (Illumina, San Diego, CA, United States) according to the manufacturer's instructions and run on a NextSeq 500 (Illumina) to generate paired-end 150 bp reads, aiming at a coverage of at least 200-fold. The reads were trimmed (fastp, 0.23.2) and filtered according to quality criteria (FastQC, 0.11.9), and the quality-filtered reads were de novo assembled using Unicycler (v0.4.8) (Wick et al., 2017).
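As a worked illustration of the 200-fold coverage target, the sketch below estimates how many 150 bp read pairs are needed per isolate. The genome size of 5 Mb is an assumed round figure for E. coli and is not taken from this study; the function names are hypothetical.

```python
import math


def read_pairs_for_coverage(target_coverage, genome_size_bp, read_length_bp=150):
    """Minimum number of read pairs so that total bases / genome size >= target."""
    bases_needed = target_coverage * genome_size_bp
    bases_per_pair = 2 * read_length_bp  # paired-end: two reads per fragment
    return math.ceil(bases_needed / bases_per_pair)


def expected_coverage(n_pairs, genome_size_bp, read_length_bp=150):
    """Average depth obtained from a given number of read pairs."""
    return n_pairs * 2 * read_length_bp / genome_size_bp


# Assumed genome size of 5,000,000 bp (illustrative, not a study parameter).
pairs = read_pairs_for_coverage(200, 5_000_000)
print(pairs, round(expected_coverage(pairs, 5_000_000), 2))
```

This is raw sequencing depth before trimming and filtering; the depth retained after quality control will be somewhat lower.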

Data analysis and molecular characterization
The O and H serogenotypes (in silico serotypes), virulence genes, sequence types (STs), and acquired antibiotic resistance genes were identified by uploading the reads to SerotypeFinder 2.0, VirulenceFinder 2.0, MLST 2.0, and ResFinder 4.1, respectively, available on the Center for Genomic Epidemiology (CGE) website. The threshold of sequence identity was set to 85% and the minimum overlapping gene length to 60%. The MLST tool used the seven-locus scheme (adk, gyrB, fumC, icd, mdh, purA, and recA). When SerotypeFinder did not predict an O antigen, the isolate was considered O non-typeable (ONT). The E. coli phylogroups were determined using the ClermonTyping tool available on the IAME Research Center website. The presence of the colonization factor CS22 structural gene (cseA, accession no. AF145205.1) was determined by searching the assembled contigs with BLASTn. The presumptive extraintestinal pathogenic E. coli (ExPEC) status was assigned to those isolates positive for ≥2 of the following virulence genes: papA and/or papC, sfa-focDE, afa-draBC, iutA, and kpsMII (Johnson et al., 2003). For this purpose, isolates were considered positive for afa-draBC if a combination of afaB or nfaE and also afaC was identified by WGS, and positive for sfa-focDE if a combination of focC or sfaE and also focI or sfaD was identified (Malberg Tetzschner et al., 2020). Likewise, the uropathogenic E. coli (UPEC) status was assigned to those isolates positive for ≥2 of the following genes: chuA, fyuA, vat, and yfcV (Spurbeck et al., 2012;Malberg Tetzschner et al., 2020).
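The ExPEC and UPEC rules above are simple counting criteria over WGS-detected genes, and a minimal sketch may make the logic easier to follow. This is not the pipeline used in the study: the function names are hypothetical, and the sketch assumes gene names have already been normalized to the labels used in the text (for example, that any kpsMII variant reported by VirulenceFinder has been collapsed to "kpsMII").

```python
def has_afa_dra(genes):
    """afa-draBC is called positive if (afaB or nfaE) and afaC are present."""
    return ("afaB" in genes or "nfaE" in genes) and "afaC" in genes


def has_sfa_foc(genes):
    """sfa-focDE is called positive if (focC or sfaE) and (focI or sfaD) are present."""
    return ("focC" in genes or "sfaE" in genes) and ("focI" in genes or "sfaD" in genes)


def is_presumptive_expec(detected_genes):
    """Positive for >= 2 of: papA/papC, sfa-focDE, afa-draBC, iutA, kpsMII."""
    genes = set(detected_genes)
    markers = [
        "papA" in genes or "papC" in genes,  # papA and papC count as one marker
        has_sfa_foc(genes),
        has_afa_dra(genes),
        "iutA" in genes,
        "kpsMII" in genes,
    ]
    return sum(markers) >= 2


def is_presumptive_upec(detected_genes):
    """Positive for >= 2 of: chuA, fyuA, vat, yfcV."""
    genes = set(detected_genes)
    return sum(g in genes for g in ("chuA", "fyuA", "vat", "yfcV")) >= 2


# Hypothetical isolate carrying papC, iutA, chuA, and fyuA.
isolate = {"papC", "iutA", "chuA", "fyuA"}
print(is_presumptive_expec(isolate), is_presumptive_upec(isolate))
```

Note that papA and papC together contribute a single marker, so an isolate carrying only those two genes does not meet the ExPEC threshold.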

Phylogenomic analysis
The 120 EAEC genomes analyzed in this study were compared with 195 previously sequenced EAEC genomes originating from the United Kingdom, Egypt, Kenya, or Peru (Do Nascimento et al., 2017;Ellis et al., 2020;Petro et al., 2020), as well as the EAEC reference genomes 17-2, 042, and 55989, and six ExPEC reference genomes (Supplementary Table 1), using Snippy 4.6.0 as previously described (http://github.com/tseemann/snippy). Snippy identified 690,639 conserved SNPs relative to the reference genome of E. coli strain IAI39 (accession no. NC_011750.1), which were used to infer a maximum likelihood phylogeny using IQ-Tree 2.1.4 (Minh et al., 2020) with a TVM model and 1,000 bootstrap iterations. The phylogeny was midpoint rooted and annotated with iTOL 6.6 (Letunic and Bork, 2019).
Additionally, a more specific SNP analysis was performed for each of the most important serogenotype-ST combinations identified in the present study, including isolates from both the present and previous studies (Do Nascimento et al., 2017;Ellis et al., 2020;Petro et al., 2020). The analysis was carried out by uploading the reads to CSI Phylogeny 1.4, available on the CGE website, with the following settings: a minimum depth of 10 at SNP positions, a minimum relative depth of 10% at SNP positions, a minimum distance of 10 bp between SNPs (prune), a minimum SNP quality of 30, a minimum read quality of 25, and a minimum Z-score of 1.96. According to KmerFinder 3.2 results, the published genomes of E. coli strains A41 (accession no. NZ_CP028735.1), ESBL 15 (accession no. NZ_CP041678.1), BR1220 (accession no. NZ_CP093068.1), and H3 (accession no. NZ_CP028732.1) were used as references for EAEC strains belonging to O126:H27-ST200, O111:H21-ST40, O92:H33-ST34, and O3:H2-ST10, respectively. The percentage of the reference genome covered by all isolates of the same serogenotype-ST combination ranged between 82.3 and 90.3%. From the aligned sequences of concatenated SNPs,

Statistical analysis
The sample size for determining the DEC prevalence in the prospective multicenter study was calculated using the website tool www.openepi.com, with a confidence level of 95%, a precision value of 3%, and an anticipated frequency of 9%. The case-control study was conducted with as many healthy controls fulfilling the inclusion criteria as possible. Proportions were compared by a two-tailed chi-square test or Fisher's exact test and odds ratios with 95% confidence intervals were determined. A p value <0.05 was considered statistically significant.
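For readers who wish to reproduce the sample-size reasoning, the sketch below applies the standard formula for estimating a proportion, n = z² p(1 − p) / d², without a finite-population correction. This is an assumption about how the calculation was configured in OpenEpi, not a description of its internals. The odds ratio helper uses the Woolf (log) method for the confidence interval, which is likewise an assumed choice, and the counts passed to it are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(p, precision, confidence=0.95):
    """n = z^2 * p * (1 - p) / d^2, rounded up (no finite-population correction)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z ** 2 * p * (1 - p) / precision ** 2)


def odds_ratio_ci(a, b, c, d, confidence=0.95):
    """Odds ratio with a Woolf (log-scale) confidence interval.

    a: positive cases, b: negative cases, c: positive controls, d: negative controls.
    All four counts must be non-zero for this approximation.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Parameters stated in the text: 95% confidence, 3% precision, 9% anticipated frequency.
print(sample_size_for_proportion(0.09, 0.03))

# Hypothetical 2x2 counts, for illustration only (not study data).
print(odds_ratio_ci(20, 80, 10, 90))
```

Under these assumptions the formula yields 350, which agrees with the minimum of 350 samples per hospital given in the study design.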
Regarding the acquired antibiotic resistance gene profiles, 40 (33.3%) of 120 isolates harbored genes conferring resistance to at least one antibiotic category and 28 (23.3%) harbored genes conferring resistance to three or more categories and were therefore considered multidrug resistant (MDR) based on the WGS prediction (Supplementary Table 2). There were no common resistance determinant profiles and the highest number of isolates that shared the same genotypic resistance determinant profile (aph(3″)-Ib/aph(6)-Id/sul2/blaTEM-1C) was seven. As for the presence of extended-spectrum beta-lactamases (ESBLs), conferring resistance to third-generation cephalosporins, the blaCTX-M-15 and blaCTX-M-14 genes were harbored by two isolates and one isolate, respectively (Supplementary Table 2).
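The genotypic MDR call described above amounts to counting distinct antibiotic categories among the detected resistance genes. The sketch below illustrates this with a deliberately small, simplified gene-to-category table covering only genes named in the text. The table and function names are assumptions for illustration; a real analysis would rely on the categories reported by ResFinder.

```python
# Simplified, illustrative mapping (one category per gene); not a complete table.
GENE_TO_CATEGORY = {
    "aph(3'')-Ib": "aminoglycosides",
    "aph(6)-Id": "aminoglycosides",
    "sul2": "folate pathway inhibitors",
    "blaTEM-1C": "penicillins",
    "blaCTX-M-15": "extended-spectrum cephalosporins",
    "blaCTX-M-14": "extended-spectrum cephalosporins",
}


def resistance_categories(genes):
    """Distinct antibiotic categories covered by the detected genes."""
    return {GENE_TO_CATEGORY[g] for g in genes if g in GENE_TO_CATEGORY}


def is_mdr(genes, min_categories=3):
    """Genotypic MDR: resistance genes spanning three or more categories."""
    return len(resistance_categories(genes)) >= min_categories


# Most frequently shared profile reported in the text.
profile = ["aph(3'')-Ib", "aph(6)-Id", "sul2", "blaTEM-1C"]
print(sorted(resistance_categories(profile)), is_mdr(profile))
```

Because the two aminoglycoside genes fall in the same category, this four-gene profile spans exactly three categories and is therefore classified as MDR.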

Discussion
We undertook a large prospective study of diarrheal disease at five hospitals located in different Spanish provinces widely distributed geographically, with the purpose of determining the role of EAEC among patients seeking medical care. The study demonstrated that EAEC is frequently detected among patients with diarrhea in Spain (7.8%), especially in children ≤5 years old, among which EAEC prevalence reached 9.8%. This finding corresponds well with previous studies demonstrating a remarkable predisposition to EAEC infection in children ≤5 years of age and suggesting that the prevalence and significance of EAEC infections depend on age (Pabst et al., 2003;Cohen et al., 2005). Although it is possible that some EAEC detected in this study are not pathogenic or represent colonization rather than infection, the presence of other more established bacterial enteric pathogens (e.g., Salmonella spp., Campylobacter spp.) was ruled out per study protocol, and other clinically relevant DEC pathotypes (STEC, ETEC, and EIEC) were co-detected in only 4.3% of cases of EAEC infection. Furthermore, 90% of our EAEC isolates met the new molecular definition of EAEC comprising E. coli strains harboring AggR and its adhesin dependent factors (AAF(I-V) or CS22), recently proposed by Boisen et al. (2020), and could thus be considered as true EAEC. Therefore, EAEC was the only bacterial enteric pathogen detected in a significant proportion of cases of diarrhea, none of which had a history of recent travel abroad, thus suggesting that EAEC is an important domestically acquired bacterium responsible for endemic diarrhea in Spain. This proportion of patients with non-travel related diarrhea who were demonstrated to be infected with EAEC was unexpectedly high, as EAEC prevalence is expected to be higher among travelers from industrialized countries visiting less-developed regions. 
However, our findings are supported by studies conducted in other high-income countries that also showed relatively frequent detection of EAEC among patients with diarrhea (Supplementary Table 3), with detection rates ranging from 1.9 to 5.9% in the general population (Wilson et al., 2001;Hardegen et al., 2010;Tam et al., 2012;Cybulski et al., 2018;Hebbelstrup Jensen et al., 2018) and up to 11.9% in children ≤5 years old (Pabst et al., 2003;Tobias et al., 2015). In particular, Pabst et al. (2003) and Cohen et al. (2005) detected EAEC in children ≤5 years with diarrhea significantly more frequently than in healthy children (11.9% vs. 2.2% and 9.2% vs. 3.3%, respectively), although the association of EAEC with diarrhea did not achieve statistical significance in our case-control study. Conversely, as expected, our detection rates are substantially lower than those generally reported in developing countries, where EAEC prevalence reaches up to 39% (Gonzalez et al., 1997;Okeke et al., 2000;Modgil et al., 2021;Manhique-Coutinho et al., 2022). Although there were 26 different serogenotypes among the 120 isolates, 82% were restricted to only nine serogenotypes, very homogeneous with respect to ST, with some particularly common serogenotype-ST combinations, such as O126:H27-ST200, O111:H21-ST40, and O92:H33-ST34, comprising 50% of isolates. This finding is in contrast to previous studies reporting a higher diversity in terms of serotyping and MLST among EAEC clinical isolates from the United States or the United Kingdom (Do Nascimento et al., 2017;Beczkiewicz et al., 2019), probably because such studies did not rule out travel-related infections.
These same serotypes are among the most frequently reported in EAEC strains from other high-income countries (Shazberg et al., 2003;Tobias et al., 2015;Hebbelstrup Jensen et al., 2016;Imuta et al., 2016;Do Nascimento et al., 2017;Beczkiewicz et al., 2019), and have even been linked to outbreaks of gastroenteritis (Yatsuyanagi et al., 2002;Harada et al., 2007;Scavia et al., 2008;Dallman et al., 2012), although they are rarely identified among EAEC strains from developing countries (Petro et al., 2020). Indeed, in the whole-genome phylogeny including 324 EAEC genomes from developing (Egypt, Kenya, and Peru) and high-income (Spain and the United Kingdom) countries (Figure 2), isolates of the most common serogenotype-ST combinations clustered together in independent groups consisting exclusively (O126:H27-ST200 and O111:H21-ST40) or almost exclusively (O92:H33-ST34) of isolates originating from high-income countries. Moreover, as revealed in the whole-genome phylogenies specific for each of these subtypes (Supplementary Figures 1-3), isolates from the United Kingdom were interleaved with those from Spain belonging to the same serogenotype-ST combination. As isolates in the present study were not travel-related, this suggests that O126:H27-ST200, O111:H21-ST40, and O92:H33-ST34 are the most important domestically acquired EAEC subtypes in Spain, and probably also in other high-income countries. In particular, O111:H21-ST40 strains have been recently proposed to have a higher intrinsic potential to cause diarrheal disease in the United Kingdom (Ellis et al., 2020). Nevertheless, we found O111:H21-ST40, and all the aforementioned common EAEC subtypes, both in isolates obtained from patients and in those from asymptomatic controls.
Fisher's exact test was applied when one of the observations was less than 5. A p value <0.05 was considered statistically significant. None of the differences observed between isolates from cases and controls was statistically significant. b Per study protocol, all EAEC isolates in this study possessed aatA, as was demonstrated by the diagnostic PCR assay, in spite of the lack of aatA detection by VirulenceFinder in one isolate. c Both AAF/I and AAF/V genes were detected by VirulenceFinder in one isolate. d Including major pilin subunits of types F16 (5 isolates), F7-2, and F13 (one isolate each).
Indeed, most of the isolates originating from asymptomatic carriage in the present study showed combinations of phylogroup, serogenotype, ST, virulence genes, and antibiotic resistance genes already found among clinical isolates (Supplementary Table 2). This similarity between isolates from cases and controls was also revealed in the O126:H27-ST200, O111:H21-ST40, and O92:H33-ST34 specific phylogenies (Supplementary Figures 1-3), in which isolates from controls were interleaved with those from cases belonging to the same serogenotype-ST combination. Although these findings could be influenced by the small number of isolates obtained from asymptomatic controls in this study (n = 10), they suggest that the same EAEC strains infected both patients and asymptomatic individuals. This is in contrast to the hypothesis that EAEC strains isolated from patients with diarrhea would belong to different subtypes and/or harbor putative virulence factors distinct from, or more commonly than, those isolated from asymptomatic controls (Boisen et al., 2012). The molecular characterization of isolates together with their origin and collection date could suggest possible unnoticed episodes of transmission of the most important domestically acquired EAEC subtypes (Supplementary Table 2), and this could be assessed from the whole-genome phylogenies specific for each of these subtypes. According to criteria proposed by Pightling et al. (2018) for interpreting WGS analyses of foodborne bacteria for outbreak investigations, monophyletic groups of E. coli isolates with a median pairwise distance of 20 or fewer SNPs, a bootstrap support of 90 or higher, and some epidemiological evidence support transmission episodes.
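The transmission criteria attributed to Pightling et al. (2018) can be expressed as a short check over a candidate cluster. The sketch below is a simplified illustration under stated assumptions: the cluster is assumed to have already been confirmed as monophyletic in the tree, the input is a set of equal-length concatenated SNP sequences, ambiguous bases are not treated specially, and all names and example sequences are hypothetical.

```python
from itertools import combinations
from statistics import median


def snp_distance(seq_a, seq_b):
    """Number of differing positions between two aligned SNP sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(seq_a, seq_b) if x != y)


def supports_transmission(aligned_snps, bootstrap, has_epi_evidence,
                          max_median_snps=20, min_bootstrap=90):
    """Check a monophyletic cluster against the three stated criteria.

    aligned_snps: dict mapping isolate name -> concatenated SNP sequence.
    bootstrap: bootstrap support (0-100) of the node defining the cluster.
    has_epi_evidence: True if some epidemiological link is documented.
    """
    sequences = list(aligned_snps.values())
    if len(sequences) < 2:
        return False  # a single isolate cannot form a transmission cluster
    distances = [snp_distance(a, b) for a, b in combinations(sequences, 2)]
    return (median(distances) <= max_median_snps
            and bootstrap >= min_bootstrap
            and has_epi_evidence)


# Toy cluster of three isolates (made-up sequences).
cluster = {"iso1": "ACGTACGT", "iso2": "ACGTACGA", "iso3": "ACGAACGA"}
print(supports_transmission(cluster, bootstrap=95, has_epi_evidence=True))
```

Using the median rather than the maximum pairwise distance makes the check tolerant of a single divergent isolate within an otherwise tight cluster.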
In this study, such analyses revealed four possible episodes of EAEC O126:H27-ST200 transmission involving 2-6 patients (Supplementary Figure 1)
AAF/V was the most prevalent variant in our collection, as previously reported in Denmark (Hebbelstrup Jensen et al., 2017), with 70% of our AAF/V-harboring isolates belonging to the predominant EAEC subtype O111:H21-ST40 and thus explaining its predominance in our setting. Likewise, 92% of our AAF/II-harboring isolates and 68% of our AAF/I-harboring isolates belonged to the predominant EAEC subtypes O126:H27-ST200 and O92:H33-ST34, respectively, also explaining their predominance in our setting. Of particular interest is the frequent detection of AAF/IV in our isolates, all of them lacking aar and harboring sepA, as such strains have been recently proposed to be more diarrheagenic than other EAEC (Petro et al., 2020). However, certain AAF/IV-harboring isolates in our study showed what appears to be a novel AAF/IV fimbrial cluster in which the minor pilin subunit gene agg4B has been replaced by afaD. In particular, this apparently new organization of AAF/IV was found in all O55:H21-ST4213 isolates (n = 5), all O44:H18-ST2959 isolates (n = 2), all O121:H27-ST1891 isolates (n = 2), and some O99:H4-ST10 isolates (n = 3). This finding was also confirmed in five O55:H21-ST4213 isolates and one O44:H18-ST2959 isolate originating from the United Kingdom (Do Nascimento et al., 2017;Supplementary Table 4), thus supporting the idea that the epidemiological scenario of endemic EAEC infections would be very similar in different industrialized countries. The identification and characterization of the genetic environment of this apparently novel AAF/IV fimbrial cluster warrants further investigation. Notably, one isolate was found to harbor the genes for both AAF/I and AAF/V, a phenomenon described previously only for AAF/III and AAF/V (Jønsson et al., 2017;Petro et al., 2020).
Again, the identification and characterization of the genetic environment of both AAF variants in this particular EAEC isolate warrants further investigation. One of the isolates without a known AAF variant harbored the cseA gene, indicative of the presence of the non-fimbrial ETEC colonization factor CS22 (Pichel et al., 2000), recently identified in strains lacking an identifiable AAF but harboring different putative EAEC virulence factors and being considered typical EAEC by genomic criteria (Petro et al., 2020). This cseA-positive isolate belonged to O9:H21-ST155, which has been recently identified among CS22-like harboring EAEC strains originating from Kenya (Petro et al., 2020) and Mozambique.
Apart from its role as an enteric pathogen, EAEC has emerged in recent years as a causative agent of urinary tract infection (UTI) and bacteremia (Boll et al., 2020;Mandomando et al., 2020). In particular, phylogroup A and AAF/I have been associated with uropathogenicity and AAF/V with bacteremia (Nunes et al., 2017;Mandomando et al., 2020). In the present study, 11 EAEC isolates were classified as presumptive ExPEC and four of them specifically belonged to serotype O3:H2, phylogroup A, and ST10 and harbored AAF/I. Additionally, we analyzed four previously sequenced EAEC O3:H2-ST10 genomes, including the EAEC reference strain 17-2, and three of them were also classified as presumptive ExPEC (Supplementary Table 5). It should be noted that EAEC O3:H2-ST10 isolates classified as ExPEC consistently harbored AAF/I (with the only exception of one isolate harboring AAF/V), whereas isolates not classified as ExPEC harbored AAF/III and clustered together, far from the ExPEC isolates harboring AAF/I and AAF/V, in the specific phylogeny (Supplementary Figure 4), thus suggesting the importance of AAF/I in extraintestinal EAEC infections. Although the EAEC prototype strain 17-2 had been previously proposed to present some ExPEC/UPEC characteristics (Gomes et al., 1995;Schüroff et al., 2021), to the best of our knowledge, this is the first study to reveal that phylogroup A EAEC O3:H2-ST10 harboring AAF/I is a clonal lineage of EAEC that could be specifically associated with extraintestinal infections. Furthermore, one of the two EAEC isolates classified as ExPEC/UPEC belonged to serotype O153:H4, phylogroup B2, and ST131 and harbored AAF/V and fimH27 (data not shown).
It was the only EAEC isolate that belonged to phylogroup B2 in our collection and clustered together with ExPEC/UPEC reference strains belonging to phylogroup B2 but far from the rest of diarrheagenic EAEC isolates and reference strains (Figure 2), thus supporting the idea that authentic enteric and urinary/systemic pathogens can be found among strains meeting the definition of EAEC. Indeed, the ST131 H27 sublineage is a novel subclone of E. coli ST131 that has acquired the EAEC diarrheagenic phenotype, spread across multiple continents, and caused multiple outbreaks of community-acquired bacteremia and recurrent UTIs (Boll et al., 2020;Mandomando et al., 2020).
Multidrug resistance, defined as resistance to at least three antibiotic categories (Magiorakos et al., 2012), is widespread among foodborne and waterborne enteric pathogens, including EAEC (Hebbelstrup Jensen et al., 2014;Do Nascimento et al., 2017;Beczkiewicz et al., 2019;Boisen et al., 2020). In our study, 23.3% of the EAEC isolates were considered MDR based on the WGS prediction, and the majority of them originated from children ≤5 years old. As expected, this level of MDR was much lower than that detected in previous studies conducted in developing countries, with MDR detection rates close to 80%. However, this finding is in contrast to previous studies also based on the WGS prediction of antibiotic resistance and conducted in high-income countries like the United Kingdom and the United States (Do Nascimento et al., 2017;Beczkiewicz et al., 2019), with MDR detection rates of 56.8 and 51.6%, respectively, again probably because such studies did not rule out travel-associated infections. Of special concern are the abundant ESBL production and the increased resistance to quinolones in EAEC strains (Herrera-León et al., 2015;Imuta et al., 2016;Guiral et al., 2019). In particular, the presence of CTX-M ESBL variants (blaCTX-M-15 and blaCTX-M-14 genes) was detected in only 2.5% of EAEC isolates obtained from cases of endemic diarrhea in Spain. Again, this finding is in contrast to the 20% detected among EAEC isolates from patients with gastrointestinal symptoms in the United Kingdom, probably because of the extremely high percentage of patients reporting travel abroad within 7 days of onset of symptoms in that study (Do Nascimento et al., 2017). While treatment of EAEC infection is not based on antibiotics in the majority of cases, as many EAEC infections are self-limited, evaluating antibiotic susceptibility is important in cases where antibiotic use is clinically indicated.
Our study had several strengths. It is the first Spanish study to explore the role of EAEC in endemic diarrhea and one of the largest studies conducted in an industrialized country to date. As samples were collected from five provinces widely distributed geographically, our results might be representative of the whole country. Unlike most previous studies, we ruled out travel-related diarrheal episodes and those in which other bacterial pathogens were present. We generated one of the most complete characterizations of EAEC strains associated with illness acquired in industrialized countries to date. However, it also had several limitations. Detection of EAEC was based on PCR amplification of a well-known EAEC target, but functional testing using the HEp-2 adherence assay was not performed. Although the adherence test remains the "gold standard" for diagnosing EAEC infection, it is resource intensive and requires strict adherence to protocol and specialized facilities. This issue could have led to an underestimation of EAEC prevalence in our study and could be especially significant for isolates negative for both AAF and CS22 not meeting the new molecular definition of EAEC. It was not possible to elucidate the exact etiology of the disease outcome, as samples were not tested for the presence of Clostridioides difficile toxins, parasites, or viruses. The small number of control subjects fulfilling the inclusion criteria may have compromised the accuracy of some of our results. We did not collect comprehensive data on symptoms, treatments, outcomes, or risk factors. Finally, phenotypic resistance profile information was not available as conventional antimicrobial susceptibility testing was not performed.
Figure 2. Phylogenomic analysis of the enteroaggregative Escherichia coli genomes. The whole-genome phylogeny was constructed from 690,639 conserved SNP sites per genome that were identified by comparison against the reference genome of the E. coli strain IAI39 (GenBank accession no. NC_011750.1). The isolates from Spain (this study) are colored in red, those from Egypt, Kenya, or Peru are colored in purple, and those from the United Kingdom are not colored. EAEC and ExPEC reference genomes are colored in gray and yellow, respectively. Isolates obtained from asymptomatic controls are in bold and indicated by a star in the outer ring of labels. The most important serogenotype-ST combinations identified in this study are highlighted in green. The E. coli phylogroups are designated by letters (A, B1, B2, and D) on the interior of the phylogeny based on the inclusion of both EAEC isolates sequenced in this study and reference strains. The tree scale indicates the distance of 0.01 nucleotide changes per site.

Concluding remarks
EAEC was the only bacterial enteric pathogen detected in a significant proportion of cases of endemic diarrhea in Spain, especially in children ≤5 years old. In particular, O126:H27-ST200, O111:H21-ST40, and O92:H33-ST34 were the most important subtypes, and all of them were found in both patients and asymptomatic individuals. A subset of these domestically acquired EAEC strains was additionally classified as ExPEC, UPEC, or both, and belonged to clonal lineages that could be specifically associated with extraintestinal infections, thus revealing an additional urinary/systemic pathogenic potential. These data highlight the value of routinely testing for EAEC, especially in children ≤5 years old with diarrheal disease and in patients in whom no other pathogen can be identified.

Data availability statement
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary material.

Ethics statement
The studies involving human participants were reviewed and approved by Comisión Central de Investigación, Gerencia Asistencial de Atención Primaria, Servicio Madrileño de Salud. Written informed consent to participate in this study was provided by the participants' legal guardian/next of kin.

Author contributions
SS and RE conceived and designed the study. Material preparation, data collection and analysis were performed by MR, RM-R, FG-S, MF, ME, IO, RR, and ML. The first draft of the manuscript was written by SS and all authors commented on previous versions of the manuscript. All authors contributed to the article and approved the submitted version.