Genotyping Reveals High Clonal Diversity and Widespread Genotypes of Candida Causing Candidemia at Distant Geographical Areas

The objectives of this study were to gain further insight on Candida genotype distribution and percentage of clustered isolates between hospitals and to identify potential clusters involving different hospitals and cities. We aim to genotype Candida spp. isolates causing candidemia in patients admitted to 16 hospitals in Spain, Italy, Denmark, and Brazil. Eight hundred and eighty-four isolates (Candida albicans, n = 534; C. parapsilosis, n = 282; and C. tropicalis, n = 68) were genotyped using species-specific microsatellite markers. CDC3, EF3, HIS3, CAI, CAIII, and CAVI were used for C. albicans, Ctrm1, Ctrm10, Ctrm12, Ctrm21, Ctrm24, and Ctrm28 for C. tropicalis, and CP1, CP4a, CP6, and B for C. parapsilosis. Genotypes were classified as singletons (genotype only found once) or clusters (same genotype infecting two or more patients). Clusters were defined as intra-hospital (involving patients admitted to a single hospital), intra-ward (involving patients admitted to the same hospital ward) or widespread (involving patients admitted to different hospitals). The percentage of clusters and the proportion of patients involved in clusters among species, genotypic diversity and distribution of genetic diversity were assessed. Seven hundred and twenty-three genotypes were detected, 78 (11%) being clusters, most of which (57.7%; n = 45/78) were intra-hospital clusters including intra-ward ones (42.2%; n = 19/45). The proportion of clusters was not statistically different between species, but the percentage of patients in clusters varied among hospitals. A number of genotypes (7.2%; 52/723) were widespread (found at different hospitals), comprising 66.7% (52/78) of clusters, and involved patients at hospitals in the same city (n = 21) or in different cities (n = 31). Only one C. parapsilosis cluster was a widespread genotype found in all four countries. Around 11% of C. albicans and C. parapsilosis isolates causing candidemia are clusters that may result from patient-to-patient transmission, widespread genotypes commonly found in unrelated patients, or insufficient microsatellite typing genetic discrimination.

INTRODUCTION
Among invasive fungal infections, candidemia is the most common condition in hospitalized patients, representing an overall incidence rate of 3.88 cases per 100,000 hospital admissions (Koehler et al., 2019). Although Candida albicans remains the leading etiological agent of candidemia, infections with other species, including Candida parapsilosis, Candida tropicalis, and Candida glabrata, are on the rise (Lamoth et al., 2018; Pappas et al., 2018). Prevalence of candidemia in high-risk patients such as critical care patients, oncology patients, and solid organ transplant recipients ranges from 2 to 11% (Ostrosky-Zeichner et al., 2007; Eggimann et al., 2011). Candidemia is associated with high mortality and high inpatient cost (Wan Ismail et al., 2019). Attributable mortality ranges from 11 up to 47% (Bilir et al., 2015).
Candidemia is a nosocomial infection that affects patients with diverse underlying conditions and is often caused by the use of intravascular catheters or translocation of endogenous isolates from the gut microbiota to the bloodstream (Puig-Asensio et al., 2014). Exogenous patient-to-patient transmission may represent an alternative route of infection. Genotyping may help clarify potential Candida outbreaks and transmission in hospitalized patients (Escribano et al., 2013, 2018) and clarify whether certain genotypes occur in different patients (namely, clusters), potentially suggesting a common isolate niche (Escribano et al., 2013; Hammarskjold et al., 2013).
We have previously shown different percentages of clustered isolates in Spanish and Italian hospitals, suggesting dissimilar infection control policies (Marcos-Zambrano et al., 2015). For example, campaigns to reduce the number of catheter-related infections led to a decrease in the number of clusters (Escribano et al., 2013, 2018). However, some clusters involved patients who were either admitted to the same hospital but without a clear epidemiological link (Escribano et al., 2013, 2018) or admitted to different hospitals that on occasion were located in different countries (Marcos-Zambrano et al., 2015). Such "epidemiologically unexplainable" clusters may indicate a limited discriminatory potential of the typing method or common widespread clones rather than active patient-to-patient transmission.
The aim of this study is to genotype Candida spp. isolates causing candidemia in subjects admitted to 16 hospitals in Spain, Italy, Denmark, and Brazil with high or very low likelihood of having been in contact. We provide insight into the genotypes and percentages of clustered isolates between these hospitals, and identify clusters involving different hospitals and cities.

Participating Hospitals and Clinical Isolates
Sixteen tertiary hospitals located in Spain (Madrid, n = 4; Valencia, n = 1), Italy (Rome, n = 2), Brazil (São Paulo, n = 1), and Denmark (three clinical microbiological laboratories that together served 8 university hospitals in Copenhagen) participated in this multicentre study. Isolates were collected from consecutive episodes of candidemia (one isolate per patient; n = 884) caused by C. albicans (n = 534), C. parapsilosis (n = 282), or C. tropicalis (n = 68) diagnosed in patients admitted to the above-mentioned hospitals between January 2014 and December 2015. Isolates were identified by MALDI-TOF mass spectrometry or molecular methods (White et al., 1990; Marklein et al., 2009; De Carolis et al., 2014; Normand et al., 2019). The number of admissions, incidence and number of episodes per species during the study period are shown in Table 1. All participating hospitals have active taskforces, including microbiologists, infectious disease specialists and nurses, to monitor and control infections, and have implemented antimicrobial stewardship programs or so-called zero bacteraemia programs. Details of such campaigns at each hospital were not available.
Capillary electrophoresis using the ABI 3130xl analyser (Applied Biosystems-Life Technologies Corporation, Carlsbad, California, USA) and the GeneScan ROX 500 bp marker (Applied Biosystems-Life Technologies Corporation, Carlsbad, California, USA) was performed on the PCR products. Electropherograms were analyzed with the GeneMapper v.4.0 software (Applied Biosystems-Life Technologies Corporation, California). A control strain from each species was used in each run to ensure size accuracy and avoid run-to-run variations. The number of base pairs determined the size of alleles in each locus.

Genotype and Cluster Analysis
Allele results were converted to binary data by scoring the presence or absence of each allele. Data were treated as categorical, and the genetic relationship between genotypes was examined by constructing a minimum spanning tree (BioNumerics version 6.6, Applied Maths, Sint-Martens-Latem, Belgium). Isolates were considered to have identical genotypes when they showed the same alleles in all loci. Different genotypes were codified as follows: CA-X (C. albicans), CP-X (C. parapsilosis), and CT-X (C. tropicalis), X representing the internal code of the genotype in our collection. Definitions of singleton genotypes and clusters are summarized in Table 2.
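The classification rules can be illustrated with a minimal Python sketch. It assumes one record per patient with hypothetical `genotype`, `hospital`, and `ward` fields, and it treats the cluster labels as non-exclusive tags, since a widespread genotype may also form an intra-hospital cluster. The genotype codes and hospital names below are invented for illustration and are not taken from the study data.

```python
from collections import Counter, defaultdict


def classify_genotypes(isolates):
    """Tag each genotype as a singleton or as one or more cluster types.

    isolates: list of dicts with keys 'genotype', 'hospital', 'ward'
    (one isolate per patient). Field names are illustrative assumptions.
    """
    by_genotype = defaultdict(list)
    for iso in isolates:
        by_genotype[iso["genotype"]].append(iso)

    labels = {}
    for genotype, members in by_genotype.items():
        if len(members) < 2:
            labels[genotype] = {"singleton"}
            continue
        tags = {"cluster"}
        # Patients counted per hospital and per (hospital, ward) pair.
        per_hospital = Counter(m["hospital"] for m in members)
        per_ward = Counter(
            (m["hospital"], m["ward"]) for m in members if m["ward"] is not None
        )
        if len(per_hospital) > 1:
            tags.add("widespread")
        if any(n >= 2 for n in per_hospital.values()):
            tags.add("intra-hospital")
        if any(n >= 2 for n in per_ward.values()):
            tags.add("intra-ward")
        labels[genotype] = tags
    return labels


# Hypothetical example records.
example = [
    {"genotype": "CA-001", "hospital": "H1", "ward": "ICU"},
    {"genotype": "CA-001", "hospital": "H1", "ward": "ICU"},
    {"genotype": "CA-002", "hospital": "H1", "ward": "ICU"},
    {"genotype": "CP-001", "hospital": "H1", "ward": "ICU"},
    {"genotype": "CP-001", "hospital": "H2", "ward": "Surgery"},
    {"genotype": "CP-001", "hospital": "H2", "ward": "Oncology"},
]
for code, tags in sorted(classify_genotypes(example).items()):
    print(code, sorted(tags))
```

In this sketch an intra-ward cluster is also tagged intra-hospital, which mirrors the way intra-ward clusters are reported as a subset of intra-hospital clusters.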
We compared the percentage of clusters and the proportion of patients presenting clusters among species and hospitals using a standard binomial method (95% confidence intervals) (Epidat 3.1 software, Servicio de Información sobre Saúde Pública de la Dirección Xeral de Saúde Pública de la Consellería de Sanidade, Xunta de Galicia, Spain).
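As an illustration of a 95% confidence interval for a binomial proportion, the sketch below uses the Wilson score interval. This is an assumption made for the example: the exact interval method implemented in Epidat 3.1 is not stated here, and a different method would give slightly different bounds. The counts in the usage line (78 clusters among 723 genotypes) are the overall figures reported in this study.

```python
import math


def wilson_ci(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 for 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half_width = (
        z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    )
    return centre - half_width, centre + half_width


# Proportion of genotypes that are clusters: 78 of 723.
low, high = wilson_ci(78, 723)
print(f"{78 / 723:.3f} (95% CI {low:.3f} to {high:.3f})")
```

Two proportions whose intervals do not overlap can be regarded as different at roughly this confidence level, which is the spirit of the comparison described above.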

Genotypic Diversity Analysis
We assessed the following diversity parameters: (a) number of alleles per locus; (b) observed (direct count) heterozygosity (Ho), calculated as the number of heterozygous genotypes over the total number of genotypes analyzed for each locus; (c) expected heterozygosity (He), calculated as $H_e = 1 - \sum_i p_i^2$, where $p_i$ represents the frequency of the $i$th allele (Nei, 1973); (d) Wright's fixation index, $F = 1 - (H_o/H_e)$, which shows the relationship between Ho and He and detects heterozygote excess or deficiency (Lenardon and Nantel, 2012); and (e) the probability of identity among unrelated individuals, $PI = \sum_i p_i^4 + \sum_i \sum_{j>i} (2 p_i p_j)^2$, where $p_i$ and $p_j$ represent the frequencies of the $i$th and $j$th alleles, respectively, which measures the likelihood that two randomly drawn diploid genotypes will be identical, assuming the observed allele frequencies and random assortment (Paetkau et al., 1995). Significant deviations from the Hardy-Weinberg equilibrium (HWE) at individual loci were tested using the Markov chain method to determine if a population was clonal. The IDENTITY 1.0 (Wagner et al., 1999) and ARLEQUIN version 3.01 (Excoffier et al., 2005) programs were used for the analyses.
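The per-locus parameters can be sketched in a few lines of Python. This is a simplified illustration and not the IDENTITY or ARLEQUIN implementation: it uses uncorrected allele frequencies, takes the probability of identity in the form of Paetkau et al. (1995), that is, the summed probabilities that two random diploid genotypes are the same homozygote or the same heterozygote, and combines loci by multiplication under an assumption of independence. The allele sizes in the example are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def locus_diversity(genotypes):
    """Diversity parameters for one locus.

    genotypes: list of (allele1, allele2) tuples, one per diploid isolate.
    """
    n = len(genotypes)
    counts = Counter(allele for pair in genotypes for allele in pair)
    freqs = [c / (2 * n) for c in counts.values()]

    ho = sum(1 for a, b in genotypes if a != b) / n      # observed heterozygosity
    he = 1 - sum(p ** 2 for p in freqs)                  # expected heterozygosity
    f = 1 - ho / he if he > 0 else float("nan")          # Wright's fixation index
    pi = sum(p ** 4 for p in freqs) + sum(
        (2 * p * q) ** 2 for p, q in combinations(freqs, 2)
    )                                                    # probability of identity
    return {"alleles": len(counts), "Ho": ho, "He": he, "F": f, "PI": pi}


def combined_pi(per_locus_pi):
    """Multi-locus probability of identity, assuming independent loci."""
    return math.prod(per_locus_pi)


# Hypothetical allele sizes (bp) at a single locus for four isolates.
locus = [(100, 100), (100, 102), (102, 102), (100, 102)]
print(locus_diversity(locus))
```

A positive F from this function corresponds to heterozygote deficiency and a negative F to heterozygote excess.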

Distribution Analysis of Genetic Diversity
Analysis of molecular variance (AMOVA) was performed to determine the distribution of genetic diversity based on the number of alleles with 10,000 permutations (ARLEQUIN version 3.01) (Excoffier et al., 2005). AMOVA (a term and model inspired by ANOVA but adapted to molecular data) is a statistical model that estimates population differentiation directly from molecular data and tests hypotheses about such differentiation within a single species. Different hierarchical levels (populations) were established among countries and between singleton genotypes and clusters (overall and clusters grouped by countries). F statistics for each hierarchical level were computed. Pairwise FST values obtained from AMOVA were used to measure the genetic differentiation between populations, and their significance (P < 0.05) was assessed using a non-parametric permutation approach (Excoffier et al., 2005). Pairwise FST values between countries were represented by Principal Coordinate Analysis (PCoA) with the GENEALEX software (Peakall and Smouse, 2012). PCoA allows exploring and visualizing data similarities or dissimilarities.
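To make the permutation logic concrete, the following Python sketch computes a one-level AMOVA-style differentiation statistic from pairwise distances and assesses it by shuffling population labels. It is a deliberately simplified illustration, not ARLEQUIN's implementation: the distance (number of markers at which two profiles differ), the function names, and the default of 999 permutations are assumptions chosen for the example, whereas the analysis above used 10,000 permutations.

```python
import random


def marker_differences(a, b):
    """Number of markers at which two profiles differ (used as squared distance)."""
    return sum(1 for x, y in zip(a, b) if x != y)


def phi_st(profiles, pops, dist=marker_differences):
    """Among-population share of variance from a one-level AMOVA-style partition."""
    groups = {}
    for idx, pop in enumerate(pops):
        groups.setdefault(pop, []).append(idx)
    n_total, n_pops = len(profiles), len(groups)

    def ssd(indices):
        # Sum of pairwise distances within a set, divided by the set size.
        total = sum(
            dist(profiles[i], profiles[j])
            for pos, i in enumerate(indices)
            for j in indices[pos + 1:]
        )
        return total / len(indices)

    ssd_within = sum(ssd(members) for members in groups.values())
    ssd_among = ssd(list(range(n_total))) - ssd_within
    msd_among = ssd_among / (n_pops - 1)
    msd_within = ssd_within / (n_total - n_pops)
    n0 = (n_total - sum(len(m) ** 2 for m in groups.values()) / n_total) / (n_pops - 1)
    var_among = (msd_among - msd_within) / n0
    denom = var_among + msd_within
    return var_among / denom if denom else 0.0


def permutation_p_value(profiles, pops, n_perm=999, seed=0):
    """Share of label shufflings giving a statistic at least as large as observed."""
    rng = random.Random(seed)
    observed = phi_st(profiles, pops)
    shuffled = list(pops)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if phi_st(profiles, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical two-marker profiles from two populations.
profiles = [(1, 1), (1, 1), (1, 1), (2, 2), (2, 2), (2, 2)]
pops = ["A", "A", "A", "B", "B", "B"]
print(permutation_p_value(profiles, pops))
```

The statistic can be negative when populations are no more different than expected by chance; such values are usually read as no differentiation.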
A number of genotypes (7.2%; 52/723) were widespread (C. albicans, n = 32/453; C. parapsilosis, n = 18/206; C. tropicalis, n = 2/64) and were occasionally found as intra-hospital clusters at some hospitals (Table 4). Overall, widespread genotypes comprised 52 of the 78 detected clusters and involved patients at hospitals in the same (n = 21) or in different cities (n = 31) (Figure 2); C. tropicalis widespread genotypes (n = 2) were found exclusively in two hospitals in Copenhagen. A number of widespread C. albicans clusters involved 2-9 patients each and five clonal complexes were found. Interestingly, at least one genotype fulfilling the definition of widespread and intra-hospital cluster was detected in four of the five clonal complexes (Figure 2A). Clonal complex number 5 included the highest number of clusters (n = 5) and patients (n = 25). It is worth noting that four out of the five clusters were widespread and intra-hospital clusters and many of them were found in Fondazione Policlinico A. Gemelli hospital. As for C. parapsilosis widespread genotypes, we found some clusters affecting several patients each (2-12 patients), whereas the number of clonal complexes was lower than that found for C. albicans. Again, at least one genotype fulfilling the definition of widespread and intra-hospital cluster was detected in each clonal complex (Figure 2B). Clonal complex number 2 was particularly significant due to the high number of genotypes (n = 4) and patients (n = 28) involved. Three out of the four clusters were widespread and intra-hospital clusters in different hospitals (Table 4). Only one C. parapsilosis cluster was a widespread genotype found in the four countries.
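The grouping of genotypes into clonal complexes can be sketched as follows, taking the convention of the minimum spanning tree in Figure 2, where genotypes differing in only 1 marker are linked into the same complex. The sketch links every pair of genotypes that differ at a single marker and returns the resulting connected groups. It is an illustration under that assumption rather than the BioNumerics procedure, and the genotype codes and allele sizes are hypothetical.

```python
def marker_differences(a, b):
    """Number of markers at which two genotype profiles differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def clonal_complexes(profiles, max_diff=1):
    """Group genotypes connected by chains of links with <= max_diff markers.

    profiles: dict mapping genotype code -> tuple of per-marker allele pairs.
    Returns a list of sets, one per complex with at least two genotypes.
    """
    names = list(profiles)
    parent = {name: name for name in names}

    def find(name):
        # Union-find root lookup with path compression.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for pos, a in enumerate(names):
        for b in names[pos + 1:]:
            if marker_differences(profiles[a], profiles[b]) <= max_diff:
                parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), set()).add(name)
    return [members for members in groups.values() if len(members) > 1]


# Hypothetical four-marker profiles (allele sizes in bp).
profiles = {
    "CP-A": ((240, 240), (300, 310), (270, 270), (130, 130)),
    "CP-B": ((240, 243), (300, 310), (270, 270), (130, 130)),
    "CP-C": ((240, 243), (300, 300), (270, 270), (130, 130)),
    "CP-D": ((250, 250), (320, 320), (280, 280), (130, 130)),
}
print(clonal_complexes(profiles))
```

Here CP-A and CP-C differ at two markers but still fall into the same complex because each differs from CP-B at one marker, which is how chains of single-marker variants form a complex.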

Genetic Diversity and Analysis of Molecular Variance
Diversity parameters are shown in Table S2. The microsatellite markers used appeared to be highly discriminatory since the probability of identity for C. albicans, C. parapsilosis and C. tropicalis was very low (7.7 × 10⁻¹⁰, 1.2 × 10⁻⁶, and 4 × 10⁻⁸, respectively). Probability of identity values close to zero indicate high discriminative power of the markers used. In this work, total diversity of C. albicans isolates was elevated as shown by the high number of alleles found (N = 195; mean of 32.5 alleles/marker), the low frequency of null alleles (Na = 0.08), and the observed and expected heterozygosity (Ho = 0.67 and He = 0.84, respectively). In contrast, the diversity of C. parapsilosis and C. tropicalis isolates was lower.
Wright's index (F) indicates a deficiency (positive values) or excess (negative values) of heterozygosity. Heterozygote deficiency
TABLE 3 | Number of singleton genotypes and intra-hospital/intra-ward clusters found in each participating hospital.
Allele frequencies of all loci differed significantly (P < 0.05) from what was expected in a population in Hardy-Weinberg equilibrium for the three species. This suggests clonal expansion due to migrations between countries/hospitals, although we should not exclude other factors including genetic drift, natural selection, sexual selection, mutation, gene flow, meiotic drive, genetic hitchhiking, population bottleneck, founder effect and inbreeding. Diversity parameters were assessed by country with no significant differences (Table S2).
The AMOVA analysis per species showed that most estimated variations were found within the strains of a given population; however, pooling isolates by geographical regions showed that a small but significant proportion of variations may be attributed to differences among countries (3% C. albicans, 2.22% C. parapsilosis, and 5.63% C. tropicalis, P < 0.001) (Table 5). Likewise, this analysis revealed differences between singleton genotypes and clusters (1.5% C. albicans, 1% C. parapsilosis, and 5.96% C. tropicalis, P < 0.05) (Table 6).
FST measures the genetic differentiation between populations, with lower values indicating scarce differentiation. C. tropicalis showed the highest differentiation among countries (FST = 0.05, P < 0.001) and between singletons and clusters (FST = 0.05, P < 0.05). Principal coordinate representation of FST values for the three species shows low but significant differentiation between countries, with isolates from Spain and Italy being more closely related (Figure 3).

DISCUSSION
The genotyping analysis performed in this study reveals widespread Candida genotypes causing candidemia at distant geographic areas.
Previous studies suggest that Candida genotyping is useful to study hospital infection outbreaks under the umbrella of clinical suspicion (Diekema et al., 1997; Kuhn et al., 2004; Lasheras et al., 2007). Conversely, blind genotyping of consecutive candidemia isolates unveiled lurking genotypes actively causing infections at some wards (Escribano et al., 2013) where implementation of campaigns to prevent catheter-related infections had a positive impact (Escribano et al., 2018). Moreover, differences in the number of clusters among hospitals may indicate dissimilar infection control policies with potential room for improvement (Marcos-Zambrano et al., 2015). Results from our previous abovementioned studies led us to conclude that the higher the incidence of candidemia in a given period of time, the higher the percentage of clusters. Additionally, some clusters involve epidemiologically linked patients (same ward within 12 months), suggesting either nosocomial transmission between patients or outbreaks, whereas other clusters involve patients who are apparently epidemiologically unrelated. The presence of these unexplainable clusters led us to coin the term "widespread genotypes," making reference to patients admitted to different hospitals. Here, we enlarge the number of isolates and participating hospitals located at various geographic regions that would resemble differences in hospital management and variable candidemia incidence. The prevalence of all three Candida species is consistent with previously reported candidemia epidemiology (Colombo et al., 2017). Even though most patients in this study were infected by singleton genotypes (89%), clusters were detected in around 11% of them.
Cluster code in bold indicates intra-ward cluster and the hospital at which they were found; genotypes CP-023 and CP-065 are widespread clusters and also intra-ward clusters in Fondazione A. Policlinico Gemelli and Puerta de Hierro hospitals, respectively.
Notably, the proportion of patients with clusters and incidence of candidemia differed between hospitals, with the lowest incidence found in Denmark and the highest in Brazil, which may indicate distinct campaigns to control hospital infections, adherence to infection control guidelines, type of population at risk of acquiring candidemia in each hospital, and/or differences in patient length of stay at the health care center. Candidemia incidence does not necessarily match the number/proportion of clusters. C. tropicalis and C. albicans are, respectively, the species with the lowest and highest number of clusters, with C. parapsilosis (n = 17) presenting an intermediate number.
When intra-ward clusters are analyzed, C. parapsilosis clusters outnumber those of C. albicans (n = 10 vs. n = 7). This indicates that C. parapsilosis clusters withstand a more stringent definition of cluster and may reflect patient-to-patient transmission. Candidemia-related outbreaks by C. parapsilosis are relatively common (Diab-Elschahawi et al., 2012; Singh et al., 2019; Toth et al., 2019).
The genetic definition of cluster may be misleading when trying to disentangle hospital patient-to-patient transmission, particularly in apparently unrelated patients. When epidemiological information is considered, clusters may be tagged differently. The most restrictive definition, that of intra-ward cluster, requires that the affected patients be admitted to the same hospital ward, thereby increasing the likelihood of patient-to-patient transmission. In contrast, intra-hospital clusters may indicate the presence of an endemic genotype in the hospital that infects patients admitted to different wards. An intra-hospital cluster not meeting the definition of intra-ward cluster is an enigma and may result from a lack of microsatellite discriminatory power. However, there are other possible explanations. First, given the retrospective nature of most studies, patients may have acquired the infection somewhere other than the niche of the genotype, as shown for Clostridioides difficile (Tarrant et al., 2018). Second, some genotypes may be more frequent and better adapted than others, persisting in hospital facilities for a longer time. Finally, we speculate that the higher the number of patients colonized by certain genotypes, the higher the chances of finding those genotypes causing candidemia. Furthermore, the presence of widespread genotypes is even more difficult to interpret and we cannot rule out that certain clones may have spread worldwide. Some genotypes can be found in different countries, as reported for Candida glabrata (Al-Yasiri et al., 2016; Adams et al., 2018) or more recently for C. auris (de Groot et al., 2020).
C. auris was first detected in 2009 and since then different local clades have been identified. These clades are thought to be diverging locally over time. Coexistence of several geographic clades has been identified at hospitals admitting patients from separate endemic areas of C. auris infections (Borman et al., 2017; de Groot et al., 2020). In this scenario, similar clones infecting different patients may not be an indication of patient-to-patient transmission but of previous colonization by the specific clade even before the admission of the patient to the hospital. Migration from endemic areas may facilitate clone dissemination (Al-Yasiri et al., 2016; Carrete et al., 2018; de Groot et al., 2020). As C. albicans and C. parapsilosis have had more time to evolve than C. auris, the number of genotypes found for these two species is higher than for C. auris. Widespread genotypes may resemble what is seen for the different C. auris lineages. That is, widespread genotypes resemble clonal genotypes that have had a longer time to spread worldwide by colonizing many individuals; de Groot et al. (2020) demonstrated the presence of identical C. auris genotypes in different countries.
The lack of geographic-specific genotypes shown by the AMOVA analysis points to a similar genetic background and a common clonal ancestor with an effective spread pattern. C. albicans and C. parapsilosis populations are not in Hardy-Weinberg equilibrium, suggesting that the genetic background in these species is mostly clonal or followed by selection pressure by interaction with the host or the environment (Sabino et al., 2010). Other markers, such as multilocus sequence typing, have previously suggested a clonal origin of C. albicans isolates (Tavanti et al., 2005).
There is no neat genotype geographic-specific distribution. However, genotypes from Italy and Spain are more similar to each other than to those from the other countries. This observation, which in turn may suggest a more frequent dissemination of genotypes between two neighboring countries, may be a consequence of the large number of isolates from these two countries.
FIGURE 2 | Minimum spanning tree showing the C. albicans (A) and C. parapsilosis (B) widespread genotypes. Circles represent different genotypes and circle size the number of isolates belonging to the same genotype. Colors indicate the country from which the isolate was obtained. Connecting lines between the circles show profile similarities; solid bold lines indicate differences in only 1 marker (clonal complexes are shadowed in gray), the solid line indicates differences in 2 markers, long dashes in dashed line indicate differences in 3 markers, and short dashes indicate differences in 4 or more markers. Genotypes in bold indicate widespread genotypes and intra-hospital clusters. Denmark (red), Spain (purple), Italy (yellow), and Brazil (green).
*AMOVA values measure the genetic differentiation between populations. Numbers shown in bold indicate statistically significant differences (P < 0.05).
FIGURE 3 | Principal coordinates analysis showing the genetic differentiation of C. albicans (A), C. parapsilosis (B), and C. tropicalis (C) among isolates grouped by country. Symbols represent the variability found within each country; symbols in the same square are more likely to be related among them rather than to those located in other squares.
Our study has some limitations. First, we cannot rule out that some clusters may be a consequence of insufficient microsatellite typing discriminatory power (although diversity parameters do not support this) due to events of genomic recombination or partial chromosomal aneuploidy (Hirakawa et al., 2015). Whole genome sequencing of isolates of the largest clonal complexes is justified. Second, we missed some isolates causing candidemia during the study period, which prevented us from confirming all potential clusters. Third, we had an asymmetric collection of isolates as some countries contributed more isolates than others. Fourth, we did not genotype environmental isolates that may be prevalent in the hospitals or assess their frequency in the wards of hospitalization before the study period.
Fifth, culture-negative cases of invasive candidiasis may lead to underestimation of the number of clusters. Finally, clinical data were not collected, preventing us from deciphering whether clusters could be associated with specific clinical conditions or whether they represented patient-to-patient transmission.
In conclusion, we show that around 11% of C. albicans and C. parapsilosis isolates causing candidemia belong to clusters originating from either patient-to-patient transmission, widespread genotypes commonly found in unrelated patients, or lack of microsatellite genetic discrimination. Further studies using whole genome sequencing will help decipher the usefulness of microsatellite markers to conduct epidemiological studies and refine the definition of cluster.

DATA AVAILABILITY STATEMENT
All relevant data are contained within the article. The original contributions presented in the study are included in the article/Supplementary Files; further inquiries can be directed to the corresponding authors.

ETHICS STATEMENT
This study was approved by the Ethics Committee of Hospital Gregorio Marañón (CEIC-A1; study no. 108/16).

FUNDING
This work was supported by grants CP15/00115 and PI16/01012 from the Fondo de Investigación Sanitaria (FIS. Instituto de Salud Carlos III; Plan Nacional de I+D+I 2013-2016). This study was co-funded by the European Regional Development Fund (FEDER) "A way of making Europe." In Brazil, this work was supported by grant 2017/02203-7 from Fundação de Amparo a Pesquisa-São Paulo (FAPESP). The funders had no role in the study design, data collection, analysis, decision to publish, or preparation/content of the manuscript. PE (MSI15/00115) and JG (MSII15/00006) are recipients of a Miguel Servet contract supported by the FIS.

ACKNOWLEDGMENTS
The authors are grateful to Ciencia Traducida for editing and proofreading assistance.

SUPPLEMENTARY MATERIAL
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fcimb.2020.00166/full#supplementary-material
Table S1 | Genotype code, type of genotype, and microsatellite allelic composition in base pairs.
Table S2 | Diversity parameters calculated for each country as mean values.
a Observed and expected heterozygosity ranged from 0 (no heterozygosity) to 1 (highest heterozygosity).
b Wright's index indicates a deficiency of heterozygosity (positive values) or excess heterozygosity (negative values).
c The probability of identity indicates the likelihood of finding two identical genotypes after randomly selecting two isolates.