Understanding the Emergence of Multidrug-Resistant Candida: Using Whole-Genome Sequencing to Describe the Population Structure of Candida haemulonii Species Complex

The recent emergence of a multidrug-resistant yeast, Candida auris, has drawn attention to the closely related species from the Candida haemulonii complex that include C. haemulonii, Candida duobushaemulonii, Candida pseudohaemulonii, and the recently identified Candida vulturna. Here, we used antifungal susceptibility testing and whole-genome sequencing (WGS) to investigate drug resistance and genetic diversity among isolates of C. haemulonii complex from different geographic areas in order to assess population structure and the extent of clonality among strains. Although most isolates of all four species were genetically distinct, we detected evidence of the in-hospital transmission of C. haemulonii and C. duobushaemulonii in one hospital in Panama, indicating that these species are also capable of causing outbreaks in healthcare settings. We also detected evidence of the rising azole resistance among isolates of C. haemulonii and C. duobushaemulonii in Colombia, Panama, and Venezuela linked to substitutions in ERG11 gene as well as amplification of this gene in C. haemulonii in isolates in Colombia suggesting the presence of evolutionary pressure for developing azole resistance in this region. Our results demonstrate that these species need to be monitored as possible causes of outbreaks of invasive infection.


INTRODUCTION
Yeasts from Candida haemulonii complex that include C. haemulonii, Candida duobushaemulonii, Candida pseudohaemulonii, and the recently identified Candida vulturna (Sipiczki and Tap, 2016) are often misidentified as Candida auris, especially in laboratories that do not have access to DNA sequencing and matrix-assisted laser desorption/ionization time of flight mass spectroscopy (MALDI-TOF MS) (Hou et al., 2016;Araúz et al., 2018). Together with C. lusitaniae, another infrequent cause of candidemia, these four species belong to the Metschnikowiaceae family, which includes species that are often resistant to antifungal drugs (Jackson et al., 2019). Recent comparative genomic analysis identified the substantial amount of conservation and synteny among genomes of C. auris, C. haemulonii, C. duobushaemulonii, and C. pseudohaemulonii as well as similar expansions of the oligopeptide transporters and lipase gene families (Muñoz et al., 2018). These shared genomic features suggest similarities in ecology and physiology of these species and raise a possibility that they may also start emerging as drug-resistant pathogens in human populations.
In contrast to C. auris, which was described in 2009 from an external ear canal from a Japanese patient (Satoh et al., 2009;Chowdhary et al., 2013), C. haemulonii was first isolated from a fish off the coast of the Bahamas in 1962 and later isolated from a dolphin and seawater off the coast of Portugal (Van Uden and Kolipinski, 1962). The first clinical isolate was described in 1984 from blood of a patient with renal failure, and since then, isolates of C. haemulonii have been infrequently but regularly reported in patients causing wound and other types of infections (Gargeya et al., 1991;Cendejas-Bueno et al., 2012;Hou et al., 2016). C. pseudohaemulonii and C. duobushaemulonii were identified in 2006 and 2012, respectively, as the distinct lineages within C. haemulonii species complex based on the phylogenetic analysis of rDNA intragenic spacer region (ITS) (Cendejas-Bueno et al., 2012). Finally, in 2016, C. vulturna was identified as a distinct species most closely related to C. duobushaemulonii based on the phylogenetic analysis of rDNA locus (Sipiczki and Tap, 2016). The first isolate of this species was isolated from flowers in the Philippines, and later it was isolated as a cause of human candidemia (Sipiczki and Tap, 2016).
Subsequent studies identified phenotypical and clinical features associated with different species. Specifically, most strains of C. haemulonii cannot grow at 37 • C, whereas most C. duobushaemulonii and C. pseudohaemulonii grow well at body temperature, and most strains of C. auris can grow up to 42 • C (Ben-Ami et al., 2017). Compared with C. auris and C. albicans, C. haemulonii is less virulent in a murine model of infection (Ben-Ami et al., 2017). In humans, yeasts from C. haemulonii complex are primarily known to cause wound infections or colonization although a few cases of invasive blood infections have also been described (Ramos et al., 2015;Hou et al., 2016;Ben-Ami et al., 2017). Conversely, C. auris frequently causes invasive infections although the first isolates of C. auris were isolated from ear infections, and the East Asian clade of C. auris (Clade II) are commonly isolated from the external ear canal (Jackson et al., 2019). C. auris and the four species from the C. haemulonii complex are known for their resistance to antifungal drugs, especially to amphotericin B and azoles; however, resistance to echinocandins is rare (Ramos et al., 2015;Hou et al., 2016).
Unlike C. auris, which is known to colonize human skin and cause rampant healthcare-associated outbreaks in numerous countries, other species from C. haemulonii species complex have been associated with relatively few suspected outbreaks and clusters. In 2007, an outbreak of fungemia caused by C. haemulonii resistant to amphotericin B, fluconazole, and itraconazole was described among four patients in a neonatal unit in Kuwait (Khan et al., 2007). Although this outbreak occurred before C. auris and C. duobushaemulonii species were described, ITS sequences from the isolates deposited into NCBI identify the isolates from this outbreak as C. haemulonii sensu stricto (Isla et al., 2017). In 2016, a cluster of three C. haemulonii wound infections was identified in an Israeli hospital overlapping in time with a C. auris outbreak (Ben-Ami et al., 2017). In 2017, an unusual number of C. duobushaemulonii and C. auris infections were identified in Panama City, Panama, and a subsequent epidemiological investigation confirmed outbreaks of both species (Araúz et al., 2018). The goal of this study was to investigate the genetic relationships and drug-resistance profiles among isolates of the C. haemulonii complex from different countries and healthcare facilities to determine genetic diversities in different populations and to examine possible evidence of transmission if it existed.

Isolates
Isolates were received in the Mycotic Diseases Branch Reference Laboratory at the CDC for routine fungal identification or as part of ongoing fungal disease surveillance. Upon arrival, isolates were identified by sequencing of the ITS2 region of the rDNA and MALDI-TOF MS (Bruker, Bremen, Germany) using a CDC-developed database MicrobeNet (https://www.cdc.gov/microbenet/index.html). Isolates were stored in 20% glycerol at −70 • C. When available, limited metadata, such as geographic region or site of the infection, were collected.

Whole-Genome Sequencing (WGS)
DNA was extracted using the ZR Fungal/Bacterial DNA MiniPrep kit (Zymo Research, Irvine, CA, USA) according to the manufacturer's instructions. Genomic libraries were constructed and barcoded using the NEBNext Ultra DNA Library Prep kit for Illumina (New England Biolabs, Ipswich, MA, USA) following the manufacturer's instructions. Libraries were sequenced on either the Illumina HiSeq 2500 platform (Illumina, San Diego, CA, USA) using the HiSeq Rapid SBS Kit v2 500 cycles or the MiSeq platform using the MiSeq Reagent Kit v2 500 cycles or Illumina MiSeq Reagent Kit v3 (600 cycles). HiSeq and MiSeq v2 500-cycle kits generated 251 bp paired reads, whereas MiSeq v3 600-cycle kits generated 301 bp paired reads.

Phylogenetic and Population Genetic Analyses
For phylogenetic analysis, maximum parsimony phylogenies were constructed using the subtree-pruning-regrafting (SPR) algorithm and bootstrapped using 500 reiterations in MEGA7.0 (Kumar et al., 2018) and visualized in Interactive Tree of Life (iTOL) v4 (Letunic and Bork, 2019). All positions containing gaps and missing data were eliminated. The average genome-wide nucleotide diversity (π) and Tajima's D were calculated from the GATK SNPs set using PopGenome v2.6.1 R package (https://www.rdocumentation.org/packages/ PopGenome/versions/2.7.2) using 5 kb sliding windows. Mating type locus (MTLa and MTLα) was determined using the normalized average read depth at the locus from aligned BAM files for each isolate mapped to the corresponding reference genomes. Genomic regions that exhibit copy number variation (CNV) were identified using normalized read depth.
In the United States, 91% of C. haemulonii and 87% of C. duobushaemulonii were isolated from wounds, bones, and other non-invasive sites, and the remaining isolates were from blood ( Table 2). A different distribution of cases was observed in Latin America: 77% of C. haemulonii and 67% of C. duobushaemulonii from Latin American countries (Colombia, Panama, and Venezuela combined) were from blood, and the remaining isolates were from wounds and non-invasive sites. All isolates of C. haemulonii from Israel were from wounds. All six C. pseudohaemulonii isolates were from blood ( Table 2). One isolate of C. vulturna was from blood, and the other four were wounds and other non-invasive sites ( Table 2).
Frontiers in Genetics | www.frontiersin.org Latin America (4) 1 (25%) 3 (75%) ∧ Five C. haemulonii and 3 C. duobushaemulonii isolates were excluded, because for 4 isolates, the infection site information was unavailable, one isolate was from fish and only 3 isolates were available from Israel.

Phylogenetic Relationships Within the Species Complex
Phylogenetic analysis using SNPs called against the C. auris reference strain B8441 (Muñoz et al., 2018) confirmed previous observations that C. duobushaemulonii is closely related to C. pseudohaemulonii. This analysis also identified a subgroup of isolates from Colombia, Panama, and the Unites States, which was identified by MALDI-TOF as closely resembling C. duobushaemulonii, which formed a distinct monophyletic branch on the phylogenetic tree (Figure 1). Comparison with a recently assembled genome identified these strains as C. vulturna (Navarro-Muñoz et al., 2019) ( Figure 4B). Conversely, isolates identified by MALDI-TOF as C. haemulonii var. vulnera, a previously recognized variety within the C. haemulonii complex (Cendejas-Bueno et al., 2012), were intermixed with other C. haemulonii strains and did not form a phylogenetically distinct cluster (Figures 1, 2).
Genetic Diversity Among C. haemulonii Genetic relationships among C. haemulonii isolates are shown in Figure 2 and at https://microreact.org/project/Candida_ haemulonii. The average pairwise difference between the isolates was 214 SNPs (range 0-399), and there was no distinct phylogeographic population structure. Most isolates were genetically distinct; however, three isolates, B13067, B13068, and B13081, were different from each other by fewer than 27 SNPs and formed a small, well-supported cluster in the phylogenetic tree based on bootstrap analysis (Figure 2). These three isolates were recovered from different patients treated at the same hospital in Panama City, Panama, which reported contemporaneous outbreaks of C. duobushaemulonii and C. auris (Araúz et al., 2018). B13067 was isolated from a toenail of a patient in November 2016, and B13068 was from the blood of another patient isolated in December 2016; the two isolates were genetically identical (0 SNPs). The other isolate from this cluster, B13081, was recovered from a central venous catheter in March 2017. In addition, two isolates from Venezuela, B12112 and B12127, differed by only 32 SNPs although they were isolated from patients treated in different healthcare facilities in the city of Valencia (Figure 2).
The rest of the isolates from these and other countries were different from each other by >69 SNPs. All isolates from the United States were genetically distinct. Interestingly, B10441, recovered in 1962 from a seawater fish, was only 176 SNP different from B13909, isolated in 2017 from a patient's blood. All C. haemulonii isolates were mating type alpha. The average genome-wide nucleotide diversity (π) was 2.59e-05, and the average Tajima's D estimate was −0.97, which is consistent with recent population expansion.
Genetic Diversity Among C. duobushaemulonii The average pairwise distance between C. duobushaemulonii isolates was 458 SNPs (range 0-1,243). The isolates can be separated into two genetically distinct clades separated by more than 600 SNPs. One of these clades included isolates from Panama, Colombia, Guatemala, Venezuela, and the United States, and the other included primarily isolates from the United States, one isolate from Venezuela, and one from Panama (Figure 3 and https://microreact.org/project/Candida_ duobushaemulonii). Nine of the 14 isolates from Panama were from a previously reported outbreak in a large hospital in Panama City (Araúz et al., 2018); five of those were genetically distinct and intermixed with isolates from other regions, and four others, each isolated from a different patient, formed a tight cluster on the phylogenetic tree (Figure 3). B13058 from a toenail and B13059 from blood were isolated in November 2016 and were identical (0 SNPs), and the other two, B13056 from the catheter isolated in November 2016 and B13088 from blood isolated in April 2017, were separated by <42 SNPs. Clusters of C. duobushaemulonii and C. haemulonii overlapped in time. All C. duobushaemulonii isolates were mating type alpha. The genome-wide nucleotide diversity (π) was 4.86e-05, which was almost twice that observed in C. haemulonii; however, the average Tajima's D estimate was −1.17, which was similar to that of C. haemulonii and consistent with recent population expansion.

Genetic Relationships Among Isolates of C. pseudohaemulonii and C. vulturna
Five of the six tested isolates of C. pseudohaemulonii clustered together (441-845 SNPs), and the sixth isolate, B12108 from blood of a Venezuelan patient, was >200,000 SNPs different from the other group ( Figure 4A and https://microreact.org/ project/Candida_pseudohaemulonii). Similarly, four isolates of C. vulturna from Colombia, Panama, and Venezuela were different from each other and the type isolate, CBS14366, isolated from a flower in the Philippines (Sipiczki and Tap, 2016) by 104-209 SNPs. However, the fifth isolate from a U.S. patient was more than 81,000 SNPs different from the other four, suggesting multiple clades of C. vulturna ( Figure 4B and https://microreact.org/project/Candida_vulturna) although more isolates are needed to test this hypothesis. All C. pseudohaemulonii isolates were mating type a. In C. vulturna, four isolates where mating type a, and the fifth isolate from the United States (B14309) was mating type alpha. FIGURE 1 | Maximum parsimony tree illustrating the phylogenetic relationships among species from C. haemulonii complex constructed using 56,023 SNPs called against B8441 C. auris reference genome (GenBank accession PEKT00000000.2). Numbers show bootstrap support for the branches. Scale bar shows pairwise SNPs.

Antifungal Susceptibility Testing and Mutations Linked to Resistance
We observed variable levels of susceptibility to amphotericin B among 38 tested C. haemulonii isolates: 29 (76%) had elevated MICs from 2 µg/mL to >32 µg/mL, and the rest had MICs below 2 µg/mL ( Table 3). Seven (18%) isolates had elevated MIC of fluconazole ranging from 32 to 256 µg/mL, and all had the Y132F mutation in ERG11. However, one isolate, B13067, with this mutation had MIC of fluconazole 8 µg/mL (https://microreact. org/project/Candida_haemulonii). Three C. haemulonii isolates, B11786, B11792, and B11803, had aneuploidy of chromosome 4 (scaffold 4) that harbors ERG11: B11786 and B11792 have complete duplication of chromosome 4, and B11803 has a 300-kb duplication or a region encompassing ERG11 (Supplementary Figure 1). In addition to this duplication, both B11786 and B11803 had Y132F substitution, and their MIC of fluconazole was 64 µg/mL. However, the MIC of fluconazole of B11792 that had wild-type ERG11 copy was 4 µg/mL, suggesting that the duplication alone did not increase resistance ( Table 3).
All five tested C. vulturna isolates had elevated MICs of amphotericin B with MICs ranging from 8 µg/mL to more than 16 µg/mL ( Table 3). All five isolates were susceptible to fluconazole with MIC of 8 to 16 µg/mL, and no known substitutions in the ERG11 gene were detected in these isolates ( Table 3). All tested isolates of the four species had low MICs of echinocandins.
In addition, we examined other genes that have been implicated in azole and amphotericin B resistance in Candida spp. (Arendrup and Patterson, 2017); the nonsynonymous substitutions identified in these genes are listed in Supplementary Table 2. The annotated VCF files for all isolates from this study that can be used to browse for substitutions in other genes of interest are available as https://figshare. com/projects/Genome_sequencing_of_Candida_haemulonii_ species_complex/80150.

DISCUSSION
We present results of the population genetics and phylogenetic analyses of isolates from the Candida haemulonii complex that are closely related to the globally emerging pathogen, C. auris. Phylogenetic analysis based on WGS confirmed previously described relationships among these species (Muñoz et al., 2018;Navarro-Muñoz et al., 2019) and demonstrated that the newly described C. vulturna forms a distinct, well-supported clade related to C. duobushaemulonii and C. pseudohaemulonii. Here, we describe five additional clinical isolates of this new species and demonstrate that it can cause both invasive as well as noninvasive infections as isolates of this species were isolated from blood, wounds, and other body fluids. This new species was found in Panama, Colombia, and in the United States.
Isolates of C. haemulonii and C. duobushaemulonii were recovered from blood as well as from wounds and other noninvasive sites. In the United States, the vast majority of C. haemulonii and C. duobushaemulonii were from non-invasive sites, such as lower extremities or wounds, and most isolates FIGURE 4 | Genetic relationships among C. pseudohaemulonii (A) and C. vulturna (B) isolates. Maximum parsimony trees were constructed using a total of 238,026 SNPs called against B12108 C. pseudohaemulonii reference genome (GenBank accession PKFO000000000) and 81,034 SNPs called against CBS14366 C. vulturna reference assembly Bio project PRJNA560499. Scale bar shows pairwise SNPs.
from Latin America were from blood. Many of the isolates from the United States were described as from foot or toe wounds, so it is possible that these species may be associated with diabetic foot infections although more epidemiological data is needed to show a definitive link. The reasons for this observed difference between the clinical presentations of the infections in the United States and Latin America are not entirely clear. One possible explanation may be that, compared to the United States, fewer Candida from non-invasive sites are collected and identified in Latin America; therefore, isolates from the non-invasive sites may be less likely to be reported. However, it is also possible that different infection control or clinical practices in Latin America led to more C. haemulonii and C. duobushaemulonii invasive infections. Notably, almost all azoleresistant isolates of C. haemulonii and C. duobushaemulonii were found in Latin America. Regardless of their geographic origin, all isolates of C. pseudohaemulonii were from blood. Whether this is truly a difference in the epidemiology among the species remains to be determined as only a few isolates of this species were available.
Population genetic analysis demonstrated that most isolates of C. haemulonii, C. duobushaemulonii, C. pseudohaemulonii, and C. vulturna were genetically distinct and do not yet show evidence of the extensive clonality that can be a sign of global emergence. However, we identified several nearly identical isolates of C. duobushaemulonii and C. haemulonii in different patients in one hospital in Panama, which reported outbreaks of C. duobushaemulonii and C. auris (Araúz et al., 2018). This finding indicates that both species are capable of transmission and causing outbreaks in healthcare settings. The reasons for the simultaneous outbreaks of C. auris, C. haemulonii, and C. duobushaemulonii in this hospital are unknown. Likely factors include a large patient population and poor infection control practices. It is also possible that some unique clinical practices in this facility, such as extensive use of azoles, might have contributed to multiple outbreaks. Notably, all C. haemulonii and C. duobushaemulonii from the outbreaks in this hospital were carrying ERG11 mutations, and most had elevated MICs to fluconazole.
The overall genomic diversity (π) in the C. haemulonii population was relatively low and comparable with that observed in different clades of C. auris; no phylogeographic population substructure was observed. Although isolates of C. haemulonii var. vulnera were identified by others based on sequencing the ITS region of rDNA (Cendejas-Bueno et al., 2012), we were unable to confirm this observation with genomic data. Indeed, three isolates identified by MALDI-TOF as C. haemulonii var. vulnera were indistinguishable   Compared with C. haemulonii, more diversity was observed among C. duobushaemulonii isolates: Two subclades separated by more than 600 SNPs were observed. The genomic diversity (π) in C. duobushaemulonii was almost twice that of C. haemulonii, which was probably reflective of the more complex population structure. However, the genome-wide Tajima's D estimates for both species were negative and consistent with clonally expanding populations. Comparable estimates of Tajima's D were obtained for three rapidly expanding clades of C. auris (Chow et al., 2020). Furthermore, similarly to C. haemulonii, all C. duobushaemulonii isolates in our study were also mating type alpha.
Antifungal susceptibility testing indicated that most isolates from C. haemulonii species complex had highly elevated MICs of amphotericin B. Specifically, 100% of C. vulturna, 93% of C. duobushaemulonii, 76% of C. haemulonii, and 33% of C. pseudohaemulonii showed elevated MICs of at least 2 µg/mL. Elevated MIC values of fluconazole were also common but not nearly as widespread as in C. auris. Approximately 13% of C. duobushaemulonii, 18% of C. haemulonii, and 33% of C. pseudohaemulonii had MICs of fluconazole of 32 µg/mL or higher, and all isolates of C. vulturna were susceptible to fluconazole. All isolates of all four species had low MIC values of echinocandins.
The majority of isolates with elevated MICs of fluconazole in C. haemulonii, C. duobushaemulonii, and C. pseudohaemulonii contained the Y132F substitution in ERG11, which is associated with azole resistance in C. albicans and C. auris Clades I and IV (Berkow and Lockhart, 2017;Muñoz et al., 2018). Furthermore, additional substitutions linked to azole resistance in C. albicans were identified in C. duobushaemulonii isolates from Panama that also had Y132F/H substitutions; however, it was unclear whether these additional substitutions further affected MIC levels of azoles. No previously described drugrelated mutations in the ERG11 gene (Berkow and Lockhart, 2017) were detected in C. vulturna isolates with elevated MICs, suggesting a different mechanism of resistance. All C. haemulonii and C. duobushaemulonii with elevated MICs of fluconazole were found in Colombia, Venezuela, and Panama. Only a single isolate of C. pseudohaemulonii with a moderately high MIC of fluconazole of 32 µg/mL was isolated in the U.S. state of Georgia, and no other fluconazole-resistant isolates were found in the United States despite testing a comparable number of isolates from the United States and Latin America. In addition, all isolates with the corresponding ERG11 mutations and chromosomal duplications that can be linked to azole resistance were found only in Latin America. Additional non-synonymous substitutions were detected in other genes implicated in antifungal resistance in Candida (Arendrup and Patterson, 2017); however, more research is needed to evaluate the role of these substitutions in resistance to antifungal drugs. The list of the non-synonymous substitutions and annotated VCF files is provided in our study for other investigators who might be interested in addressing these questions (https://figshare. com/projects/Genome_sequencing_of_Candida_haemulonii_ species_complex/80150 and Supplementary Table 2).
Our results indicate that, although we are not yet observing the widespread emergence of fungi for the C. haemulonii species complex as human pathogens, at least two of these species can be transmitted within a healthcare facility and may cause healthcareassociated outbreaks. Outbreak isolates of both species also had elevated MICs of azoles and had corresponding ERG11 mutations and chromosomal duplications linked to resistance. The prevalence of these mutations and duplications among C. haemulonii and C. duobushaemulonii isolates from Latin America suggests that these species may be exposed to the same evolutionary pressure from azole drugs that may have contributed to the emergence of C. auris. Continued surveillance to monitor azole resistance and epidemiology of these species is warranted.

DATA AVAILABILITY STATEMENT
All whole genome sequence raw reads for this study can be found in the NCBI under the following Bio Project PRJNA606185.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by CDC's IRB-Committee 2. Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements.