Comparative Genomic and Pan-Genomic Characterization of Staphylococcus epidermidis From Different Sources Unveils the Molecular Basis and Potential Biomarkers of Pathogenic Strains

Coagulase-negative Staphylococcus (CoNS) is the most common pathogen causing traumatic endophthalmitis. Among CoNS, Staphylococcus epidermidis is the most common species colonizing human skin, the ocular surface, and the nasal cavity. It is also a major cause of nosocomial infection, especially foreign body-related bloodstream infections (FBR-BSIs). Although some studies have reported the genome characteristics of S. epidermidis, genomes of ocular trauma-sourced S. epidermidis strains and a comprehensive understanding of their pathogenicity are still lacking. We sequenced, analyzed, and reported the whole genomes of 11 ocular trauma-sourced S. epidermidis isolates that caused traumatic endophthalmitis. By integrating publicly available genomes, we obtained a total of 187 S. epidermidis genomes from healthy and diseased eyes, skin, respiratory tract, and blood. Combining pan-genome, phylogenetic, and comparative genomic analyses, our study showed that S. epidermidis, regardless of niche source, exhibits two founder lineages with different pathogenicity. Moreover, we identified several potential biomarkers associated with the virulence of S. epidermidis, including essD, uhpt, sdrF, sdrG, fbe, and icaABCDR. EssD and uhpt have high homology with esaD and hpt in Staphylococcus aureus, suggesting that S. epidermidis and S. aureus may have exchanged genetic material during evolution. SdrF, sdrG, fbe, and icaABCDR are related to biofilm formation. In contrast to blood-sourced S. epidermidis, intraocular infection by ocular-sourced strains showed no direct relationship with biofilm formation. In conclusion, this study provides additional data resources for studies on S. epidermidis and improves our understanding of the evolution and pathogenicity of strains from different sources.


INTRODUCTION
Coagulase-negative staphylococci (CoNS) usually live on human skin (Piette and Verschraegen, 2009), in the nasal cavity (97%; Wos-Oxley et al., 2010), and on the ocular surface (60%; Graham et al., 2007; Willcox, 2013) and are the most common isolates recovered from nosocomial bloodstream infections (31%; Wisplinghoff et al., 2004). Staphylococcus epidermidis is the most common CoNS species (Otto, 2004), and it plays a central role in the skin microbiota; for example, it can protect against colonization by skin pathogens (Cogen et al., 2010a,b), maintain the ecological balance of human skin flora (Schommer and Gallo, 2013), and modulate the immune system (Egert et al., 2017; Linehan et al., 2018). Once it breaches the skin surface and enters the bloodstream, however, it is considered pathogenic. As the second most common cause of nosocomial infections (Otto, 2009), S. epidermidis not only accounts for a substantial share of foreign body-related infections (Becker et al., 2014) but also causes many eye infections (Keay et al., 2006; Schimel et al., 2013; Bispo et al., 2014; Park et al., 2015b). A previous study showed that CoNS, including S. epidermidis, were the most common pathogens causing traumatic endophthalmitis accompanied by an intraocular foreign body (Al-Omran et al., 2007), and the resulting visual acuity is usually poor (Bhagat et al., 2011; Cornut et al., 2013). Its different roles in health and disease make S. epidermidis one of the essential bacterial species of the human microbiota. Although S. epidermidis is detected at a high rate in clinical specimens, whether such detections represent actual infection or only colonization or contamination remains to be discussed.
Although much less is known regarding the potential of S. epidermidis to cause outbreaks, its biofilm formation and antibiotic resistance contribute to the occurrence and persistence of clinical infections (Schoenfelder et al., 2010), strongly suggesting that modern medicine has facilitated the selection process, mainly through the (over)use of antibiotics and the insertion of foreign body devices (Becker et al., 2014). Detachment of bacterial cells from the biofilm on medical devices can lead to bacteremia and increase morbidity and potential mortality (Conlan et al., 2012). It is well known that biofilms are resistant to antibacterial drugs (Conlan et al., 2012). Antibiotic resistance significantly complicates treatment and increases medical costs (Foster, 2017; Lee et al., 2018). The gene mecA is carried on the staphylococcal chromosome cassette mec (SCCmec) and encodes the penicillin-binding protein PBP2a (Katayama et al., 2000), which may confer resistance to methicillin (Hartman and Tomasz, 1984). Although methicillin is not used to treat eye infections, increasing resistance to methicillin is known to significantly promote the spread and persistence of multidrug-resistant strains in specific environments (Alekshun and Levy, 2007; Asbell et al., 2008; Lichtinger et al., 2012).
Recently, with the development of high-throughput sequencing technology, the genome of S. epidermidis has been extensively studied, including full pan-genome analysis of both commensal and nosocomial isolates (Conlan et al., 2012), identification of S. epidermidis lineages in healthy individuals from two geographical locations (Sharma et al., 2018), genomic determinants associated with adaptation to various environments (Su et al., 2020), and the evolutionary trajectory and functional distribution of S. epidermidis. Microorganisms can be shaped by different host-specific factors, such as disease and health status. However, it remains unclear whether the detected S. epidermidis strains were true infectious pathogens or contaminants, especially in the case of respiratory tract samples, which were generally regarded as contamination and were not reported. Thus, the pathogenic nature of S. epidermidis from hosts in various health conditions should be fully investigated. Moreover, although Kirstahler et al. (2018) reported six whole genomes of S. epidermidis from vitreous humor, a comprehensive understanding of the genomes of ocular strains, particularly ocular trauma-sourced strains, is still lacking, and the genetic differences among strains isolated from different host niches remain unclear.
In this study, we sequenced the whole genomes of 11 ocular trauma-sourced S. epidermidis isolates. By also incorporating publicly available genomes, we obtained a total of 187 genomic sequences of S. epidermidis isolates from the eyes, skin, respiratory tract, and blood of healthy and diseased hosts. Through whole-genome sequence (WGS) analysis and pan-genome analysis, we compared in detail S. epidermidis isolates from hosts of diverse health status and from different within-individual sources. We comprehensively analyzed and reported the WGS of S. epidermidis that causes traumatic endophthalmitis. With phylogenetic analysis and comparative genomics, we revealed the evolutionary relationships of isolates from different sources, exploring how different host niches may shape the genetic diversity of S. epidermidis. Our study demonstrated a marked association between evolutionary lineage and host health state. Regardless of niche source, S. epidermidis showed two founder lineages with different pathogenicity. We also identified several potential biomarkers related to S. epidermidis pathogenicity. Compared to blood-sourced S. epidermidis, traumatic endophthalmitis strains carried different virulence genes, and the intraocular infection they cause may be independent of biofilm formation. Overall, our study revealed the genetic diversity and pathogenicity of S. epidermidis and reported a comprehensive comparative analysis of genomes from different sources.

Strains
This study analyzed the whole genomes of 187 S. epidermidis isolates, including 11 ocular strain isolates from the Department of Laboratories, Eye Hospital of Wenzhou Medical University, Wenzhou, China, and 176 available genome sequences downloaded from the GenBank database (Sayers et al., 2020) of the National Center for Biotechnology Information (NCBI) and PATRIC v3.6.10, with detailed information shown in Supplementary Table 1. The 187 S. epidermidis strains were selected to represent known diverse niches and host health states. We collected 17 ocular S. epidermidis strains, 46 blood-sourced strains, 22 respiratory strains, 45 skin-sourced strains, and 46 clinical strains with unknown host niches, defined as the "Clinics" group; the remaining S. epidermidis strains belonged to the "Others" group. The "Respiratory" group consists of isolates from the nares, lungs, pharyngeal exudate, sputum, and bronchoalveolar lavage. Catheter and oral isolates and the reference genomes of strains ATCC14990 and RP62A were classified in the "Others" group. A total of 33 and 75 strains had a definite health or disease state, respectively.
Frontiers in Microbiology | www.frontiersin.org

Pan-Genome Analysis
Pan-genome analysis was carried out for all 187 genomes, and for the genomes of S. epidermidis isolates from different niches, with BPGA v1.3 (Chaudhari et al., 2016) using default parameters; USEARCH clustering used a 50% sequence identity cutoff. The annotation files generated by Prokka were provided to BPGA as input. The clusters of orthologous groups (COGs) of the core, accessory, and unique sequences from the pan-genome analysis were further verified by functional annotation against the EggNOG database.

Characterization of Staphylococcus epidermidis Strains
kSNP3.1 (v3.1.2; Gardner et al., 2015), a validated alignment-free method, was used with a k-mer length of 19 to build the phylogenetic tree, and a total of 8,760 core SNPs were identified. Then, a maximum likelihood core SNP tree was constructed with RAxML (v8.2.12; Stamatakis, 2014) using 100 bootstraps. The phylogenetic tree was visualized and annotated with the online tool iTOL (v6.0; Letunic and Bork, 2021). Multilocus sequence typing (MLST) of the 187 S. epidermidis strains was performed with the MLST 2.0 online server (Larsen et al., 2012). The online tool SCCmecFinder 1.2 was used to identify SCCmec elements in the sequenced S. epidermidis isolates.
ResFinder (v2.1; Zankari et al., 2012) and the Resistance Gene Identifier (RGI; v5.1.1) of the Comprehensive Antibiotic Resistance Database (CARD; 2020; Alcock et al., 2020) were employed to identify antimicrobial resistance (AMR) genes. The predicted resistance was verified by the disk diffusion method (MHA) and the broth dilution method. The BLASTp program was used to search all protein sequences of the 187 S. epidermidis strains against the Virulence Factor Database (VFDB; Feb, 2021; Liu et al., 2019). Among hits to database virulence genes with an e-value <1e-10, only query genes with an identity higher than 40% and a coverage higher than 70% were considered potential virulence genes (Nourdin-Galindo et al., 2017). Functional annotations of virulence factors were based on the categories and subcategories presented in VFDB. An alignment for each of the extracted candidate virulence genes was constructed using Clustal Omega (v1.2.4; Sievers et al., 2011) and visualized with ESPript (v3.0; Robert and Gouet, 2014).
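The virulence-gene thresholds above (e-value <1e-10, identity >40%, coverage >70%) can be illustrated with a minimal Python sketch. The `BlastHit` record, its field names, and the choice to compute coverage as alignment length over query length are assumptions made for illustration; the text does not state how coverage was computed or how the BLASTp output was parsed.

```python
from dataclasses import dataclass


@dataclass
class BlastHit:
    """Hypothetical record for one BLASTp hit against VFDB."""
    query_id: str
    subject_id: str
    percent_identity: float   # 0-100
    alignment_length: int     # aligned residues
    query_length: int         # full length of the query protein
    evalue: float

    @property
    def query_coverage(self) -> float:
        # Assumption: coverage is measured on the query sequence.
        return 100.0 * self.alignment_length / self.query_length


def is_potential_virulence_hit(hit: BlastHit,
                               max_evalue: float = 1e-10,
                               min_identity: float = 40.0,
                               min_coverage: float = 70.0) -> bool:
    """Apply the three strict thresholds quoted in the Methods."""
    return (hit.evalue < max_evalue
            and hit.percent_identity > min_identity
            and hit.query_coverage > min_coverage)


def filter_hits(hits):
    """Keep only hits passing all thresholds."""
    return [h for h in hits if is_potential_virulence_hit(h)]


# Made-up example hits (not data from the study).
example_hits = [
    BlastHit("q1", "VF_a", 85.0, 300, 320, 1e-50),  # passes all three
    BlastHit("q2", "VF_b", 35.0, 300, 320, 1e-30),  # identity too low
    BlastHit("q3", "VF_c", 90.0, 100, 320, 1e-40),  # coverage too low
    BlastHit("q4", "VF_d", 60.0, 300, 320, 1e-5),   # e-value too high
]
kept = filter_hits(example_hits)
```

All three conditions are applied jointly, so a hit with high identity but a short aligned region is discarded.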

Statistical Analyses
The significance of core gene and accessory gene abundance in COG categories was examined using Fisher's exact test. The disease-associated and ocular-associated predicted accessory genes with known function and the annotated virulence genes were analyzed using Fisher's exact test with FDR correction of the p-values. All statistical analyses were carried out in R (version 4.0.2). A p-value < 0.05 was regarded as statistically significant.
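As an illustration of the test described here, the following Python sketch computes a two-sided Fisher's exact test on a 2×2 presence/absence table and applies a Benjamini-Hochberg FDR adjustment. It is not the authors' R code. The example tables use counts reported in the Results (icaABCD in 38 of 75 disease-source versus 2 of 33 healthy-source isolates; sdrE in 16 of 17 ocular versus 24 of 46 blood isolates), and the assumption that exactly these tables were tested, and that Benjamini-Hochberg was the FDR procedure, is ours.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    probs = {x: comb(row1, x) * comb(row2, col1 - x) / denom
             for x in range(lo, hi + 1)}
    p_obs = probs[a]
    # Small relative tolerance guards against floating-point ties.
    total = sum(p for p in probs.values() if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Rows: group 1 / group 2; columns: gene present / gene absent.
p_ica = fisher_exact_two_sided(38, 75 - 38, 2, 33 - 2)    # disease vs healthy
p_sdre = fisher_exact_two_sided(16, 17 - 16, 24, 46 - 24)  # ocular vs blood
fdr = benjamini_hochberg([p_ica, p_sdre])
```

In a real analysis the adjustment would run over the p-values of all tested genes, not just two, so the adjusted values here are only illustrative.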

Pan-Genome and Functional Characterization of Staphylococcus epidermidis From Different Sources
Although several studies have reported pan-genome characterization of S. epidermidis (Conlan et al., 2012), the core and pan-genome features of ocular strains are lacking. To identify the pan-genomic characteristics of S. epidermidis from ocular sources and the discrepancies among different sources, we determined the overall genetic similarities and differences among the ocular S. epidermidis strains and among all 187 strains. Gene accumulation curves (Tettelin et al., 2005; Figures 2A,B) showed that the number of core genes fits an exponential decay curve that plateaus at 749 genes for all 187 strains and 1,931 genes for the ocular strains, while the pan-genome data fit a power law curve (y = a·x^b), indicating an open pan-genome in which each added genome sequence contributed several new genes, as reported (Conlan et al., 2012). To a certain degree, the capacity of an organism to acquire exogenous DNA partially determines the pan-genome state ("open" or "closed"; Diene et al., 2013), especially for species living within a bacterial community, such as skin inhabitants.
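The power-law form y = a·x^b mentioned above can be fitted in a few lines. The Python sketch below uses ordinary least squares on log-transformed values, which is a simplification chosen for illustration and not necessarily the fitting procedure used by BPGA. The data points are synthetic, and reading 0 < b < 1 as an open pan-genome is a common convention that we assume here; the text does not state it.

```python
import math


def fit_power_law(x, y):
    """Fit y = a * x**b by linear regression of log(y) on log(x)."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((u - mean_x) ** 2 for u in lx)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
    b = sxy / sxx                       # slope in log-log space
    a = math.exp(mean_y - b * mean_x)   # intercept back-transformed
    return a, b


def looks_open(b: float) -> bool:
    """Assumed convention: an exponent between 0 and 1 suggests an open pan-genome."""
    return 0.0 < b < 1.0


# Synthetic accumulation curve (hypothetical values, not study data):
# pan-genome size after adding 1..20 genomes, generated from a = 2000, b = 0.3.
genomes = list(range(1, 21))
pan_sizes = [2000.0 * g ** 0.3 for g in genomes]

a_hat, b_hat = fit_power_law(genomes, pan_sizes)
```

With real accumulation data the points scatter around the curve, and a nonlinear least-squares fit on the untransformed values may give slightly different parameters than this log-log regression.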
The function of the genes within the pan-genome of all strains from different sources was investigated by assigning all gene clusters to COGs (Galperin et al., 2021; Figure 2C). Overall, 685/749 core genes, 1,792/2,783 accessory genes, and 234/468 unique genes among all 187 S. epidermidis strains were annotated to COG categories. Specifically, by Fisher's exact test (FDR < 0.05), 5 of 22 COG categories were significantly enriched in core genes, almost all associated with metabolism and biogenesis: energy production and conversion; nucleotide transport and metabolism; coenzyme transport and metabolism; translation, ribosomal structure, and biogenesis; and inorganic ion transport and metabolism. Additionally, while replication, recombination, and repair were enriched in both accessory and unique genes, secondary metabolite biosynthesis, transport, and catabolism as well as defense mechanisms were enriched in unique genes. We then compared the enrichment of COG functions among strains from different sources, including the blood, eyes, respiratory tract, and skin, and found similar results: core genes were related to metabolism and biogenesis. However, genes associated with defense mechanisms were enriched in the accessory genes of all strains except eye-sourced ones. In addition, transcription was significantly enriched in the accessory genes of blood-sourced strains, whereas in the other sources it was enriched in unique genes. In summary, in contrast to the core genes enriched in biogenesis and metabolism, the enrichment of replication, recombination, and repair, of defense mechanisms, and of transcription among accessory and unique genes was driven by diversity in recombinases and integrases, ABC-type multidrug transporters, and transcriptional regulators, respectively.
These abundant genes often appear to transfer horizontally between strains, leading to the spread of virulence and resistance genes between strains, which affects the pathogenicity of bacteria (Jackson et al., 2011).
[Figure caption fragment: (2) assigned clusters of orthologous group (COG) classes of protein-coding genes (CDSs) on the forward strand; (3) reverse strand CDSs; (4) tRNA (blue) and rRNA (red) genes on the forward strand; (5) tRNA (blue) and rRNA (red) genes on the reverse strand; (6) GC content (swell fire red/sky blue indicates higher/lower G + C compared with the average G + C content).]

Phylogenetic Relationship and Associated Typing Among Staphylococcus epidermidis
To infer the phylogenetic relationships of the S. epidermidis strains from different sources, we used 8,760 core SNPs to build a single-nucleotide polymorphism-based phylogenetic tree. As expected, the 187 S. epidermidis isolates formed two distinct groups, termed Clusters I and II, as previously reported (Conlan et al., 2012; Zhou et al., 2020; Figure 2D), suggesting the presence of two founder lineages. Moreover, we found that ST2 and ST5 strains were present in Cluster II and correlated with definite disease hosts or clinical strains, while ST691 strains were present in Cluster I, were associated with healthy skin, and shared the same genetic distance. It is worth mentioning that although the ocular isolates varied in phylogeny, in both clades they all lay close to strains of other sources from diseased hosts, which supports the pathogenic nature of the ocular strains that we collected. However, the sequence types of the ocular strains were highly diverse and included rare ones: 4 of the 11 strains that we collected had unknown sequence types.
Staphylococcus epidermidis strains that carry SCCmec are termed methicillin-resistant S. epidermidis (MRSE; Conlan et al., 2012). SCCmec is a mobile genetic element defined by combinations of mec gene complexes, cassette recombinases, and accessory genes, and it carries mecA, the key determinant of broad-spectrum beta-lactam resistance in Staphylococcus species (Conlan et al., 2012; McManus et al., 2015). We found that an SCCmec element was present in 94/187 strains. Interestingly, almost all SCCmec-positive strains (96.8%, 91/94) fell in Cluster II, indicating that S. epidermidis carrying the SCCmec element may be more pathogenic than SCCmec-negative strains.

The Virulence Characteristics of Staphylococcus epidermidis
Staphylococcus epidermidis is an opportunistic pathogen that harbors many virulence genes. To analyze the difference in pathogenic potential among S. epidermidis isolates from different sources, 99 genetic loci related to virulence were identified based on the VFDB. These loci were grouped into 18 categories (Figure 3) and included 21 (21/99, 21.2%) virulence genes that were present in all S. epidermidis genomes. Surprisingly, we identified several virulence genes, including icaABCDR, hpt, and esaD, that were strongly associated with S. epidermidis isolates from disease sources (Fisher's exact test, FDR < 0.01). The polysaccharide intercellular adhesin (icaABCD) genes, which are biofilm-associated genes for poly-N-acetylglucosamine synthesis, were found in 50.7% (38/75) of isolates from disease sources. Notably, only 2 of 33 isolates from healthy tissues contained the icaABCD genes, an observation that differs from previous studies showing that these genes were present in 60% of commensal isolates. Pairwise comparison of the enrichment of virulence genes between ocular-sourced strains and strains from other niches showed that icaABCD was present in 30 of 46 blood-sourced strains but in only 3 of 17 ocular-sourced strains. We also found that the sdrE gene, encoding the Ser-Asp-rich protein, was enriched in ocular-sourced strains (16/17) compared to blood-sourced strains (24/46; Fisher's exact test, FDR < 0.05). Two toxin genes, sell and sec, which were previously reported only in two S. epidermidis strains isolated from blood (SE90 and SE95), were also identified in two ocular strains. Together with the phylogenetic analysis, which showed that these strains had a close evolutionary distance, this implies that they may share similar founder lineages.

Gene Differences Between Genomes of Isolates From Healthy and From Disease Sources
Staphylococcus epidermidis is a coagulase-negative, Gram-positive staphylococcus found in the skin and mucosal microflora (Oh et al., 2016). It is also highly abundant, with high positive rates, on the ocular surface and is the second leading cause of nosocomial infections (Ziebuhr et al., 2006). We further investigated whether the presence of specific genes was significantly correlated with the different sources of strains, including isolates from hosts in different health states. We focused on accessory genes, defined as gene clusters that are neither present in all genomes nor found in only a single genome. The genes were clearly divided into two clusters, which strictly differentiated the disease isolates from the healthy isolates (Fisher's exact test, FDR < 0.01; Figure 4A). The two groups of genes were then functionally classified by KEGG function (Figure 4B). For both groups, metabolism genes accounted for the largest proportion, followed by signaling and cellular processes. Comparing the two groups at the level of KEGG subcategories, a larger proportion of disease-associated genes was involved in amino acid metabolism, antimicrobial resistance, and transcription factors, while carbohydrate metabolism accounted for a high proportion of health-associated genes.
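The accessory-gene definition used above (clusters neither present in all genomes nor restricted to a single genome) can be expressed as a short Python sketch over a presence/absence table. The cluster names and genome labels below are hypothetical, and the dictionary layout is an assumption made for illustration; pan-genome tools typically export an equivalent matrix.

```python
def classify_clusters(presence: dict, n_genomes: int) -> dict:
    """Label each gene cluster as 'core', 'accessory', or 'unique'.

    presence maps a cluster name to the set of genomes that contain it.
    """
    labels = {}
    for cluster, genomes in presence.items():
        count = len(genomes)
        if count == 0:
            raise ValueError(f"cluster {cluster!r} is present in no genome")
        if count == n_genomes:
            labels[cluster] = "core"        # present in every genome
        elif count == 1:
            labels[cluster] = "unique"      # present in exactly one genome
        else:
            labels[cluster] = "accessory"   # everything in between
    return labels


# Hypothetical toy data with four genomes (not data from the study).
toy_presence = {
    "cluster_A": {"g1", "g2", "g3", "g4"},
    "cluster_B": {"g1", "g3"},
    "cluster_C": {"g2"},
}
toy_labels = classify_clusters(toy_presence, n_genomes=4)
```

Only the clusters labeled accessory would then enter the group-wise association tests, since core and unique clusters cannot discriminate between groups of strains.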
Among the genes differentially distributed between strains from diseased and healthy sources, we found some key disease-related genes, including the virulence genes essD, uhpt, sdrF, sdrG, fbe, and icaABCDR and the transcriptional regulators lrpC and cysL (Figure 5A). SdrF, sdrG, and fbe, which encode microbial surface components recognizing adhesive matrix molecules (MSCRAMMs), showed homology with the virulence gene sdrE. Still, not all of the gene sequences were identified as virulence factors. By multiple alignment with Clustal Omega, we found that, compared to both the virulence-associated sdrG of S. epidermidis and sdrG of Staphylococcus aureus, the non-virulence-associated sdrG of S. epidermidis had a mutation in the region encoding a β-strand structure (Figure 5C), which may have reduced virulence.
Interestingly, comparing the genes of isolates from ocular sites with those of isolates from other sites, we found that the biofilm-related genes icaABCDR and pgaC were markedly enriched in isolates from blood but not in isolates from ocular tissue (Figure 5B). Moreover, although sdrG was seemingly enriched in isolates from ocular sites, the mutation rate within the β-strand reached 42.86% (6/14), which may reduce the adherence of the microbes to the extracellular matrix of the host, further suggesting that biofilm formation may not be a direct factor in intraocular infection by S. epidermidis.

Antimicrobial Resistance Across Staphylococcus epidermidis
Antimicrobial resistance is very common among S. epidermidis isolates; it contributes to the persistence of clinical infection and often limits treatment options (Kleinschmidt et al., 2015). To investigate antimicrobial resistance across S. epidermidis, we analyzed all known AMR genes within our 187 genomes. Using the ResFinder and CARD databases, we found 41 different genes involved in resistance to 19 antibiotics (Figure 6). Nearly all isolates carried at least three antibiotic resistance genes. Two AMR genes were conserved in all strains: norA, which is associated with fluoroquinolone antibiotics and belongs to the major facilitator superfamily (MFS) antibiotic efflux pump gene family, and dfrC, a diaminopyrimidine antibiotic-associated gene. This result was consistent with the data in CARD (norA: https://card.mcmaster.ca/ontology/36530; dfrC: https://card.mcmaster.ca/ontology/39299). Based on Fisher's exact test of strains from different host health states and niches, we found that the methicillin resistance genes mecA and mecR1 (FDR < 0.001), the fluoroquinolone antibiotic gene qacA (FDR < 0.005), and the aminoglycoside antibiotic gene aac(6′)-Ie-aph(2″)-Ia (FDR < 0.001) were enriched in strains isolated from diseased hosts. The presence of mecA was consistent with the SCCmec-positive strains because the mobile genetic element SCCmec carries mecA. Antibiotic susceptibility testing with oxacillin and cefoxitin was used to assess resistance to methicillin. The oxacillin resistance rate of the intraocular strains was 5/11 (Table 1), consistent with the cefoxitin results and with mecA carriage.
Moreover, strains from healthy skin showed significant enrichment of the msrA and mgrA genes involved in multidrug resistance (FDR < 0.001), while ocular, blood, and respiratory tract isolates had no significantly enriched antibiotic resistance genes, suggesting that, at the genetic level, resistance to specific antibiotics was not tied to these isolation niches.

DISCUSSION
Coagulase-negative Staphylococcus is the most common pathogen in traumatic endophthalmitis (Durand, 2013). Staphylococcus epidermidis, the most common coagulase-negative Gram-positive Staphylococcus, colonizes the normal mucosa, skin flora, and intraocular tissue of humans and other mammals and is one of the leading causes of clinical infections (Oh et al., 2016). What still needs to be discussed is whether S. epidermidis isolated from tissues is a pathogen or a contaminant. Although the genome characteristics of S. epidermidis have been studied in recent years, a comprehensive understanding of the genomes of ocular-sourced strains is lacking. To the best of our knowledge, this is the first and largest collection of bacterial genome sequences isolated from patients with intraocular S. epidermidis infection. By collecting, sequencing, and analyzing the genomes of 11 intraocular isolates and incorporating publicly available genomes of S. epidermidis, we gained insight into the phylogenetic and molecular characteristics of pathogens from the intraocular niche and other niches. The genetic differences between pathogenic and commensal S. epidermidis were also investigated comprehensively.
In this study, we compared the phylogenetic diversity and genome characteristics of S. epidermidis from different niches and different host health states. The host niches, including the eye, blood, skin, and respiratory tract, were from individuals with different health states. The 11 ocular-sourced S. epidermidis genomes sequenced here yielded high-quality benchmark data, including genome size, GC content, and the number of predicted genes, similar to the data collected from public databases, with a relatively compact genome averaging approximately 2.58 Mb. Furthermore, the completeness of each sequenced genome exceeded 99%, and the contamination level of the sequencing libraries was very low. The high coverage of the genome assemblies allowed us to obtain complete and accurate genome sequences. Consistent with the results of previous studies (Conlan et al., 2012; Zhou et al., 2020), the pan-genome analyses, whether of all 187 genomes from various niches or of the ocular-sourced genomes alone, showed that the size of the S. epidermidis genome was relatively constant while its genes were drawn from a larger gene pool, indicating an "open" pan-genome in which each added genome sequence contributed some new genes. To some extent, the pan-genomic state ("open" or "closed") of an organism depends in part on its ability to obtain exogenous DNA (Diene et al., 2013). For example, the large number of genes associated with mobile elements results in frequent horizontal gene transfer (HGT) between staphylococcal strains, and the presence of mobile genetic elements such as SCCmec, ACME, and plasmids contributes to the "open" pan-genome (Georgiades and Raoult, 2010). The phylogenetic tree constructed from genome-wide core SNPs reveals important details that could not be observed with traditional single-gene markers (16S rDNA) or MLST (Su et al., 2020).
From the phylogenetic tree analysis, we found that the 187 strains of S. epidermidis formed two distinct clusters with different pathogenic capacities. The clinically pathogenic strains were generally ST5 and ST2. In contrast, the ST691 strains were derived from healthy skin. The previously reported extremely short evolutionary distance among ST2 S. epidermidis strains (Su et al., 2020) was also observed among the ST5 and ST691 strains. It is well established that S. epidermidis is a common human skin commensal. It is considered an opportunistic pathogen and causes infection when it breaks through the surface of the skin and enters the blood (Otto, 2009). Staphylococcus epidermidis is also a vital commensal on the ocular surface (Zhang et al., 2013), but it can enter the eye and cause intraocular infection after injury, usually caused by trauma. Through whole-genome analysis in this study, we have provided a powerful framework for redefining species clustering in the genus, locating genetic traits, and ranking the importance of disease-causing genes based on their presence or absence in the genomes. By analyzing the genomes of S. epidermidis from different sources using comparative genomics methods, we identified the putative pathogenic marker genes lrpC, cysL, essD, uhpt, sdrF, sdrG, fbe, and icaABCDR. lrpC and cysL both encode helix-turn-helix (HTH)-type transcriptional regulators. Lrp-like regulators consist of an N-terminal HTH-type DNA-binding domain connected to a C-terminal RAM (regulation of amino acid metabolism) domain, which is responsible for cofactor binding and oligomerization (Ettema et al., 2002). Thus, lrpC is a transcriptional regulator with a possible role in regulating amino acid metabolism and the growth phase transition. CysL is a LysR family transcriptional regulator.
LysR-type transcriptional regulators (LTTRs) are DNA-binding proteins with a winged helix-turn-helix (wHTH) domain of approximately 60 residues. The LTTRs are one of the most common families of regulatory factors in prokaryotes. The C-terminus of a LysR protein often contains a regulatory domain with two subdomains, which participate in (1) coinducer recognition/response and (2) DNA binding and response (Henikoff et al., 1988). LTTRs can activate the transcription of operons and regulons involved in various functions, such as amino acid biosynthesis, CO2 fixation, antibiotic resistance, virulence factor regulation, nodulation by nitrogen-fixing bacteria, the oxidative stress response, or aromatic compound catabolism. However, the specific functions of the transcriptional regulators CysL and LrpC in S. epidermidis remain to be determined.
In this work, the pathogenic marker genes essD, uhpt, sdrF, sdrG, fbe, and icaABCDR were identified as potential virulence genes. Interestingly, we found that the potential virulence genes essD and uhpt had high homology with esaD and hpt in S. aureus. EsaD is a nuclease toxin secreted by the type VII secretion system, which may play a key role in bacterial competition (Cao et al., 2016). Hexose phosphate is an important carbon source in the cytoplasm of host cells (Park et al., 2015a). Bacterial pathogens invade, survive, and replicate in different host epithelial cells, utilizing hexose phosphate in the host cytoplasm to obtain energy and synthesize cell components through the hexose phosphate transport (HPT) system (Park et al., 2015a). The HPT system of S. aureus, which includes the hptRS (a new type of two-component regulatory system), hptA (a phosphate sensor), and uhpT (a hexose phosphate transporter) genes, allows it to survive in the host cell and may be an important target for the development of new antistaphylococcal therapies (Park et al., 2015a). The potential virulence genes essD and uhpt of S. epidermidis identified in this study could have arisen through HGT between Staphylococcus strains, for instance, through genome exchange between S. epidermidis and S. aureus during evolution. In fact, it was reported that multiple accessory genes of S. epidermidis were significantly associated with features of the contextual microbiome and could be generalized to new hosts. Recently, Du et al. (2021) reported that some S. epidermidis may have gained the capacity to exchange DNA, such as an accessory tarIJLM gene cluster, via S. aureus phage.
SdrF, sdrG, fbe, and the polysaccharide intercellular adhesin genes (icaABCD) are related to biofilm formation (Foster et al., 2014). Biofilm formation is the main virulence mechanism of S. epidermidis and contributes to the persistence of clinical infections (Schoenfelder et al., 2010). Here, all seven genes encode adhesive molecules, which are well-known factors involved in biofilm formation. SdrF, sdrG, and fbe encode a subset of MSCRAMMs (Foster et al., 2014). These proteins are covalently anchored to the cell wall and are characterized by a segment composed of repeated serine-aspartate (SD) dipeptides (Vazquez et al., 2011). MSCRAMMs are bacterial surface proteins that mediate the adhesion of microorganisms to the host's extracellular matrix components (Vazquez et al., 2011). Our sequence analysis showed that not all of the sdr gene sequences were identified as virulence factors. The sequence of sdrF in virulence-associated strains was very different from that in non-virulence-associated strains, including a difference in sequence length, whereas the two sdrG variants differed only by the β-strand mutation. We speculate that these mutations may reduce the virulence of sdrG, and the mutations were clearly overrepresented in intraocular isolates compared to blood isolates. Our enrichment analysis indicated that it was possible to differentiate the intraocular pathogenic strains from those from blood using biomarkers related to biofilm formation. For example, the polysaccharide intercellular adhesin genes icaABCD were enriched in blood isolates but not in ocular isolates. This may be because S. epidermidis is more likely to form biofilms on medical devices such as catheters and artificial heart valves (Wisplinghoff et al., 2004). Bacterial cells on these devices can break away from the biofilm and enter the bloodstream, leading to bacteremia and increasing morbidity and potential mortality (Wisplinghoff et al., 2004).
On the other hand, this may also suggest that the intraocular infection of S. epidermidis is not directly related to its ability to form biofilm. A larger data set and further experiments are required to test this hypothesis.
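As an illustration of the kind of enrichment comparison discussed above (for example, presence of icaABCD in blood isolates versus ocular isolates), the following minimal sketch computes a one-sided Fisher's exact test on a 2x2 presence/absence table. The choice of test, the function name, and the toy counts are assumptions made for illustration only; they are not the method or the data of this study.

```python
from math import comb


def fisher_exact_greater(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher's exact test for the 2x2 table

                    gene present   gene absent
        group 1          a              b
        group 2          c              d

    Returns P(X >= a) under the hypergeometric null, i.e. the probability
    of seeing at least `a` gene-positive isolates in group 1 if gene
    presence were unrelated to group membership.
    """
    group1 = a + b          # isolates in group 1 (e.g. blood; hypothetical)
    positives = a + c       # isolates carrying the gene, across both groups
    total = a + b + c + d   # all isolates
    upper = min(group1, positives)
    # Sum hypergeometric probabilities for k = a .. upper.
    numerator = sum(
        comb(positives, k) * comb(total - positives, group1 - k)
        for k in range(a, upper + 1)
    )
    return numerator / comb(total, group1)


# Toy table, NOT study data: gene found in 3/3 isolates of group 1
# and 0/3 isolates of group 2.
p_value = fisher_exact_greater(3, 0, 0, 3)
print(p_value)  # 0.05
```

With real data, one such test would be run per gene, followed by a multiple-testing correction, and a small p-value would indicate only a statistical association with the isolation source, not a causal role in infection.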
Staphylococcus epidermidis has been found to be a treasure trove of antibiotic resistance (Conlan et al., 2012). Through rare HGT events, its determinants of toxicity were shared with other, more pathogenic species, such as S. aureus, as reported in previous studies (Qin et al., 2016). In our study, we found that 94/187 (50.2%) of the collected S. epidermidis strains were MRSE carrying the mecA gene and SCCmec elements. In particular, the SCCmec cassette conferring β-lactam resistance is often transferred between staphylococcal strains, enabling them to rapidly evolve and adapt to antibiotic selection pressures and gain additional competitive advantages. This strongly supports the notion that pathogens can alter, through HGT, the infection risk posed by the background microbiota, thereby increasing the risk of infection in other parts of the body; for example, methicillin-resistant S. aureus in the nasal cavity may affect S. epidermidis-associated endophthalmitis (Su et al., 2020).

CONCLUSION
Our study provided information on the molecular characteristics of pathogenic S. epidermidis isolated from different host niches around the world, including ocular strains that had previously been overlooked. Pan-genome and phylogenetic analyses demonstrated that S. epidermidis had an open pan-genome and that the two founder lineages had different pathogenicity. Interestingly, MRSE strains were concentrated in the pathogenic clade. Although the endophthalmitis-associated S. epidermidis strains isolated in this study were relatively dispersed across the phylogeny, they were closer to the clinical pathogenic strains, which is consistent with their pathogenic nature. Based on comparative genomics, we identified eight potential biomarkers related to strain pathogenicity and provided evidence that HGT may occur between Staphylococcus strains. Moreover, we reported the genome sequences of S. epidermidis strains that caused traumatic endophthalmitis and found that intraocular infection by these strains may be independent of biofilm formation. Overall, this study revealed genetic diversity and pathogenic differences among S. epidermidis strains from different sources.

DATA AVAILABILITY STATEMENT
The sample and sequence data obtained in this study have been submitted to the NCBI BioSample and Sequence Read Archive (SRA) under BioProject accession number PRJNA753005.

AUTHOR CONTRIBUTIONS
SL, BS, JS, YL, and MZ contributed to the conception and design of the study. BS, XM, TW, and JX collected the strains and extracted DNA. SL and XG collected databases and performed the bioinformatics analysis. YX and YG performed the drug sensitivity test. SL, XS, and MZ wrote the manuscript. All authors contributed to the article and approved the submitted version.