Identification of potential pathogenic targets and survival strategies of Vibrio vulnificus through population genomics

Vibrio vulnificus, a foodborne pathogen, has a high mortality rate. Despite its relevance to public health, the identification of virulence genes associated with the pathogenicity of currently known clinical isolates of V. vulnificus is incomplete and its synergistic pathogenesis remains unclear. Here, we integrate whole genome sequencing (WGS), genome-wide association studies (GWAS), and genome-wide epistasis studies (GWES), along with phenotype characterization to investigate the pathogenesis and survival strategies of V. vulnificus. GWAS and GWES identified a total of six genes (purH, gmr, yiaV, dsbD, ramA, and wbpA) associated with the pathogenicity of clinical isolates related to nucleotide/amino acid transport and metabolism, cell membrane biogenesis, signal transduction mechanisms, and protein turnover. Of these, five were newly discovered potential specific virulence genes of V. vulnificus in this study. Furthermore, GWES combined with phenotype experiments indicated that V. vulnificus isolates were clustered into two ecological groups (EGs) that shared distinct biotic and abiotic factors, and ecological strategies. Our study reveals pathogenic mechanisms and their evolution in V. vulnificus to provide a solid foundation for designing new vaccines and therapeutic targets.


Introduction
Vibrio vulnificus is an opportunistic Gram-negative pathogenic bacterium broadly distributed in estuarine and coastal waters that typically infects people through consumption of tainted raw seafood or direct contact with seawater through open wounds.Primary septicemia is the most lethal consequence of V. vulnificus foodborne infection, with a mortality rate of more than 50% in immunocompromised patients, while wound infection has a mortality rate of 17% (Jones and Oliver, 2009).The distribution of V. vulnificus is related to temperature.With global climate change and the increase in seawater temperature, the number of infection cases and the geographical distribution of the pathogen are expanding (Motes et al., 1998).
Given its potential as an emerging infectious disease, an understanding of the variation in and distribution of V. vulnificus strains is of high priority.Molecular typing methods used to date include the polymerase chain reaction (PCR) analysis of variations in the virulence-correlated gene (vcg) (Warner and Oliver, 2008), 16S rRNA gene sequence analysis (Nilsson et al., 2003), multi-locus enzyme electrophoresis (MLEE), random amplification of polymorphic DNA (RAPD) (Gutacker et al., 2003), multi-locus sequence typing (MLST) (Bisharat et al., 2005;Sanjuań et al., 2011), and the genome-wide core-single nucleotide polymorphism (SNP) phylogenetic tree (Roig et al., 2018).Traditional typing methods have classified V. vulnificus into three biotypes.However, the genome-wide core-SNP phylogenetic tree provides better resolution for reconstructing relationships among samples than traditional methods, has classified V. vulnificus isolates fall into 5 lineages (Roig et al., 2018).However, currently published data on the phylogenetic structure of V. vulnificus have a small sample size and sampling bias (e.g., no Chinese mainland isolates are included).Thus, expanding the V. vulnificus genome dataset is necessary for a comprehensive reconstruction of the phylogenetic structure of the species.
While some virulence factors have been shown to be crucial for V. vulnificus pathogenicity (Jones and Oliver, 2009;Pettis and Mukerji, 2020;Yuan et al., 2020), the specific virulence genes of clinical strains of V. vulnificus remain unknown.Genome-wide association studies (GWAS) aim to capture associations of genotypes with phenotypes by testing hundreds of thousands of genetic variants across many genomes.GWAS have been successfully used in bacterial research to uncover antibiotic resistance, virulence, host specificity, and prognosis and can potentially be applied to any heritable bacterial traits (Falush and Bowden, 2006).However, bacterial GWAS are also challenging.For example, factors such as the clonal population structure caused by the mode of division and reproduction of bacteria, the vast differences in recombination rates between species, and the high frequency of gene deletions can make it difficult to identify specific variants responsible for phenotypes.In addition, phenotypes are complex traits that are not determined by monogenetic features but rather by the functional interactions of larger groups of gene products.Therefore, it is essential to explore both full sets of causal genetic variants and the complex interactions between genes (epistasis) to increase our understanding of bacterial pathophysiology.
The abundant genome-wide linkage disequilibrium patterns in bacterial genomes make the identification of causal variants problematic.A thorough study of these patterns of genetic variation in populations thus makes it possible to locate complex disease gene complexes, explore the phenotypic diversity of populations, and acquire new insights into the evolutionary history of the species.To date, few studies have analyzed hypothesis-free co-selection of gene variants, known as genomewide epistasis studies (GWES), to detect co-selection signals in a group (Skwark et al., 2017;Schubert et al., 2019;Cui et al., 2020;Zeng et al., 2020;Chewapreecha et al., 2022), and no data exist for V. vulnificus.
In this study, we first analyze the whole genome sequences of 518 V. vulnificus isolates and their isolation sources by GWAS.Then, we systematically examine co-adaptation patterns among V. vulnificus genetic variants, including core and accessory genome variants using GWES.We identify six genes linked to the pathogenicity of V. vulnificus clinical isolates, of which five were newly discovered in this study.We also uncover complicated gene interactions indicating that core and accessory genomes have co-evolved to produce coadapted gene complexes that encode distinct ecological strategies.Our findings indicate that these isolates and the genetic variants encoding them can be characterized by hierarchical clustering into groups that reflect patterns in the evolution of V. vulnificus.Finally, we explore the conditions that likely shaped V. vulnificus evolution by characterizing the phenotypes of isolates under different environmental conditions.Our results suggest that V. vulnificus has gradually altered its fitness landscape through co-adaptation.

Bacterial isolates
In total, 518 isolates (from 14 countries, 1964 to 2018) were used in this study, including 325 newly sequenced isolates (29 clinical and 296 environmental) and 193 publicly available isolates (58 clinical, 124 environmental, and 11 unspecified).Detailed sampling information for the V. vulnificus isolates used in this study is listed in Supplementary Table S1.The genomes of the publicly available isolates were downloaded from the NCBI database (https:// www.ncbi.nlm.nih.gov/genome/browse#!/prokaryotes/189/).

Culture conditions
All isolates were taken from the freezer (-80°C), streaked on Columbia blood agar (CBA) plates, and grown at 37°C for 24 hours.A single colony of a strain was transferred to 5 mL Luria-Bertani (LB) medium containing 2% NaCl and cultured at 37°C to the exponential phase with shaking.

DNA preparation
DNA was extracted from samples using the QIAGEN UltraClean® Microbial DNA Isolation kit, Catalog no.12224-50, as per the manufacturer's instructions.

Whole-genome sequencing and assembly
Sequencing was performed on the Illumina MiSeq platform with 150 bp paired-end read length.Raw reads with low quality were trimmed with the FASTQ Quality Filter (FASTX-Toolkit) (Pearson et al., 1997).De novo assembly of the filtered reads was performed using Shovill version 1.0.4(Bankevich et al., 2012) with standard parameters.Sequenced isolates had an average genome size of 5.03 Mb and GC content of 46.71%.

Variation detection, phylogenomic analysis, and annotation
SNPs identified from aligning the assemblies to the reference genome YJ016 with the NUCmer module in the software MUMmer 3.0 (Kurtz et al., 2004) were used to describe genetic relationships between isolates.SNPs located in repeated regions with low sequence quality (quality score <20 or covered by <10 reads) were excluded to identify SNPs in the core genome (regions presented in all isolates).After filtering, core-genome SNPs were then used in the maximum likelihood tree (MLTree) construction by RaxML (Stamatakis, 2006), and the MLTree was visualized with iTOL (Letunic and Bork, 2016).
All genomic sequences were annotated using Prokka (Seemann, 2014).We used GFF3 files generated by Prokka passed to Roary (Page et al., 2015) to create a pangenome and output gene presence/ absence for each isolate.To obtain additional annotation, we used the pan-gene protein sequences of Roary to BLAST (BLASTP) against the COG and KEGG databases.

Genome-wide association studies
The 518 V. vulnificus isolates, including 87 clinical and 420 environmental isolates, were analyzed using the software Pyseer (Lees et al., 2018) and DBGWAS (Jaillard et al., 2018), using the isolate source (clinical or environmental) of V. vulnificus as a phenotype.
In Pyseer, the effects of SNPs, insertion and deletion of accessory genes, or k-mers on phenotype was evaluated, and the corresponding P-value calculated.SNP-based GWAS captures variants in the bacterial core genome.Insertion and deletion of accessory genes-based GWAS captures variants in the bacterial accessory genes insertions and deletions.Finally, k-mers are DNA sub-sequences of length k (typically 3-100 base pairs), and k-mersbased GWAS can reflect diverse genetic events, including SNPs, insertion and deletion of accessory genes, and cover the noncoding regions, including those related to transcriptional and translational regulation, overcoming limitations of analysis at the level of SNPs and non-core gene insertions/deletions (Jaillard et al., 2018).
DBGWAS is a k-mer-based GWAS approach that generates interpretable genetic variants linked to diverse phenotypes.Utilizing compacted De Bruijn graphs (cDBG), this method groups cDBG nodes pinpointed by the association model into subgraphs defined by their neighborhood in the initial cDBG.DBGWAS does not require prior annotation or reference genomes.Importantly, it is also computationally efficient.

Genome-wide epistasis and co-selection studies (GWES)
Co-selection analysis was performed separately on the sequence alignment of 518 V. vulnificus isolates using the GWES tool SpydrPick (Pensar et al., 2019).SpydrPick hinges on mutual information (MI), which is a general measure of the dependence between two variants.Pairwise analysis of variants by SpydrPick can be performed using core-genome SNPs and pan-genomewide analyses.

Bacterial isolates and culture conditions
For the phenotype experiments, five isolates were randomly selected from two EGs (H1 and S1 from EG1; S8, S9, and S13 from EG2).Bacteria were cultured as described above ("Culture Conditions").

Motility assay
Isolates were inoculated on a swimming plate (LB media containing 0.3% agar) and a swarming plate (LB agar with 3% NaCl) after being cultivated at 37°C overnight to an A 600 of 0.8.Swimming ability was recorded by measuring the diameter of the colony at 12, 24, 36, 48, and 72 h after inoculation onto a swimming plate at 37°C.Swarming ability was recorded after 72 h on a swarming plate at 37°C.

Biofilm formation
The overnight bacterium liquid was added to a 24-well cell culture plate at 37°C for 24 h without shaking; 2% NaCl-LB medium served as a negative control.Each well was rinsed three times with 1× phosphate-buffered saline (PBS) after the supernatant was removed.Adherent bacteria were fixed with methanol for 15 min and then methane was aspirated from the culture wells, leaving them to air dry naturally.Plate cells were then stained with 0.1% crystal violet for 5 min, followed by washing with 1× PBS three times.The bound dye was dissolved with 33% glacial acetic acid.Finally, the optical density of the solution was measured using a Multiskan Spectrum at 590 nm (Narisawa et al., 2005;Yao et al., 2012;Naparstek et al., 2014;Tango et al., 2018;Fan et al., 2020).The experiments were conducted in triplicate, with three repetitions.Biofilm formation was classified as highly positive (A 590 >0.06), low-grade positive (0.03<A 590 ≤0.06), or negative (A 590 ≤0.03).

Anaerobic culture
An anaerobic chamber (Gene Science) filled with a gas mixture comprised of 90% N 2 , 5% CO 2 , and 5% H 2 was used to create anaerobic conditions.The isolates were inoculated on a Columbia blood agar plate and cultivated at 37°C for 5 days under anaerobic conditions.

Biochemical and antimicrobial susceptibility tests
The VITEK-2 automated microbial identification and drug sensitivity analyzer (BioMeŕieux) was employed for measuring biochemical parameters and antimicrobial susceptibility.

Genetic diversity of Vibrio vulnificus isolates
We performed whole genome variation detection of 518 V. vulnificus isolates from 14 countries between 1964 and 2018.Isolates included 193 isolates from the NCBI database -58 clinical, 124 environmental, and 11 unspecified; and 325 isolates from newly collected samples -29 clinical and 296 environmental (Supplementary Table S1).From the analysis, 373,704 core-genome SNPs were identified, including 37,386 intergenic SNPs, 211,731 synonymous SNPs, 121,162 nonsynonymous SNPs, and 3,425 nonsense SNPs.The median pairwise SNP distance between all 518 isolates was 24,403, implying a remarkably high genetic diversity among V. vulnificus isolates.
A maximum likelihood (ML) tree of the 518 isolates based on core-genome SNPs with V. vulnificus YJ016 as the reference sequence divided the species into six well-defined lineages (Figure 1).The isolates were mostly in lineages L1 and L2 (93.1%).L1 primarily contained isolates from China (86.8%),India, Mexico, and Bangladesh (100%), with environmental isolates accounting for a large proportion (85.9%).L2 primarily contained isolates from America (50.7%),Denmark, Australia, and France (100%), with environmental isolates accounting for a large proportion (75.4%).L3 only contained biotype 3 isolates from Israel.The number of L3, L4, and L5 isolates was small, but these lineages had a high proportion of clinical isolates (75%).L6 only contained three isolates, of which two were environmental isolates and one was unknown.The overlap in the geographic distributions of the six evolutionary lineages, for both environmental and clinical isolates was considerable, indicating lineages are not type-or geographic location-specific.

Novel pathogenicity-associated variants captured through GWAS
To explore novel genetic variants of V. vulnificus, we analyzed GWAS results of the 518 V. vulnificus isolates.We used Pyseer to identify variants in the whole genome sequences of V. vulnificus associated with clinical and environmental phenotypes based on SNPs, insertion and deletion of accessory genes, and k-mers, respectively.To reduce the false positive rate, we used a strict threshold to judge the analysis results in combination with prior knowledge (Lees et al., 2018), i.e., only genetic variation with a statistical test P-value below 1.74 x 10 -9 was included [ -log 10 (Pvalue) >8.76; Figure 2].We discovered 567 genetic variants related to the clinical phenotypes of V. vulnificus, located on 567 coding genes at the k-mers level.Twenty-eight genetic variants were identified at both the SNPs level and k-mers level, and eleven genetic variants were identified at both the insertion and deletion of accessory genes level and k-mers level.These variants involved genes that encode structural proteins, transport proteins, metabolic proteins, and signal proteins; thus, they can directly or indirectly affect the pathogenicity of V. vulnificus through their roles in energy production and conversion, cell cycle control, translation, ribosomal structure and biogenesis, replication, recombination, and repair, cell wall/membrane/envelope biogenesis, cell motility, intracellular transport, nucleotide/amino acid/carbohydrate/ coenzyme/lipid transport and metabolism, inorganic ion transport and metabolism, post-translational modification, protein turnover, chaperone functions, and signal transduction mechanisms.Moreover, the purH (encoding bifunctional purine biosynthesis protein PurH) (Kim et al., 2003), and pldA (encoding phospholipase A1) (Koo et al., 2007) genes identified in this analysis have previously been experimentally verified as associated with the pathogenicity of V. vulnificus, thus confirming the effectiveness of our method.
We also used DBGWAS to test the association between k-mers and clinical vs. environmental phenotypes.Overall, we found 100 nodes related to the clinical phenotypes of V. vulnificus, located on 20 coding genes [P-value = 2.10 x 10 -7 , -log 10 (P-value) >6.68; Figure 3].These variants also involved genes that encode structural proteins, transport proteins, metabolic proteins, and signal proteins.The gene products in this group are involved in energy production and conversion, cell membrane biogenesis, nucleotide/amino acid transport and metabolism, protein turnover, and signal transduction mechanisms.In addition, purH (Kim et al., 2003), and pldA (Koo et al., 2007) genes identified by Pyseer in this analysis were also identified by DBGWAS, which have previously been experimentally verified as associated with the pathogenicity of V. vulnificus, thus confirming the effectiveness of our method.

Detection of co-selection signals
Current GWAS methods are not accurate enough to identify causal variants, due to extensive genome-wide linkage disequilibrium and they ignore the effect of mutations on protein function (Chen and Shapiro, 2015).Therefore, we performed GWES to identify mutations throughout the genome that are Pyseer results for V. vulnificus.The X-axis is average effect size, and Y-axis is the -log 10 (P-value) score of Pyseer.Labeled black dots identify genes that carried significant k-mers detected in current study.Labeled red dots identify genes that carried both significant k-mers and SNPs detected in current study.Labeled yellow dots identify genes that carried both significant k-mers and SNPs detected in current study.DBGWAS results for V. vulnificus.Each subgraph represents a unique genetic and its mapped genes.Colors are continuously interpolated between blue for clinical phenotype and red for environmental phenotype.Gray indicates untested unitigs (present in > 99% or < 1% of the strains).Insignificant nodes are represented with a degree of transparency.The size of a node is proportional to its allele frequency: the larger the node, the higher the allele frequency.

likely to have co-evolved, i.e., potential epistatic interactions between
We used SpydrPick on the whole genome sequences of 518 V. vulnificus isolates to assess potential epistatic interactions between genes.Most strong associations occurred on the chromosome between sites within 3 kb.We thus eliminated all sets of associations that spanned less than 3 kb, including those between core/accessory genome elements, to remove associations caused solely by physical linkage, which mask the co-adaptive signals we seek.After screening (extreme outlier threshold ≥0.28), 28,508 coadaptation groups were identified that could be further subdivided into 166 co-adaptation networks containing 464 core-genome SNPs (53 intergenic SNPs, 221 synonymous SNPs, 177 nonsynonymous SNPs, and 13 nonsense SNPs) on 34 core genes and 750 accessory genes (Supplementary Table S3; Supplementary Figure 1).The genes included in each co-adaptation group, as well as the corresponding annotation information, are available in Supplementary Table S4.
The network of epistatic interactions was examined to identify potential V. vulnificus virulence genes.The high-frequency genes in the largest co-adaptation network (N1) were flgK, flgE, and flgL (Supplementary Table S4).As a result, we considered flgK, flgL, and flgE genes to be relevant prospective targets of co-selection, i.e., potential V. vulnificus virulence genes.The genes flgK (Kim et al., 2008), flgL (Kim et al., 2008), and flgE (Lee et al., 2004) identified in our co-adaptation network have previously been shown (through deletion mutants) to affect the lethality of V. vulnificus in mice, lose the ability to form flagella, lose mobility, and exhibit serious defects in cell adhesion and biofilm formation, thus confirming the effectiveness of our method.Our GWES results combined with GWAS results showed that a total of 6 genes, namely purH, gmr, yiaV, dsbD, ramA, and wbpA, were associated with the pathogenicity of V. vulnificus clinical isolates (Figure 4).The purH gene encodes PurH, which is involved in purine nucleotide biosynthesis and catalyzes the last two steps in the de novo biosynthesis of IMP (the first nucleotide in the de novo purine biosynthesis pathway).The purH gene has previously been shown (by deletion mutants) to alter the lethality and cytotoxic activity of V. vulnificus in mice and is a virulence gene of V. vulnificus (Kim et al., 2003).Five genes, namely purH, gmr, yiaV, dsbD, ramA, and Networks of interacting co-selected gen-gene pairs.Each node denotes a gene under co-selection.Zhang et al. 10.3389/fcimb.2023.1254379Frontiers in Cellular and Infection Microbiology frontiersin.orgwbpA, were newly discovered in this study.The gene gmr Cyclic di-GMP phosphodiesterase regulates the enzyme for synthesis of cyclic di-GMP.Cyclic di-GMP has emerged as one of the most common and essential bacterial second messengers, with the ability to influence biofilm formation, motility, virulence, the cell cycle, differentiation, and other processes, hence influencing V. vulnificus pathogenicity (Römling and Amikam, 2006;Römling et al., 2013;Jenal et al., 2017).The gene yiaV (encodes inner membrane protein yiaV precursor) (Hu and Coates, 2005;Shimada et al., 2022) and dsbD (encodes disulfide interchange protein dsbD precursor) affect antimicrobial drug tolerance of Escherichia coli (Missiakas et al., 1995).The gene ramA (encodes R-stereoselective amidase) affects antimicrobial drug tolerance of Klebsiella pneumoniae (Veleba and Schneiders, 2012).The gene wbpA encodes UDP-N-acetyl-D-glucosamine 6-dehydrogenase, which is involved in the synthesis of LPS.It has been previously demonstrated that the wbpA deletion mutant can affect the synthesis of LPS in Pseudomonas aeruginosa.LPS can evade host defenses, resist phagocytosis and serum-mediated killing and is also a crucial virulence factor for V. vulnificus (Burrows et al., 2000;Jones and Oliver, 2009).

Complex structure and ecological group differentiation of co-adaptation networks
The largest co-adaptation network (N1) included most of the interacting SNPs, which challenged interpretation due to its abundance of pairwise interactions.We thus used hierarchical clustering based on N1 variants to classify variants into two ecological groups (EGs), EG1 and EG2 (Figure 5A).EG1 contained L1 and L2 variants, while EG2 included variants within L1, L2, L3, L4, L5, and L6.EG1 isolates (1/59) had a lower percentage of clinical samples than EG2 isolates (86/459), indicating a lower virulence potential of this ecological group in humans.Despite the substantial number of coadaptation-related differences between EG1 and EG2, both ecological groups had no obstacle to gene exchange between groups in most regions of the genome (Figure 5B).
We detected 4 co-adaptation networks (N54, N78, N79, and N80) involving SNPs that were interactions of the type "incompatibility", meaning that when one or more genes exist, at least other genes are absent.We detected a further 161 coadaptation networks (N2-N53, N55-N77, and N81-N166) involving accessory genes that were "genome island-like".

Multidimensional phenotyping
To assess the potential role of coadapted gene complexes on ecologically important phenotypic traits, we conducted a preliminary investigation of isolates from the two EGs (n = 5, 2 isolates for EG1, 3 isolates for EG2) to measure relevant phenotypes: growth rate under temperature challenges (4°C, 37°C, and 45°C), osmotic challenges (2, 4, 6, and 8% NaCl), and pH challenges (pH 4,  5, 6, 7, 8 and 9); motility assay, biofilm formation ability, survival under anaerobic conditions, biochemical parameters, and antimicrobial susceptibility.Neither EG1 nor EG2 isolates grew at 4°C or 45°C.EG2 isolates had a faster growth rate than EG1 isolates at 37°C.All V. vulnificus grew best at 2% NaCl concentration, but EG2 isolates were more tolerant to high salt environments than EG1 isolates (Figure 6).The growth states of the two ecological groups were similar at acidic (pH 4-5) and alkaline (pH 9) conditions.EG2 isolates grew faster than EG1 isolates at pH 6-8.In addition, acidic conditions (pH 4-5) are not suitable for the growth and reproduction of V. vulnificus.Neutral conditions (pH 6-7) favor the growth and reproduction of V. vulnificus.Alkaline conditions (pH 8-9) not only impede the growth and reproduction of V. vulnificus but also inhibit its activity (Figure 7).The growth variation of two ecological groups under temperature challenges, osmotic challenges and pH challenges occurred mainly during the stabilization and senescence phases of bacterial growth, with little change during the logarithmic growth period.In motility assay, both EG1 and EG2 isolates had strong swimming and swarming abilities, but there was no difference (Figure 8).The biofilm formation ability of EG2 isolates was stronger than that of EG1 isolates (Figure 8).As expected, all isolates failed to grow under strict anaerobic conditions (Supplementary Figure 2).Biochemical assays detected no difference among isolates (Supplementary Table S5).Minimal inhibitory concentration (MIC) values were employed to assess antimicrobial susceptibility.Isolates were sensitive to most of the antibiotics tested, but insensitive to cefazolin (Supplementary Table S6).Cefazolin, as a first-generation cephalosporin, has limited effectiveness against Gram-negative bacteria, making them more susceptible to developing resistance.This is attributed to the relatively weak stability of cefazolin to b-lactamase produced by Gram-negative bacteria.Previous studies have reported the resistance of V. vulnificus to cefazolin (Wang et al., 2009;Pan et al., 2013).

Discussion
Our results combined GWAS and GWES and applied them to the field of V. vulnificus for the first time.In this study, we used a robust mix of WGS, GWAS, and GWES, as well as phenotype description, to analyze 518 genomes of V. vulnificus isolates collected worldwide from a variety of sources to investigate genetic diversity, virulence genes, the drivers of genotypic and phenotypic evolution, and the emergence of coadapted gene complexes.Our phylogenomic analysis indicates that V. vulnificus has diverged into six distinct lineages, L1 through This result improves resolution of V. vulnificus population structure, identifying one more lineage than a previous study (Roig et al., 2018).The six lineages overlap extensively across biogeographic regions.Given that V. vulnificus cannot permanently colonize large animals, seabirds, or other aquatic animals that migrate long distances, we speculate that human activities, such as shipping and global trade in aquatic products, have facilitated the spread of V. vulnificus across oceans, allowing for genetic exchange between bacteria from different seas.This has ultimately affected the structural composition and evolutionary patterns of bacterial populations.
According to our GWAS results, thirteen genes are associated with the pathogenicity of V. vulnificus clinical isolates, of which eleven were newly discovered in this study.The purH (Narisawa et al., 2005), and pldA (Naparstek et al., 2014) genes identified in our analysis have been experimentally linked with the pathogenicity of V. vulnificus.These findings highlight the reliability of our methodology and the potential of GWAS for investigating pathogenesis in bacteria.
We were successful in examining potential epistatic interactions using GWES to identify co-evolved proteins and thus hypothesize potential networks of functionally linked genes.Our GWES results combined with GWAS results showed that a total of six genes correlate with the pathogenicity of V. vulnificus clinical isolates, of five (gmr, yiaV, ramA, and wbpA) were newly discovered in this study.Our findings can be applied to find candidate targets for vaccine development against V. vulnificus and provide an important foundation for further insight into the pathogenic mechanism of this species.Future studies should consider gene knockout and in vivo infection experiments on these targets.
Furthermore, we discovered that all V. vulnificus isolates could be separated into two EGs based on core-genome SNPs by GWES.To further elucidate the evolutionary drivers behind the divergence of the two EGs, we conducted phenotypic experiments.The two EGs exhibit distinct ecologies based on our investigation of V. vulnificus phenotypes, genomic evolution, and distribution.EG1 members appear to be more adapted to low-nutrient, lower-salinity, and brackish-like water environments, whereas EG2 members were more likely to come from rich-nutrient, higher-salinity, ocean-like environments.Our phenotypic experiments indicate EG2 isolates can withstand a wider range of stressors and have a competitive advantage in colonizing and growing in diverse hosts, which might explain why most isolates were obtained from this cluster.It was noteworthy that the proportion of clinical samples in EG1 isolates Temperature and salinity survival assays of V. vulnificus isolates from two EGs.Growth curves of isolates from two EGs measured using A 600 at different temperatures (4°C, 37°C, and 45°C), and salinities (2% NaCl, 4% NaCl, 6% NaCl, 8% NaCl).
(1/59) was lower than that in EG2 isolates indicating a lower potential of this ecological group in humans.Interesting findings here include the clear presence of clinical isolates in both EG1 and EG2 that confirm work that "clinical isolates" are not phylogenetically grouped.We speculate that these phylogenetically distant clusters occupy distinct niches that may differ in their natural hosts or habitats, and this physical isolation may lead to obvious evolutionary pressures that reduce the likelihood of encounters and recombination, leading to genetic isolation and the emergence of distinct ecotypes, with potentially devastating consequences for aquaculture and human health.Despite unresolved issues, our findings emphasize the central role of lateral motility in constructing ecologically significant variation within the species.We also found that the co-adaptation networks went through progressive stages, consistent with the previous research results in Vibrio parahaemolyticus (Cui et al., 2020), such as accidental combination, semi-stability, stability, and the emergence of new species.These data provide insight into the evolution of V. vulnificus and its potential as an emerging infectious disease.

Conclusions
We combined WGS, GWAS, and GWES, as well as detailed phenotypic analysis of V. vulnificus to reveal the presence of six phylogenetic lineages that are related to the geographical distribution of isolates.Human activities such as shipping and trade in global aquatic products may contribute to the transoceanic transmission of V. vulnificus.The results obtained from GWAS and GWES reveal complex genomic variations, providing a rigorous prediction of which genes were essential for V. vulnificus Acid-base survival assays of V. vulnificus isolates from two EGs.Growth curves of isolates from two EGs measured using A 600 at different pH (pH 4, pH 5, pH 6, pH 7, pH 8, pH 9).clinical isolates' pathogenicity.Particularly, genes gmr, yiaV, dsbD, ramA, and were identified as of the co-selection maps.Additionally, V. vulnificus isolates were clustered into two EGs (EG1 and EG2) with distinct patterns of bacterial behavior in our investigation.Our study can be utilized to better understand the pathogenic mechanism and evolution of V. vulnificus.

FIGURE 1
FIGURE 1Population structure of V. vulnificus.A Maximum Likelihood (ML) tree of 518 V. vulnificus isolates was constructed based on 373,704 core-genome SNPs.The ring colors from inner to outer indicate sample type and country of sampling, respectively.Branch colors indicate lineage: orange for L1, purple for L2, green for L3, blue for L4, pink for L5, gray for L6.

FIGURE 3
FIGURE 3 FIGURE 5 The largest co-adaptation network (N1) identified by SpydrPick.(A) The x-axis is the 518 V. vulnificus isolates and the right y-axis indicates coadaptation loci.Colors of the heatmap indicate the status of genetic variants, with light yellow/orange for allele1 and allele2 of the SNP allele.The left clustering tree indicates different tiers (Tier 1 and Tier 2).The upper clustering tree indicates (from top to bottom) different EGs (EG1, and EG2), lineages, and sources (clinical vs environmental sample).(B) A NJ tree of Vibrio isolates.EG1 isolates are indicated with green branches and EG2 isolates with blue branches.