Vibrio vulnificus mutation rate: an in vitro approach

Vibrio vulnificus is a multi-host pathogenic species currently subdivided into five phylogenetic lineages (L) plus one pathovar with the ability to infect fish due to a transmissible virulence plasmid. This plasmid (or a fragment of it) has been transmitted between lineages within the species, contributing to the evolution of V. vulnificus. This study aimed to provide an experimental approximation to the V. vulnificus mutation rate by determining spontaneous mutation rates from bacterial cultures of representants of the different lineages by whole-genome sequencing. To this purpose, synonymous SNP differences, i.e., spontaneous mutation not subjected to the evolutive forces, between initial and final culture after serial growth were evaluated and used for mutation rate calculation.


Introduction
Vibrio vulnificus is an emerging zoonotic pathogen that inhabits brackish water ecosystems from temperate and subtropical areas. In recent years, the pathogen has expanded its geographical distribution to coastal areas near the Arctic Circle due to global warming (Baker-Austin et al., 2013, 2017. V. vulnificus survives in water, either associated with the mucous surfaces of algae and aquatic animals or as a free-living bacterium that can be concentrated by filtering organisms such as oysters, which are its main reservoir (Oliver, 2015;Baker-Austin and Oliver, 2020).
The diseases caused by this species are globally known as vibriosis. Fish vibriosis is a hemorrhagic septicemia that occurs in farms as epizootics or outbreaks of high mortality . V. vulnificus infects fish by contact, colonizes the gills, enters the blood, and causes death by septicemia in <24 h (Marco-Noales et al., 2001). Human vibriosis can be acquired by either ingestion of raw seafood or contact of wounds with seawater or carrier fish (Oliver, 2015;Baker-Austin and Oliver, 2020). In the first case, the pathogen colonizes the intestine and causes gastroenteritis and/or primary septicemia. In the second case, the pathogen crosses the skin barrier causing local but severe necrosis and/or secondary septicemia. The main factor that predisposes to death by sepsis is a high level of serum iron because of multiple pathologies (e.g., hemochromatosis, diabetes, cirrhosis, and viral hepatitis) (Oliver, 2015;Baker-Austin and Oliver, 2020).
V. vulnificus was defined as a bacterial species in 1976 (Farmer, 1979) and was later split into three biotypes (Bts) based on a few biochemical and serological differences as .
This study aimed to estimate the MR for both chromosomes of V. vulnificus in strains representative of three main phylogenetic lineages by using an "omic" approach through the identification of synonymous SNPs (sSNPs) present in the same strain after 30 days of serial 24 h cultures. Our estimate of the MR can be used to predict the divergence time between strains or variants, thus providing a temporal scale for relevant evolutionary and epidemiological events in V. vulnificus.

Materials and methods
Bacterial isolates, culture conditions, DNA extraction, and sequencing The strains (Table 1) were routinely grown in tryptone soy broth or agar plus 5 g/l NaCl (TSB-1 or TSA-1, Pronadisa, Spain) at 28 • C for 24 h. The strains were maintained both as lyophilized stocks and as frozen stocks at −80 • C in marine broth (Difco) plus 20% (v/v) glycerol.

Serial cultures
Tubes containing 10 ml of fresh TSB-1 were inoculated with a single colony from TSA1-plates and were incubated at 28 • C for 24 h. A volume of 100 µl from each grown tube was inoculated in a new tube containing 10 ml of the same medium and this was incubated again at 28 • C for 24 h. The procedure was repeated daily for 1 month. To control the bacterial growth in each tube, the number of bacteria was estimated by drop plate (Herigstad et al., 2001) at times 0, previously to each new inoculation. Plates were incubated for 24 h at 28 • C. The number of UFC was expressed in UFC/ml. This calculus shows the information referent to the number of daily inoculum and the growth rate of the day to check the proper evolution of the experiment.

Number of generations
The number of generations was calculated using the expression (Maier, 2009) from the equation:

Number of generations = log 2 cells at end of incubation cells at beginning of incubation
where the initial number of bacteria is the initial inoculum (number of UFC) and the total number of bacteria is the accumulative number of bacteria obtained from the drop-plating counts.
Sequencing, filtering reads, and adapter removal DNA was extracted using GenElute TM Bacterial Genomic DNA (Sigma, Spain). Samples with a DNA concentration of 10-15 ng/µl were used for sequencing using the Illumina Genome Analyzer .
Technology GAII flow cell in the Genome Analysis Center in Norwich (UK). To this end, unique index-tagged libraries for each sample (up to 96 strains) were created using TruSeq DNA sample preparation for subsequent cluster generation (Illumina cBot), and up to 12 separate libraries were sequenced in each of eight channels in Illumina Genome Analyzer GAII cells with 100 base paired-end reads. The index-tag sequence information was used for downstream processing to assign reads to the individual samples (Harris et al., 2010). The quality of the FASTQ files was checked using FastQC (http://www.bioinformatics.babraham.ac. uk/projects/fastqc/). After the quality analysis, based on the results obtained, the adapters were removed using the program Cutadapt (Martin, 2011); after this step, the preprocessing of the data was carried out using Prinseq (Schmieder and Edwards, 2011). The preprocessing was carried out by filtering terminal indetermination in both ends (N), and sequences with mean quality (Phred quality score) lower than 26 or with more than 5% of indetermination or size smaller than 150 bp were removed. Finally, bases, in both ends, with quality lower than 28 were trimmed. A filtering quality of the reads was performed based on quality scores and the presence of Ns in the sequence. Finally, unpaired reads were removed from the preprocessed fastq files using a custom script. Raw data can be accessed as an SRA project with the accession number PRJNA534103.

Variant calling
The preprocessed reads were mapped using the Bowtie2 software (Langmead and Salzberg, 2012), as previously described (Roig et al., 2018) using the corresponding genes of the strain YJ016 (BA000037.2-BA000038.2) as reference. As a reference to plasmids, selected based on previously described plasmid distribution in the species (Roig and Amaro, 2009), the plasmid of the strains YJ016 and CECT4602 (AP005352.1, AM293859, and AM293860) was used. The variant calling was performed using the pipeline GATK with Picard and Samtools (Li et al., 2009;McKenna et al., 2010;Pirooznia et al., 2014) using the default options of each tool.

Mutation rate and divergence time calculation
The sSNPs were selected because we assume that they are neutral or nearly neutral in terms of selection and therefore allow for relatively unbiased estimation of SNP accumulation (Hershberg and Petrov, 2010). The previously described expression (Galloway-Peña et al., 2012). Synonymous SNP sites were calculated by elucidating the sSNP sites for each codon within the core genome genes .
Present in the core of the strains with those plasmids. (Roig et al., 2018). The number of potential sSNP positions is listed in Table 2. The calculation was performed by analyzing all codons of the reference sequence, and for each codon, the positions where synonymous and non-synonymous changes can be produced were analyzed. Finally, the VCF files obtained for each sampling time were compared to obtain the sSNPs. Those positions in which the change, synonymous or non-synonymous, occurred between sampling times, regardless of their base present in the reference, were considered as a change.
The time required for sequence differentiation between strains can be estimated from the number of SNPs accumulated during a known time of change between the ancestral and derived strains (Foster et al., 2009;Galloway-Peña et al., 2012;Sandberg et al., 2014). The following equation was used to obtain an approximate age of divergence for each pairwise comparison: Divergence time = number of sSNPs (num of possible sSNP×MR×num. of generations per year× 2) .

Number of generations
The final average number of generations calculated using the previously expressed formula turned out to be 215 (213 ± 2 for 94385, 214 ± 1 for 162, and 219 ± 4 for CECT4604) in 30 days of continuous culture experiment between the initial inoculum and the final culture. In the absence of real data for V. vulnificus, as previously mentioned, this number of generations will be considered the number of generations for 1 year of environmental growth.

Characteristics of the genomes
Two different strains representative of lineages 2 and 3 plus one additional strain representative of one of the three geographically specific lineages (L4) were used in the study (Table 1). A total of 2,867 chromosomal genes conformed the core genome of the three strains, defined as the set of genes with a length of 80% and a DNA identity higher than 70% with respect to the homologous gene in the reference genome (Roig et al., 2018). The accessory genome contained 2,229 genes in strain 94385, 3,082 genes in strain CECT4604, and 2,479 genes in strain 162. The distribution of genes, total size, and synonymous sites in the three strains for the different chromosomes and plasmids are shown in Table 2.
The accessory genome contained 2,229 genes in strain 94385, 3,082 genes in strain CECT4604, and 2,479 genes in strain 162. From these accessory genes, 58 genes corresponded to a conjugative plasmid present in strain 94385, and 67 and 69 genes to virulence and conjugative plasmids, respectively, both present in strain CECT4604. The distribution of genes, total size, and synonymous sites in the three strains for the different chromosomes and plasmids are shown in Table 2.

Mutation rate
The obtained average number of generations in 30 days was 215. This value corresponds to a generation time of 3.35 h under lab growth conditions. MRs were estimated separately for the chromosomes and plasmids in the three studied strains (Tables 3-7). The average MR estimate for the core genome was 7.291E-08; for the accessory genome, the average MR was 1.370E-07; and for the pan genome, the average MR was 8.643E-08 mutations per base pair per generation. The MR estimated for each isolate was different, for CECT4604 and 162 strains, the MR was higher for the accessory chromosome II than for chromosome I. In contrast, strain 94385 showed a higher MR in the accessory genome of chromosome I than in the accessory genome of chromosome II; these changes also were reflected in a higher MR for chromosome I than in chromosome II for the pan genome (Tables 4, 5).

The ratio of non-synonymous to synonymous substitutions
Attending to the ratio of non-synonymous to synonymous substitutions (dN/dS), the type of selection forces on the evolution of this bacterial species can be evaluated (Kryazhimskiy and Plotkin, 2008). Values of dN/dS < 1 indicate the purifying selection, a dN/dS = 1 indicates neutral evolution, and a dN/dS > 1 indicates positive selection. The values obtained for the studied strains are summarized in Table 8. Based on these results, strain 162 showed a very high ratio in all the genomic compartments and subdivisions of the genome, which is indicative of adaptive evolution. The same results could be found for strain 94385 but with a lower bias, but for strain CECT4604 not all the genomic subdivision showed this adaptative evolution as the core genome presented purifying (chromosome II and both chromosomes as a unit) or neutral selection (chromosome I). This strain, CECT4604, in the accessory genome showed the lowest ratio among the three studied strains.

Divergence time
The divergence time can be estimated using the MR and the number of changes in the genomes of the studied strains, specifically the core genome of the species. The core genome was described by Roig et al. (2018). The size of the core genome is presented in Table 2. The time was estimated using the . /fmicb. .  The results for the studied strains in this study are shown in Table 9, and the divergence time for all strains studied in the core genome article (Roig et al., 2018)

Discussion
The MRs obtained for the core genome of both chromosomes of the three selected V. vulnificus isolates confirm the established theory on the evolution of both chromosomes in Vibrio (Cooper et al., 2010). According to this theory, chromosome II contains most of the genes involved in environmental adaptation, whereas chromosome I contains most of the housekeeping genes. Consequently, chromosome II is subject to a lower force of purifying selection and exhibits a higher evolutionary rate (dN and dS), a lower positive asymmetry of the rate distributions of orthologous sets, a lower codon usage bias, and, finally, a higher MR. For the accessory genome, only two of the strains studied, CECT4604 and 162, confirmed this theory; however, the third strain showed a higher MR for chromosome I than for chromosome II.
The first two strains belong to lineages II and III, while the third strain belongs to lineage IV, a minority lineage formed, so far, by European strains, so the result obtained cannot be standardized to this lineage until more strains are isolated and their genomes are sequenced. In any case, these results are consistent with the great biological and genetic diversity of this species (Oliver, 2015;Heng et al., 2017;Roig et al., 2018). For example, the rtxA1 gene is present in this accessory genome, a gene encoding a multidomain toxin essential in virulence that is present on chromosome II (Roig et al., 2011(Roig et al., , 2018. A lower chromosome II accessory genome MR could therefore pose adaptive value for those species that require genes in the accessory genome for survival. Based on the dN/dS values, it appears that strains from L3 (162) and L4 (94385) are under positive selection pressure, the results for strain 162 showing the highest value while strain from L2 seems to be in "evolutionary neutrality". This strain belongs to a clonal complex that emerged in fish farms in the 70s of the past century. This clonal complex appears to be quite stable, having been able to spread worldwide (Roig et al., 2018). Our results showed that the divergence time for chromosome I varied .
/fmicb. .  On the other hand, the divergence time for lineage II was 259 years, this value being the second lower number of years found for lineages with more than one representative. The most compact lineage in terms of divergence time is L3. This lineage emerged in the 90s of the last century. The divergence time between the strains studied is <1 year and the dN/dS rate is so high, indicating that this lineage is in full clonal expansion, which is consistent with previous studies. These findings joined with the periodical outbreaks of V. vulnificus (López-Pérez et al., 2021) seem to indicate that the evolution of the species is not only influenced by the intrinsic MR but also be conditioned to external factors that exert selection pressure. This effect of selection pressure was previously studied by the authors in the selection of antibiotic-resistant clones under the presence of quinolones (Roig and Amaro, 2009). Our results point to the fact that V. vulnificus is at a biological expansion level, with a lead to MR than other studied species; in E. coli, the calculated MR is 2.2 × 10 −10 (Lee et al., 2012) mutations per base pair per generation, and in Bacillus anthracis, the calculated MR is 5.2 × 10 −10 (Galloway-Peña et al., 2012).
. /fmicb. . These findings alongside the colonization of new habitats due the climate change can lead to a rise in the number of infections, not only at the human level but at fish levels as well. This assumption is stated in the possibility of the emergence of new phenotypes with new putative host and virulence features. Subsequent studies were conducted considering different culturing conditions, such as the presence of compounds such as blood or variations in temperature, which can induce differential expression of virulence-related genes and may potentially play a role in the MR, in this case by positive selection. This experiment could provide valuable insights into the evolution forces of the species. However, it is important to note that the objective of this study is focused solely on determining the MR under standard conditions. Therefore, future studies could explore the impact of various culturing conditions on both the MR and the differential expression of virulence-related genes. This could involve investigating a larger sample size or incorporating additional strains to enhance the generalizability and provide a more comprehensive understanding of the relationship between culturing conditions, MRs, and virulence-related gene expression.

Data availability statement
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: https://www.ncbi.nlm. nih.gov/genbank/, PRJNA534103.

Author contributions
FR designed the work and wrote the manuscript. FR, AA, and JC performed the experimental and statistical analysis. All the authors contributed to the discussion and improvement of the manuscript. All authors contributed to the article and approved the submitted version.

Funding
This study was partially funded by Departamento de Ciencia, Universidad y Sociedad del Conocimiento, from the Gobierno de Aragón (Spain) (Research Group T71_23D). This study also forms part of the ThinkInAzul program and was supported by grant THINKINAZUL/2021/027 from MCIN (Ministerio de Ciencia e Innovación de España) with funding from European Union NextGeneration EU (PRTR-C17.I1) and GV (Generalitat Valenciana). The study was also supported by grants PID2020-120619RB-I00 funded by MCIN/AEI/10.13039/501100011033 and AICO/2020/076 funded by GV. This study was also supported by Universidad San Jorge (internal call for Open Access publication grants 2022-2023).