Examination of clinical and environmental Vibrio parahaemolyticus isolates by multi-locus sequence typing (MLST) and multiple-locus variable-number tandem-repeat analysis (MLVA)

Vibrio parahaemolyticus is a leading cause of seafood-borne infections in the US. This organism has a high genetic diversity that complicates identification of strain relatedness and epidemiological investigations. However, sequence-based analysis methods are promising tools for these identifications. In this study, Multi-Locus Sequence Typing (MLST) and Multiple-Locus Variable-Number Tandem-Repeat Analysis (MLVA) was performed on 58 V. parahaemolyticus isolates (28 of oyster and 30 of clinical origin), to identify differences in phylogeny. The results obtained by both methods were compared to Pulsed-Field Gel Electrophoresis (PFGE) patterns determined in a previous study. Forty-one unique sequence types (STs) were identified by MLST among the 58 isolates. Almost half of the isolates (22) belonged to a new ST and added to the MLST database. A ST could not be generated for 5 (8.6%) isolates, primarily due to an untypable recA locus. Analysis with eBURST did not identify any clonal complex among the strains analyzed and revealed 37 singeltons with 4 of them forming 2 groups (1 of them SLV, and the other a DLV). An established MLVA assay, targeting 12 total genes through three separate 4-plex PCRs, was successfully adapted to high resolution melt (HRM) analysis with faster and easier experimental setup; resulting in 58 unique melt curve patterns. HRM-MLVA was capable of differentiating isolates within the same PFGE cluster and having the same ST. Conclusively, combining the three methods PFGE, MLST, and HRM-MLVA, for the phylogenetic analysis of V. parahaemolyticus resulted in a high resolution subtyping scheme for V. parahaemolyticus. This scheme will be useful as a phylogenetic research tool and as an improved method for outbreak investigations for V. parahaemolyticus.


Introduction
Vibrio parahaemolyticus can cause acute gastroenteritis associated with consumption of raw or undercooked seafood (Nishibuchi and DePaola, 2005). Presently, this bacterium represents the most common cause for seafood-associated infections in the United States (Iwamoto et al., 2010). V. parahaemolyticus has shown a high rate of recombination and mutation which leads to a high genetic diversity (Gonzalez-Escalona et al., 2008). V. parahaemolyticus isolates are frequently characterized for their virulence gene profile, serotype, ribotype, and/or Pulsed-Field Gel Electrophoresis (PFGE) pattern for research studies and epidemiological investigations (Broberg et al., 2011;Jones et al., 2012;Paranjpye et al., 2012;Banerjee et al., 2014;Xu et al., 2015). However, there are several phylogenetic and evolutionary methodologies for differentiation of V. parahaemolyticus, such as Multi-Locus Sequence Typing (MLST) and Multiple-Locus Variable-Number Tandem-Repeat Analysis (MLVA), which may provide greater discrimination.
For this study, we selected two sequence-based typing methods, MLST and MLVA. MLST is based on sequence diversity of loci which are generally well-conserved, or under purifying selection, as is the case with housekeeping genes. MLST is a frequently used typing method for many organisms and the development of a public database (PubMLST) has simplified sequence analysis and identification of evolutionary relationships within bacterial species (Maiden et al., 1998;Perez-Losada et al., 2013). MLST was selected to examine the current strain collection based on the previous success of the technique for characterizing diverse environmental and clinical V. parahaemolyticus isolates (Gonzalez-Escalona et al., 2008) and to add a diverse set of isolates to the MLST database. MLVA has also been used to distinguish isolates with little genetic variation. MLVA uses PCR for amplification of size polymorphisms in several Variable-Number Tandem-Repeat (VNTR) loci (Lindstedt, 2005). The VNTRs are highly polymorphic and are well suited for differentiation of bacterial isolates (Lindstedt, 2005;van Belkum, 2007). MLVA was selected for use in this study as it is able to differentiate between indistinguishable PFGE patterns for V. parahaemolyticus (Hayat et al., 1993;Harth-Chu et al., 2009) or identical MLST sequence types (STs) in other organisms (Maiden et al., 1998(Maiden et al., , 2013. Recently, MLVA was used for epidemiological analysis for discrimination of clinical and environmental V. parahaemolyticus isolates with indistinguishable Direct Genome Restriction Enzyme Analysis (DGREA) patterns (Harth-Chu et al., 2009;Garcia et al., 2012). Generally, the PCR products from MLVA are separated by agarose gel or capillary electrophoresis (CE) (Lindstedt et al., 2013). However, differentiation of amplification products using high resolution melt (HRM) analysis has been described for MLVA assays in other organisms (Fortini et al., 2007).
The objective of this study was to evaluate the combined use of MLST and the HRM-MLVA sequence-based methods for discrimination of environmental and clinical V. parahaemolyticus isolates previously characterized by other fingerprinting methods, such as PFGE (Ludeke et al., 2014). MLST was selected to identify phylogenetic relationships while MLVA was applied with the hypothesis that it will further discriminate isolates with identical STs. In order to achieve this objective, an HRM-MLVA protocol for rapid and simple characterization of V. parahaemolyticus isolates was developed. This is the first study, to the authors' knowledge, to use the combined approach of the subtyping methods MLST, HRM-MLVA, and PFGE for differentiation of V. parahaemolyticus isolates.

Bacterial Strains
For the MLST and MLVA analysis, 58 V. parahaemolyticus isolates were selected; among those were 28 environmental (oyster) and 30 clinical isolates. Isolates were selected to represent multiple collection states and serotypes ( Table 1). Each isolate was inoculated into Luria Bertani broth with 1% NaCl and incubated with shaking overnight at 35 • C. Afterwards, 1 mL of the overnight culture was transferred to a 1.5 mL microcentrifuge tube, heated to 100 • C for 10 min, and placed in ice for 5 min. The samples were stored at −20 • C until used as a PCR template.

Multi-Locus Sequence Typing (MLST)
MLST was performed as described in the protocol for Vibrio parahaemolyticus (Gonzalez-Escalona et al., 2008). The PCR products were purified using a PCR purification kit (Qiagen, Valencia, CA) with a total elution volume of 25 µL. The purified samples were sequenced on an ABI 3730 × l sequencer at McLab (South San Francisco, CA). Sequences were analyzed with BioEdit software 7.1.9 (Abbott, Carlsbad, CA). The allelic and sequence type (ST) identification was determined using the MLST database (http://pubmlst.org/vparahaemolyticus). In the cases where whole genome sequence data was available, sequences were de novo assembled using CLC Genomics Workbench software 7.0.3 (CLCbio, Germantown, MD) and consensus assemblies submitted to the MLST database for allelic and ST identification.
For identification of clonal complexes, eBURST version 3 was used (http://eburst.mlst.net/). As reported previously, two different STs were considered single-locus variant (SLV) when they differed by a single locus; a double-locus variant (DLV) has two different loci (Gonzalez-Escalona et al., 2008). To be part of a clonal complex, isolates needed to share at least six out of seven alleles.
The minimum evolution tree of the concatenated sequences of the seven loci was built based on the method of Kimura-2parameter in Mega 6 (Tamura et al., 2013). The ratio between the number of synonymous and non-synonymous substitutions, showing the type of selection at each locus, was calculated using the method of Nei and Gojobori in Mega 6. The hypotheses of neutrality (d S = d N ), purifying selection (d S /d N > 1), and positive selection (d S /d N < 1) were tested.

Multiple-Locus Variable-Number Tandem-Repeat Analysis (MLVA)
Three multiplex real-time-PCR assays using HRM curve analysis were performed with the LightCycler R 480 High Resolution  Due to overlapping peaks in Multi A and C, the presence of the target loci VNTR5 and VNTR7, VNTR3, and VP2-07 were confirmed using simplex real-time-PCR for each gene with the same PCR reaction conditions and cycling parameters as described for the multiplex, but with omission of the three other primer sets. The melt curves were analyzed with BioNumerics software 6.6 with a customized script (Applied Maths, Austin, TX). This script compares the melting curves of each multiplex PCR, as well as a combination of all curves from the three multiplex PCRs. The combined dendrogram of all three multiplex PCR was built based on the Pearson correlation of average trend curves in BioNumerics as well. This dendrogram was also converted to a rendered rooted tree.
All loci showed ratios of synonymous and non-synonymous substitutions (d N /d S ) below 1 and therefore under purifying selection, as expected for housekeeping genes. eBURST analysis divided the 53 isolates for which a ST could be identified into 37 singletons and two groups: one SLV and one DLV (data not shown). No clonal complexes could be identified; demonstrating that none of the STs identified in this study share more than six alleles and, therefore, belong to different V. parahaemolyticus lineages.
A minimum evolution tree was constructed using the concatenated sequences of each allele (Figure 1). The isolates grouped into two main clusters, or lineages (I and II), with each lineage containing ST of clinical and oyster isolates. Isolates with the same ST generally had the same serotype; ST631 isolates possessed serotype O11:Kut, ST676 were serotype O8:Kut, ST36 were serotype O4:K12 or O4:Kut, and ST313 were serotype O1:Kut. However, the three ST3 isolates had all different serotypes.

Multiple-Locus Variable-Number Tandem-Repeat Analysis (MLVA)
The 58 V. parahaemolyticus isolates from this study were further analyzed by MLVA with HRM analysis. Three multiplex PCRs covering twelve different loci were used. Each multiplex PCR generated reproducible melting curve profiles of select isolates (data not shown).

Comparison of MLST, MLVA, and PFGE
Based on the hypothesis MLVA can differentiate isolates with the same ST and PFGE pattern, these isolates' MLVA patterns were compared to the MLST data, as well as previously published PFGE results (Ludeke et al., 2014). To compare these methods, dendrograms were built of the combined melting curves from the three MLVA multiplex PCRs and correlated to the PFGE cluster and ST of each isolate. MLVA allowed further differentiation of isolates with identical STs and PFGE clusters (Figure 2). Specifically, the isolates with ST3 and ST36 share the same PFGE cluster, but were distinguishable by MLVA melting curve profiles (Figure 3). The dendrogram with only ST3 and ST36 isolates showed ST-specific clusters, but separation within those clusters based on the combined melting curves of MLVA.

Discussion
This study analyzed 58 V. parahaemolyticus isolates by MLST and a newly-developed HRM-MLVA assay to investigate the FIGURE 1 | MLST minimum evolution tree of the 58 V. parahaemolyticus isolates. The tree was built with Mega six software using concatenated sequences. The scale represents the evolutionary distance and the branches show bootstrap values above 50%.
relatedness of the isolates. The seven gene MLST protocol reported in a previous study was employed in this study (Gonzalez-Escalona et al., 2008). A different MLST method using ten housekeeping genes was described (Yan et al., 2011). Both methods report the same level of discrimination; however, the seven gene protocol was selected due to the availability of a public repository for the data (http://pubmlst.org/vparahaemolyticus), which allows comparisons to be made with isolates analyzed by other researchers. With 22 novel STs, this study substantially contributed to the diversity in the MLST database. Four of our isolates were untypeable for recA. In a recently published study, a V. parahaemolyticus strain contained a recA gene that was fragmented by a 30 kb DNA insertion (Gonzalez-Escalona et al., 2015). It is possible a similar insertion exists in the recA gene of some of the strains in the current study, but further analysis is needed to confirm. ST3, ST32, and ST36 were the STs that occurred most often in our isolates, as well as in the public database. In our study, the fourth most frequent ST observed was ST676, which is one of the novel STs reported here. Two of these STs (ST3 and ST36) have been reported as part of clonal complexes CC3 and CC36, respectively (Gonzalez-Escalona et al., 2008), and correlated with outbreaks in multiple countries, including the US and Chile (Fuenzalida et al., 2006;Martinez-Urtaza et al., 2013). ST3 was identified as the ancestral ST of CC3 with ST27, ST42, and ST51 as SLVs. None of other STs from CC3 were identified in this study. A previous study using MLST on a set of clinical and environmental isolates from the Pacific Northwest Region of the United States showed that some environmental isolates were of ST3, suggesting a higher potential for virulence than other environmental isolates (Turner et al., 2013). In this study, three clinical isolates were ST3 and no direct relationship to environmental or other clinical clades were observed.
We developed an HRM-MLVA method, based on an existing MLVA method that uses CE, for subtyping of V. parahaemolyticus and to differentiate between similar PFGE patterns or STs. The CE method provides the actual number of tandem repeats while HRM does not. However, the HRM analysis still recognizes allelic variants and is able to distinguish between otherwise indistinguishable strains. For example, the ST3, ST32, and ST36 strains in this study also shared common PFGE profiles, but each isolate produced a unique HRM curve combination. This resolving power of HRM-MLVA is similar to previous reports of CE-MLVA, where Chilean isolates which shared a DGREA pattern and were ST3 could be differentiated by CE-MLVA (Gonzalez-Escalona et al., 2008;Harth-Chu et al., 2009). These data demonstrate that the HRM-MLVA method developed in this study provides similar discrimination as previously reported for the CE-MLVA method and is suitable examination of V. parahaemolyticus isolates. Additionally, the use of HRM-MLVA on the LightCycler R 480 saves the step of electrophoretic detection, thus minimizing the potential for cross contamination by PCR amplicons during that additional handling step.
The loci amplified in this study by MLVA are coding proteins such as a putative hemolysin (VPTR4) and putative collagenase (VPTR3) (Kimura et al., 2008). Most of these genes, could be amplified from the current strain selection with the exception of VPTR7 (Multi A), VP1-10 (Multi B), and VPTR6 (Multi C). Nearly all clinical isolates failed to amplify the VP1-10 locus and approximately half failed to amplify the locus VPTR7. A failure to amplify VPTR7 from some shellfish isolates has been reported previously (Harth-Chu et al., 2009). Fewer than 20% of isolates amplified VPTR6 in this study. Additionally, a previous study found VPTR6 to be one of the few loci with high genetic diversity (Harth-Chu et al., 2009). Together, these data suggest that these two loci might not be suitable targets for future MLVA studies, especially for environmental isolate screening. Nonetheless, the HRM-MLVA method successfully discriminated between otherwise indistinguishable V. parahaemolyticus isolates.
Previous studies have employed multiple methods for characterization and subtyping of V. parahaemolyticus isolates from various sources. For example, DePaola et al. used a combination of serotyping and ribotyping to identify types more highly associated with clinical isolates (DePaola et al., 2003). Turner et al. utilized REP-PCR, as fingerprint-based subtyping method, followed by MLST to identify region-specific clades of V. parahaemolyticus (Turner et al., 2013). Banerjee et al. employed PFGE, MLST, serotyping, and ribotyping to examine clinical V. parahaemolyticus isolates and provided combinatorial analysis to determine relatedness (Banerjee et al., 2014). However, none of these previous studies utilized the combination of two highly discriminatory, sequence-based methods as does the current study. This combined method approach described here using PFGE, MLST, and MLVA has not been previously reported for discrimination of V. parahaemolyticus isolates, but is similar to approaches used for other organisms: for example, the Centers for Disease Control and Prevention has used a combination of PFGE followed by MLVA for epidemiological investigations of STEC O157 to discriminate between closely related isolates (Hyytia-Trees et al., 2006). Also, a combination of MLST and MLVA has been used as an epidemiological tool for distinguishing between clones of Listeria monocytogenes (Chenal-Francisque et al., 2013).
This study used a combined method approach to increase the discrimination of V. parahaemolyticus isolates. MLST was able to determine the identity and the phylogenetic relatedness of the isolates in this collection. As hypothesized, the developed HRM-MLVA method further refined the relationship of isolates by being able to distinguish between isolates with indistinguishable PFGE groupings or STs. Our data demonstrates FIGURE 2 | Combined dendrogram of MLVA melting curves of the three multiplex PCRs built with BioNumerics software version 6.6. using Pearson correlation and the unweighted pair group method using arithmetic averages (UPGMA). Isolates originated from oysters starting with "FDA," isolates from clinical origin labeled with "CDC." The PFGE pattern designations are as previously reported (Ludeke et al., 2014).
FIGURE 3 | Dendrogram of MLVA melting curves of the three multiplex PCRs for the isolates carrying ST3 and ST36 built with BioNumerics software version 6.6. using Pearson correlation and the unweighted pair group method using arithmetic averages (UPGMA). The PFGE cluster designations are as previously reported (Ludeke et al., 2014). that a combination of PFGE, MLST, and HRM-MLVA would be the most suitable approach for outbreak and evolutionary investigations of V. parahaemolyticus, due to the high resolution provided. In instances where further discrimination is needed, and if available, next generation sequence data could be used to determine relatedness or to generate subtyping results in silico.