Genome Characterization and Comparison of Early Mortality Syndrome Causing Vibrio parahaemolyticus pirABvp− Mutant From Thailand With V. parahaemolyticus pirABvp+ AHPND Isolates

Citation: Thadtapong N, Salinas MBS, Charoensawan V, Saksmerprome V and Chaturongakul S (2020) Genome Characterization and Comparison of Early Mortality Syndrome Causing Vibrio parahaemolyticus pirAB Mutant From Thailand With V. parahaemolyticus pirAB AHPND Isolates. Front. Mar. Sci. 7:290. doi: 10.3389/fmars.2020.00290 Genome Characterization and Comparison of Early Mortality Syndrome Causing Vibrio parahaemolyticus pirAB Mutant From Thailand With V. parahaemolyticus pirAB AHPND Isolates


BACKGROUND
Early mortality syndrome (EMS) has emerged as a devastating disease in shrimp production aquaculture since it was firstly discovered in China (2009), and onward in Malaysia (2011), Thailand (2012, and Mexico (2013) (Thitamadee et al., 2016). This disease leads to 100% shrimp mortality within 30 days after the infection and thus, economic loss of over USD 1 billion per annum worldwide (de Schryver et al., 2014;Zorriehzahra and Banaederakhshan, 2015). Penaeus monodon and P. vannamei shrimps in the post larva stage are the hosts of the EMS causative Gram-negative bacterium, V. parahaemolyticus (Flegel, 2012).
V. parahaemolyticus is a Gram-negative bacterium inhabiting in marine and estuarine environments (Tran et al., 2013). The same bacterium is also one of human foodborne pathogens that cause gastroenteritis after consumption of contaminated seafood (Letchumanan et al., 2014). Toxins of V. parahaemolyticus acting as porins in enterocyte plasma membrane and interrupting ion flux in intestine are thermostable direct hemolysin (TDH) and TDH-related hemolysin (TRH). TDH and TRH associate with V. parahaemolyticus hemolytic and cytotoxicity activities in the host. Another hemolysin is thermolabile hemolysin (TLH), which also causes red blood cell lysis (Wang et al., 2015). Moreover, asymptomatic infection or presence of pathogenic strains in shrimp farms or marine environments can be a reservoir for human infection.
Nowadays, EMS-causing V. parahaemolyticus isolates have been classified into two groups based on their histological signs: acute hepatopancreatic necrosis syndrome (AHPND) and non-AHPND (Joshi et al., 2014). AHPND V. parahaemolyticus, e.g., the 5HP strain, carries a 69-kb plasmid named pVA1 that contains V. parahaemolyticus Photorhabdus, insect-related (Pir) toxin-like genes, pirA vp and pirB vp (Lee et al., 2015;Lin et al., 2017). These PirAB vp toxins damage the hepatopancreatic cells by necrosis, called sloughing cells. The shrimps infected by the AHPND V. parahaemolyticus normally completely die within 30 days. Gross signs of infected shrimps by AHPND strains show empty stomach and midgut, and pale hepatopancreas (Zorriehzahra and Banaederakhshan, 2015). On the other hand, the non-AHPND strains, e.g., the 2HP strain, cannot produce PirAB vp toxins (Joshi et al., 2014). Infected shrimps by the non-AHPND strains still show empty stomach and pale hepatopancreas. Hepatopancreas tubule epithelia show thinness and numerous vacuoles in E-cell or collapsed cells. In immersion challenge testing, 2HP causes 75% of shrimps to die within 96 h-post infection; in contrast, the AHPND isolates (1D, 3HP, 5HP, and "China") cause 100% mortality in shrimp in the same period of time (Joshi et al., 2014). 2HP may maintain other virulence factors that causes collapsed lesion in shrimp. Recently, we learned that the non-AHPND V. parahaemolyticus 2HP carries three plasmids: 183-kb pVP2HP, 65-kb pVP2HPa, and 50-kb pVP2HPb (Theethakaew et al., 2017). The 183-kb pVP2HP shares genetic element of combined 69-kb pVPE61a (containing pirA vp and pirB vp genes) and 118-kb pVPE61b, but pirA vp and pirB vp genes are absent (Theethakaew et al., 2017). However, virulence factors in non-AHPND V. parahaemolyticus that cause collapsed lesion and lead to EMS pathogenesis are still unclear. The virulence factor candidates in the non-AHPND V. parahaemolyticus 2HP might be maintained in its chromosomes (I and II). In order to provide better insights into the virulence factors in the non-AHPND V. parahaemolyticus, for the first time to the best of our knowledge, we performed whole genome sequencing of V. parahaemolyticus 2HP, and comprehensively compare with the existing genomes of the AHPND V. parahaemolyticus isolates (3HP, 5HP, and China), as well as non-pathogenic V. parahaemolyticus S02 (Yang et al., 2014). The virulence genes and pathogenicity islands were predicted in order to explore putative factors that might be involved in non-AHPND EMS pathogenesis. In addition to the pathogenic protein-coding genes, non-coding RNAs (ncRNAs) are one of the factors regulating many processes of cell and affect protein expression or activity (Gottesman and Storz, 2011). Some ncRNAs involve in regulation of virulence factors and pathogenesis Wang et al., 2016;Pappesch et al., 2017). Thus, we also predicted ncRNAs from the newly sequenced non-AHPND V. parahaemolyticus 2HP to investigate its involvement in EMS pathogenesis. In addition, antibiotic resistant genes were predicted which suggested the incident of drug resistance in the shrimp pathogen. The genome sequences of non-AHPND and AHPND isolates from Thailand and other Asian countries were compared to elucidate common and unique mechanisms of pathogenesis in EMScausing V. parahaemolyticus.

DATA DESCRIPTION
Bacterial Strain, Culture Conditions, and Genomic DNA Extraction V. parahaemolyticus 2HP was cultured in tryptic soy broth (TSB) supplemented with 1.5% NaCl. The bacterial culture was incubated at 30 • C and shaken at 200 rpm overnight. Then, bacterial cells were collected by centrifugation. Genomic DNA of V. parahaemolyticus 2HP was extracted by the QIAamp DNA Micro kit (Qiagen, Hilden, Germany). Quantity and quality of extracted genomic DNA was measured by the Denovix spectrophotometer (Denovix, DE, USA) and 1.5% agarose gel electrophoresis.

Genomic Sequencing and Annotation
Whole genome sequencing library of genomic DNA was prepared by the QIAseq FX DNA library kit (Qiagen, Hilden, Germany) and the library was sequenced on the Illumina HiSeq 2500 platform (Illumina, CA, UAS). Sequences were trimmed and assembled de novo by the CLC genomic workbench version 11.0.1 (CLC bio, Denmark). De novo assembled contigs were evaluated for quality, and genome size was estimated against V. parahaemolyticus RIMD2210633 (Makino et al., 2003) used as the reference genome using QUAST (Gurevich et al., 2013). Non-coding RNA, rRNA, and tRNA were predicted using Infernal (Kolbe and Eddy, 2011), RNAmmer (Lagesen et al., 2007) and Aragorn (Laslett and Canback, 2004) available from the PROKKA pipeline (Seemann, 2014) via the Galaxy server (Afgan et al., 2018). Genes and virulence factors were predicted using Rapid Annotation using Subsystem Technology tool kit (RASTtk) (Brettin et al., 2015). Pathogenicity islands (PAIs) and mobile elements were predicted using VRprofile 2.0 (Li et al., 2018a). HMMer E-value at 0.0001 is set as a cut-off. Minimum number of the significant hits for each prophagelike region was 5. Other strains of AHPND and non-pathogenic V. parahaemolyticus were selected to analyze and against 2HP as well. The V. parahaemolyticus 3HP, 5HP, and China isolates were all AHPND strains, obtained from GenBank using the accession numbers JPKS00000000, JPKT00000000, and JPKU00000000, respectively (Yang et al., 2014). The V. parahaemolyticus S02 (accession number JPKV00000000) was a non-pathogenic strain used in this comparison (Yang et al., 2014). Antibiotic resistant genes were predicted by ResFinder 3.1 (Zankari et al., 2012) and PATRIC 3.5.39 (Wattam et al., 2017). The comparative genomics were performed by protein family sorter program with PATRIC genus-specific families (PLfams) as a filter in the PATRIC 3.5.39 server (Wattam et al., 2017). This program uses k-mer-based functional assessment to categorize the protein family formation and differentiates the cluster of protein families across different genomes using Markov Cluster algorithm (MCL), MCL inflation value = 3 as a cut-off (Davis et al., 2016). The comparison of zot and ace nucleotide sequences were performed by Multiple Sequence Alignment, EMBL-EBI (Madeira et al., 2019) and BLASTN, NCBI (Johnson et al., 2008).

Genome Characteristics of V. parahaemolyticus Strain 2HP
The genome assembly of V. parahaemolyticus 2HP is consisted of 5,501,309 bp in 194 contigs. There are 97 RNAs i.e., 4 rRNA, and 93 tRNA. The G+C content of this assembled genome is 45.2%. The quality assessment of draft genome from QUAST showed that N50 (representing the contig length that longer or equal length contigs presents half of the bases of the assembly) and L50 FIGURE 1 | Subsystem category distribution of V. parahaemolyticus 2HP (results of RASTtk annotation). Subsystem coverage shows sets of functional roles in specific biological processes (green box) and non-specific processes (blue box). Overall genome annotation is represented in a pie chart of subsystem category distribution. Different colors in pie chart explain details and numbers of genes in subsystem feature counts.
(representing the minimum number of contigs that present half of the bases of the assembly) were 183,055 bp and 10, respectively. Two chromosomes of 2HP were estimated to be 2,996,005 and 1,724,740 bp for chromosome I and II, respectively. RASTtk analysis showed that, from the whole genome, the number of coding sequences was 5,424 and the number of functional subsystems was 406 (Figure 1).

Virulence and Antibiotic Resistance Genes in V. parahaemolyticus Strain 2HP
From the RASTtk analysis, we found four encoding homologs of V. cholerae toxins, which are also similar to the V. parahaemolyticus AHPND strains (Yang et al., 2014). Six toxin genes are two zona occluden toxins (zot), two accessory cholera toxins (ace), the gene encoding transcriptional activator ToxR (toxR), and the gene encoding transmembrane regulatory protein ToxS (toxS). 2HP carries two different sizes of putative ace, 345 and 333 bp, renamed putative ace1 and ace2, respectively. 3HP and 5HP also have two types of putative ace, but the China strain has only one putative ace1 (345 bp). The 2HP zot is also consisted of two types with different sizes, 1,401 and 1,425 bp, called putative zot1 and zot2, respectively. Similar with ace, 3HP and 5HP carry two types of zot. A unique gene found only in the China strain is a putative zot (1,386 bp), which has a different size when compared to the Thai isolates. The zot1 and ace1 genes are maintained on chromosome I while zot2 and ace2 are carried on chromosome II. We also found only one type of hemolysin in the current draft genome, the thermolabile hemolysin or tlh gene. These toxin genes might play a role in the EMS pathogenesis in shrimps. In addition, the presence of zot and ace genes, which are virulence factors in clinical field , indicates that this 2HP strain could cause disease in human, despite lacking thermostable directed hemolysin (tdh) and TDH-related hemolysin (trh) genes (Castillo et al., 2018). Presence of zot1, zot2, ace1, and ace2 genes was confirmed by PCR (Figures S1-S3).
In addition to the virulence genes and putative PAIs described, RASTtk also annotated 34 genes that are putatively responsible for resistance to antibiotics and toxic compounds: four genes for resistance to fluoroquinolones, four genes for multidrug resistance efflux pumps, one gene for resistance to chromium compounds, 11 genes for cobalt-zinc-cadmium resistance, 14 genes for copper homeostasis ( Table S5). The genome analysis on ResFinder 3.1 (% identity threshold and minimized length were 80 and 60% as cut-offs) and PATRIC (based on database of National Database of Antibiotic Resistant Organisms) found presence of one beta-lactam resistant gene, beta-lactamase CARB-type (blaCARB-41), and two tetracycline resistant genes, tet-34 and tet-35. In aquaculture, tetracycline is one of common drugs for the treatment of bacterial diseases, e.g., vibriosis (Neela et al., 2007), and beta-lactam drugs (such as ampicillin and penicillin) have been applied in human and animal treatment, FIGURE 2 | Comparative analyses of (A) prophage structures based on VRprofile prediction, (B) protein families based on PATRIC genus-specific families, PLfams, and (C) ncRNAs (C) (based on Infernal prediction) among non-AHPND V. parahaemolyticus (2HP), V. parahaemolyticus AHPND (3HP, 5HP, and China), and non-pathogenic V. parahaemolyticus (S02). (A) The gray, purple, and dark blue blocks represent genes of outer structure of prophage, prophage, and PAI, respectively. Red arrows and numbers show boundary and start/stop positions of prophage structures. Labeled regions show regions of rstB, zot, ace, and parA-like genes. (B) shows five-circle Venn diagram of distribution of protein families across different V. parahaemolyticus genomes. The orange, purple, green, yellow, and blue circles represent 2HP, 3HP, 5HP, China, and, S02, respectively, with number of total genes in each strain. The overlapped areas show shared protein families among strains, and numbers of proteins found in all strains are shown. Numbers of specific features in AHPND (3HP, 5HP, and China) and EMS (AHPND strains and 2HP) areas show in red and blue, respectively. (C) shows ncRNA prediction in V. parahaemolyticus isolates. Orange, green, and blue circles represent 2HP, AHPND strains, and S02, respectively, with the number of predicted ncRNAs of each strain. The overlapped areas represent the shared predicted ncRNAs among strains, and numbers of ncRNAs found in all strains are shown. (Fernandes et al., 2013). Our finding supports the significance of antimicrobial resistance gene surveillance in aquaculture.

Comparisons of Pathogenicity Islands Prediction With V. parahaemolyticus AHPND Isolates
We next investigated the common and unique PAIs in our newly sequence non-AHPND 2HP and other existing EMS V. parahaemolyticus AHPND 3HP, 5HP, China, and the reference S02 strains. We found the consistency of predicted prophage in the EMS-causing strains on chromosome I, but not in the non-pathogenic strain (Figure 2A and Table S1). Prophage carrying zot1 and ace1 genes was found in 3HP and 5HP strains, while China strain was presented with zot and ace1 in prophage. Those prophage structures are closely related to pre-CTX prophage (lacking ctxAB in CTX prophage) (Pant et al., 2019). In V. mimicus, putative pre-CTX prophage was found in a clinical strain MB451, not in an environmental strain VM223 (Hasan et al., 2010). The putative pre-CTX prophage is consistently found in EMS strains and might be involved with EMS pathogenesis, in addition to pirAB vp .

Non-coding RNA Prediction and Comparison
The prediction of ncRNAs by the Infernal program (Kolbe and Eddy, 2011) showed 43 putative ncRNAs in 2HP, 42 in AHPND strains, and 41 in S02 (Figure 2C). We found that the C4, IsrK, and psRNA2 ncRNAs are present only in the EMS-causing strains (2HP, 3HP, 5HP, and China), but not in S02. C4 targets incoming bacteriophage and/or interact with bacteriophage transcription  while IsrK interferes with production of AntQ, an antitermination protein for transcription (Hershko-Shalev et al., 2016). The function of psRNA2; however, is still unclear (Gvakharia et al., 2010). For unique ncRNAs, mini-ykkc ncRNA, a class II guanidine riboswitch (Sherlock et al., 2017), is only found in 2HP, while PtaRNA1 and TraJ-II ncRNAs are found only in S02. PtaRNA1 is an antisense antitoxin ncRNA in XCV2162-ptaRNA1 system belonging to class I toxin-antitoxin system (Findeiss et al., 2010). TraJ-II is found in 5 ′ -UTR of traJ gene and functions in bacterial conjugation process (Li et al., 2018b). We validated the presence of predicted ncRNA in 2HP by RT-qPCR. Spot42, RyhB, and mini-ykkc were chosen as representatives and their transcript levels shown in (Figure S4) confirmed expression of the selected ncRNAs.

Comparative Pangenomic Analyses of EMS-Causing V. parahaemolyticus Isolates
Not only virulence factors lead to pathogenesis, but some genes known to be involved in the bacterial fitness, which are specific to each strain, may also play crucial roles in disease establishment (Diard and Hardt, 2017). Hence, 2HP may carry some features, in addition to zot and ace, which overlap with AHPND strains or features unique to 2HP which lead to EMS in shrimp. To investigate the strain-specific features and unique features in each group of AHPND and EMS, we selected 2HP, 3HP, 5HP, China, and S02 for pan-genomic analysis by the protein family sorter program in PATRIC ( Figure 2B). The number of unique genes in 2HP, 3HP,5HP,China,and S02 are 26,25,21,483, and 615 genes, respectively ( Table S4). The 2HP-specifc genes include one encoding a transposase and 25 encode hypothetical proteins.
Pangenome (all genes from all genomes) and core genomes (present in all the genomes) contained 6,287 and 4,128 genes, respectively. Accessory genome (genes present in specific strains) in AHPND (3HP, 5HP, and China) and EMS strains (AHPND strains and 2HP) contained 13 and 177 genes, respectively (Tables S2, S3). AHPND-specific genes are pirA vp , pirB vp , one gene encoding peptidase M20, and 10 hypothetical genes. EMS strain-specific genes compose of many genes encoding functional proteins in methylation and acetylation e.g., S-adenosylmethionine (SAM)-dependent methyltransferase, acetyltransferase, histone acetyltransferase HPA2. The rsbW gene, which encodes a serine kinase RsbW, is specific to the EMS groups. Free RsbW phosphorylates SypA leading to regulation of biofilm formation (Morris and Visick, 2010). The ORFs for capsular polysaccharide biosynthesis protein and YjbG polysaccharide synthesisrelated protein, important in capsule formation and adherence to surface (Ferrieres et al., 2007;Broberg et al., 2011), are also EMS-specific.
In terms of key virulence factors, nucleotide sequences of accessory cholera toxin and zona occluden toxin in China strain are different from those of EMS strains from Thailand (2HP, 3HP, and 5HP). The zot and ace1 from China strain are closely related to zot and ace in V. parahaemolyticus RIMD2210633 (clinical strain). The zot gene from China strain shows 97.62% similarity to RIMD2210633 strain while the zot1 and zot2 genes from Thai isolates share 50.73 and 51.06% nucleotide similarity, respectively, to RIMD2210633 strain. The ace1 gene from China strain shows 99.13% nucleotide similarity to RIMD2210633 strain while ace1 and ace2 genes of Thai isolates show 49.86 and 47.17% nucleotide similarity, respectively, to RIMD2210633. Therefore, zot1, zot2, ace1 (Thai), and ace2 genes are considered members of 2HP-3HP-5HP accessory genome which are different from those of RIMD2210633 and China strains.
In summary, we have for the first time sequenced the non-AHPND 2HP V. parahaemolyticus and comprehensively annotated the newly sequence genome. Our comparative genomic analysis revealed the "conserved" virulence genes and prophage commonly found in both pathogenic AHPND and non-AHPND strains. In V. cholerae, Zot increases permeability of epithelial barrier by disassembling the intercellular tight junctions, and Ace increases ion transport and water secretion . Sequences of zot and ace in 2HP, 3HP, and 5HP are different from those of clinical strain RIMD2210633. Therefore, the activities of Zot and Ace might be more specific to shrimp than human. However, zot and ace in the China strain are closely similar to those of RIMD2210633. Zot and Ace in this China strain might affect both human and shrimp. Both toxins in non-AHPND and AHPND strains might have similar functions in shrimp, similar to other hepatopancreatic toxins such as to PirAB vp toxins (Prachumwat et al., 2019). Prophage carrying zot and ace might be necessary for EMS pathogenesis in both AHPND and non-AHPND. Moreover, this prophage may be used as biomarker for diagnosis of EMScausing V. parahaemolyticus. Not only virulence genes and prophage, but ncRNA also showed specific patterns among the EMS strains. We also identified conserved ncRNAs in EMS strains involved in transcription process. These might be transcriptional factors for enhancement of EMS pathogenesis. From comparative genomic analysis, we found that most of the strains-specific genes were hypothetical genes. Interestingly, we found ORFs which encode regulatory protein and capsule formation proteins in EMS group. This finding suggests that EMS strains, AHPND and non-AHPND, carry the special features that S02 lacks. These genes might enhance EMS strains to induce the virulence and pathogenesis. The 2HPspecific genes contain a lot of hypothetical genes. 2HP may maintain some unique features that involve in EMS mechanism of the non-AHPND strain. The functions of these gene are still unknown. Further investigation is required to identify mechanisms of pathogenesis in AHPND and non-AHPND strain for EMS.

DATA AVAILABILITY STATEMENT
The genome data of V. parahaemolyticus 2HP from this study had been submitted to DDBJ/EMBL/GenBank under the accession number SRKW00000000. The version of genome sequence described in this work is the first version, SRKW01000000.

AUTHOR CONTRIBUTIONS
NT performed the experiments, analyzed the data analysis, and prepared the manuscript. MS assisted with antibiotic resistant gene prediction. VC, VS, and SC designed the study and prepared the manuscript. All authors proofread the manuscript.

FUNDING
NT was supported by funding from Thailand Graduate Institute of Science and Technology (TGIST), National Science and Technology Development Agency (NSTDA), Thailand (SCA-CO-2559-2502-TH). This study was supported by Mahidol University. VC was supported by the Thailand Research Fund (TRF) grants MRG6080235 and DBG6080003.