Evolutionary Diversity of Prophage DNA in Klebsiella pneumoniae Chromosomes

Mobile gene elements play an important role in the continuous evolution of the prophage DNA of bacteria, promoting the emergence of new gene structures. This study explored the evolution of four strains of Klebsiella pneumoniae harboring prophages, 19051, 721005, 911021, and 675920, and 16 genomes of K. pneumoniae from GenBank. The results revealed a wide range of genetic variation in the prophage DNA inserted into the sap sites of K. pneumoniae chromosomes. From analysis and comparison of the sequences of the 20 prophage DNAs determined from the four strains and the 16 GenBank genomes of K. pneumoniae using high-throughput sequencing and antimicrobial susceptibility tests, we identified a novel transposon, Tn6556. We also identified at least nine large genetic structures with massive genetic acquisitions or losses and five hotspot sites showing a tendency to undergo insertion of gene elements such as IS1T, IS1R, IS26, ISKpn26, ISKpn28, Tn6556, MDR, and In27-related regions as variable regions; however, the only highly conserved core genes were int and umuCD among the 20 prophage DNAs. These findings provide important insights into the evolutionary diversity of bacteriophage DNA contained in K. pneumoniae.


INTRODUCTION
Antimicrobial resistance (AMR) genes originate from environmental bacteria, especially from soil bacteria, which are thought to have evolved together with related antibiotic production organisms over thousands of years (Dcosta et al., 2011;Nesme et al., 2014). It usually takes several years of clinical use of drugs associated with mobile AMR genes for the genes to penetrate human pathogen populations (Lewis, 2013). Hundreds of mobile AMR genes have been found in Klebsiella pneumoniae (Holt et al., 2015;Navon-Venezia et al., 2017). The accumulation of AMR in these bacteria is mainly related to horizontal gene transfer assisted by plasmids and mobile gene elements (Pendleton et al., 2013), such as insertions (IS), transposons (Tn), and integrons (In). Evolutionary forces drive the growth of bacteria in many circumstances, and survival under less than optimal conditions is essential. Therefore, bacteria often acquire the ability to survive in the presence of an antimicrobial agent in an effort to adapt to their environment (Alanis, 2005). Bacteriophages (phages) are viruses that infect bacteria; they constitute a large and diverse group within microorganisms. It is estimated that approximately 20% of the total bacterial genome has been obtained from phage elements, indicating an evolutionary correlation between bacteria and phages (Brüssow et al., 2004). Phages induce antiviral immunity and prevent the elimination of bacterial infections (Sweere et al., 2019). A number of studies have shown that in mammals, phages are also related to bacterial colonization (Davies et al., 2016;Shen et al., 2019).
Phages that integrate into bacterial chromosomes are called prophages, which are important gene elements of bacterial chromosomes and enable horizontal gene transfer between bacteria and phages (Bushman, 2002). The interaction between phages and bacteria may have evolutionary benefits because many bacterial species have phage-derived factors in their genomes that benefit their survival, and such genetic exchange can be promoted given the appropriate ecological conditions (Calero-Cáceres et al., 2019). For example, Wang et al. (2010) has described how prophage DNA can help the host survive under multiple adverse conditions including the presence of antibiotics and environmental stress. Prophages also encode the virulence factors of many pathogens (Castillo et al., 2018;Deutsch et al., 2018). Attempts have been made to use phages to treat multidrug-resistant bacterial infections (Furfaro et al., 2018;Torres-Barceló, 2018;Dedrick et al., 2019;Kortright et al., 2019), but their function and significance remain unclear (Keen and Dantas, 2018). Only recently have studies confirmed that phages play more crucial roles than previously expected in the maintenance and dissemination of acquired resistance genes (Calero-Cáceres et al., 2019).
Klebsiella pneumoniae is an important conditional pathogen in hospitals and the environment. The global multidrug resistance (MDR) problem of K. pneumoniae is becoming more and more serious, and the antimicrobial treatment options for infection are limited (Rice, 2010). The role of prophages in highly virulent and multidrug resistant pathogens, including K. pneumoniae, has received much attention (Brown-Jaque et al., 2015). Building on previous studies, herein, four strains of K. pneumoniae, 19051, 675920, 721005, and 911021, from hospital patient specimens and 16 genomes of K. pneumoniae from GenBank were compared and analyzed for the presence of prophage DNA at the sap sites in an effort to confirm its importance as the basis of the horizontal transfer of resistance genes in the prophage DNA. A novel transposon, Tn6556, was discovered and the related properties of the prophage genes inserted into the sap sites of K. pneumoniae are described. This study describes the diversity of prophage genetic mutations, and deciphers the detailed molecular mechanisms of AMR for K. pneumoniae harboring prophage DNA.

Ethics Statement
This study was approved by the Ethics Committee of the Taizhou Municipal Hospital, Taizhou University, Zhejiang, China, and written informed consent was obtained from each of the participants in accordance with the Declaration of Helsinki. The rights of the research subjects were protected throughout, and we confirm that this study was conducted in our hospital.
The use of human specimens and all related experimental protocols were approved by the Committee on Human Research of the indicated institutions, and the protocols were carried out in accordance with approved guidelines.
Bacterial Strains and Sequencing of the 16S rRNA Gene Klebsiella pneumoniae strains 19051 and 721005 were isolated from urine samples of two patients in hospital in 2011 and 2013. K. pneumoniae strain 911021 was obtained from the urine of a patient in a teaching hospital in 2014. K. pneumoniae strain 675920 was obtained from a patient's sputum in another hospital in 2015. Escherichia coli TOP10 (Invitrogen, Carlsbad, CA, United States) was used as the host for cloning and azideresistant E. coli J53 was used as the recipient strain for the conjugation experiments.
For PCR amplification of the almost complete 16S rRNA genes of the K. pneumoniae strains, the following universal eubacterial primers were used: AGAGTTTGATYMTGGCTCAG (forward), and TACCTTGTTACGACTT (Y, T or C; M, A or C) (reverse). The length of the amplicon was about 1,500 bp (Frank et al., 2008). The Taq enzyme was a Fermentas Taq:Pfu 3:1 mixture (ThermoFisher Scientific, Burlington, VT, United States). The 30 µl reaction contained 1.5 U of enzyme. Amplification was carried out using a temperature program consisting of initial denaturation at 94 • C for 3 min, 30 cycles of denaturation at 94 • C for 40 s, annealing at 50 • C for 40 s, extension at 72 • C for 1 min, and final extension at 72 • C for 5 min. The PCR products were bidirectionally sequenced to confirm their identity.

Conjugative Transfer of Prophage DNA
Sodium azide-resistant E. coli J53 was used as a receptor, and the prophage-containing K. pneumoniae strains 19051, 675920, 721005, and 911021 were used as donors for the conjugative transfer experiments. Small amounts (10-20 µl) of donor and recipient glycerol bacteria were inoculated in 3 ml of brain heart infusion (BHI) broth (BD Biosciences, San Jose, CA, United States). After culture overnight, the donor bacteria were mixed with the recipient bacteria, centrifuged, resuspended in 80 µl of BHI broth, supplemented with a resuspended bacterial solution at the surface of a filter (filter size 1 cm 2 , pore size 0.45 µm), and the filter was adhered to BHI (BD Biosciences) agar plates. The cells were fully absorbed by the filter and cultured at 37 • C for 12-18 h. The bacteria were then washed from the filter and spotted on to the BHI agar plates containing 200 µg/ml sodium azide and 50 µg/ml amikacin to select conjugons harboring the prophage.

Sequencing and Sequence Assembly
Genomic DNA was isolated from each of the 19051, 721005, and 911021 isolates using a Qiagen Blood & Cell Culture DNA Maxi Kit (Qiagen, Hilden, Germany). Genome sequencing was performed with a sheared DNA library with an average size of 15 kb (ranging from 10 to 20 kb) on a PacBio RSII sequencer (Pacific Biosciences, Menlo Park, CA, United States), as well as a paired-end library with an average insert size of 400 bp (ranging from 150 to 600 kb) on a HiSeq sequencer (Illumina, San Diego, CA, United States). The paired-end short Illumina reads were used to correct the long PacBio reads utilizing proovread (Hackl et al., 2014), and then the corrected PacBio reads were assembled de novo utilizing SMART de novo 1 .
For the 675920 isolate, genome sequencing was performed with a Qiagen large construct kit with sequencing from a matepair library with average insert size of 5,000 bp using a MiSeq sequencer (Illumina, San Diego, CA, United States). Sequence assembly was performed as described previously (Feng et al., 2016). Briefly, the contigs were assembled using Newbler 2.6 (Nederbragt, 2014).

Phenotypic Assays
The method used for testing bacterial resistance was BioMérieux VITEK2, and the results were determined in accordance with the 2017 Clinical and Laboratory Standards Association (CLSI) guidelines (Clinical and Laboratory Standards Institute [CLSI], 2017).

Comparison of the Single Nucleotide Polymorphisms of the Backbone Sequence and Insertion Site of the Variable Region
The evolution tree of the nucleotide sequence for the single nucleotide polymorphisms (SNPs) of the backbone was inferred using the Maximum Likelihood method. The analyses were conducted using MEGA 7.0.14 software. The insertion gene elements of the prophage variable region were identified using MeV4.9.0 software.

Sequence Coverage and Identity of Prophage DNA, and Drug Resistance Genes or Mobile Gene Elements of the Variable Region
The hotmaps of sequence identity and coverage of prophage DNA and the comparative study of the drug resistance genes 1 https://github.com/ruanjue/smartdenovo 2 https://inkscape.org/en or mobile gene elements in the variable region were drawn with MeV 4.9.0 software.

Nucleotide Sequence Accession Numbers
Nucleotide sequence accession numbers of the prophage DNA sequences in the four strains, namely, 19051, 675920, 721005, and 911021, and the 16 genomes of K. pneumoniae from GenBank are shown in Table 2.

Antimicrobial Susceptibility Tests and Transferrable Features
The 16 S rRNA sequences were determined to be K. pneumoniae using BLAST. The results of the antimicrobial susceptibility tests on four isolates of K. pneumoniae (19051, 675920, 721005, and 911021) are shown in Table 1. After these tests, we used E. coli J53 as a recipient strain to conduct conjugative transfer experiments on 19051-sap, 675920-sap, 721005sap, and 911021-sap of K. pneumoniae harboring prophage. However, we could not create conjugons carrying prophage despite repeated trials.

Comparison and Analysis of Prophage DNA
The 20 prophages in this study were all identified from the K. pneumoniae chromosomes. A naming convention of XXX-sap was used to indicate where the prophage sequences were inserted into the sap sites of the chromosomes. The shortest prophage sequence was 18 kb and the longest was 65 kb, including 39-109 bp open reading frames. The average G + C content in the prophages was 49.6-53.6% (Figure 1, Table 2, and Supplementary Table S1). The region from 1 to 1,633 bp at the 5 end was the common core backbone region of all sequences. All sequences contained the core genes int and umuCD except for GOS436-sap and GOS442sap (Figure 1). The same 16 bp in the forward direction sequences (DRs: target site duplication signals) was confirmed at both sides of all sequences (Figure 1). A more detailed genomic comparison revealed that the same or different sites in the backbones of the prophages were interrupted by various insertion sequences, including IS1R, IS1T, IS2, IS26, ISKpn26, ISKpn28, Tn6556, MDR, and the In27-related region, except for in Goe154414-sap, KPNIH50-sap, and KPNIH39-sap ( Table 2). Compared with 34618-sap, the sequences of the backbone regions of the other prophages were gradually lost. Only int and umuCD were highly conserved core genes (Figure 1).

Massive Genetic Acquisitions or Losses Resulting in Evolution of Prophage DNA
At least nine large genetic structures that appeared to be from acquisitions or losses were identified in the 20 prophages in this study (Figure 2). First, IS2 was inserted    Table 2). Genes are indicated by arrows; characteristics of function and classification are marked with colors for genes and mobile gene elements. Shaded parts indicate homologous regions (nucleotide homology >95%).
Frontiers in Microbiology | www.frontiersin.org into the genes int (integrase gene) ( KP5-sap) and recT (recombinant repairing protein RecT) ( GOS442-sap), leading to both genes being disrupted into two parts, int-5 and int-3 , and recT-5 and recT-3 , but no base loss was induced. Second, the insertion of IS1T into the orf309 and orf396 genes was generally determined; however, these simple insertions might not delete bases in AR0152-sap, KP5-sap, 459-sap, and GOS442-sap. Third, ISKpn28 was inserted upstream of the hol (holin) gene, resulting in a 3.1-kb loss in 20046-sap, but no base variation occurred in 34618sap, GOS436-sap, 3562-sap, 20079-sap, and GOS442sap. Fourth Seventh, two IS26s were inserted into GOS436-sap, located in the middle and downstream of the orf3069 gene, splitting this gene into two parts, orf3069-5 and orf3069-3 , resulting in an 8.9-kb deletion in GOS436-sap. IS26 in CAV1392-sap inserted upstream of the hin gene resulted in a 13.4-kb loss, while the insertion of IS26 upstream of the hin gene in 3562-sap also did not cause any base loss. Eighth, among the 20 sequences analyzed, only 721005-sap containing the MDR region, located upstream of the retA gene (reverse transcriptase gene), resulted in a 34.1-kb loss, including hin and its upstream genes. Ninth, the exogenous insertion into 19051-sap of the In27-related region, located between the hin and exoVIII genes (deoxyribonuclease exoVIII), resulted in a 45-kb loss, involving part of the hin and exoVIII genes.

The MDR Region From 721005-Sap
The largest number of drug resistance genes in 721005sap was found to be located in the MDR region (Figure 3), followed by the Tn1548-related region, in which In27 in the original Tn1548 was replaced by In127, the truncated repAciN gene, and a 7.2-kb-long unknown functional region trb-to-pem, a 2.5-kb-long Tn2 remnant (including Tn2 IRL and truncated tnpA), and complete Tn6502 (containing the beta-lactamase resistance gene bla CTX−M−55 ). Tn1548 was shown to be a composite transposon flanked by IS26 with no forward direction repeats (DRs) at both ends and its structure was IS26-In27-ISCR1-ISEc28-armA (aminoglycoside resistance gene)-ISEc29-msr(E)-mph(E) (macrolide resistance gene)-orf543-repAciN-IS26. In27 in Tn1548 could be replaced by different class 1 integrons, forming many Tn1548-related elements (González-Zorn et al., 2005;Du et al., 2012). In127, unlike In27, had only one aadA2 (aminoglycoside resistance gene) gene cassette (Figure 3).

Evolution of the In27-Related Region From 19051-Sap
Tn1696 belongs to the Tn21 subgroup of the Tn3 transposon family. It was produced by insertion of the dissociation site (res) into the core backbone region by class I integron In4, and its structure was shown to be IRL (left reverse repeat)-tnpA (transposase)-tnpR (dissociation enzyme)res (dissociation site)-mer (mercury resistance site)-IRR (right reverse repeat sequence) (Partridge et al., 2001). The In27-related region in 19051-sap was derived from Tn1696, which was produced by five major evolutionary events in Tn1696 as follows. First, the truncated IS5075 was inserted into the middle of the IRR, resulting in only a 22-bp remnant in the IRR. Second, the incomplete In27 was inserted into the same res site of In4, and the res was truncated to the 5 end as a 75-bp remnant. The sequences of the In27 and In4 gene cassettes were completely different; the In27 gene cassette featured aadA2-gcuF-dfrA12, while the In4 gene cassette contained cmlA1-aadA2-orfE-aacC1. Moreover, the 3 -CS of In27 was absent. AadA2 and dfrA1 are genes conferring aminoglycoside resistance and trimethoprim resistance, respectively, and gcuF encodes a pseudo protein. The other three events included the insertion of ble (bleomycin resistance gene) and nimA (unknown function), the deletion of the mer gene, and the replacement of IS26 and IS5075 by IS6100. Downstream of IS5075 was GIsul2 (3 end of GIsul2: resG-orf225-ISCR2-glmM-sul2) and IS26. GIsul2 is a large mobile element carrying integrase (int), including the resolvase gene resG, several conjugative transfer genes, sulfonamide resistance gene sul2, and ISCR2 sequences, and is present in various bacteria (Nigro and Hall, 2011).

Evolutionary Analysis of the Prophage DNA Sequences
The evolutionary relationships of the 20 prophage DNAs are shown in Figures 4A-C, 5, 6. The identities of the related SNPs found in the backbone sequence were obtained with high confidence among the BAA2146-sap, INF164sap, 721005-sap, 675920-sap, 911021-sap, 20079-sap, CAV1392-sap, 3562-sap, GOS436-sap, 20046-sap, and 34618-sap (Figure 4A, all identified above a 93% confidence level, purple line), and the KP5-sap (Figure 4A, identified at a 96% confidence level, purple line). The identities of the related SNPs found in AR0152-sap and GOS442-sap were obtained at a 99% confidence level (Figure 4A, purple line). These results suggested that this region of the backbone sequence was largely homologous. However, regions of the backbone sequence where the identities of the SNPs were associated with a confidence level of less than 60% indicated that the nucleotide sequence of that backbone region was diverse compared with the above region. These results are consistent with the elements and sites of insertion shown in Figures 4B,C. Figure 4B shows the insertion sites of the genes or gene elements. Figure 4C graphically shows the inserted genes or gene elements. Taken together, Figure 4 summarizes the evolution and changes of the 20 prophage DNAs, including the sites and types of inserted genes or gene elements, suggesting there are contradictory features in phage DNA stability and variation. FIGURE 5 | Sequence coverage and identity among the prophage DNAs for four strains of K. pneumoniae and 16 prophage DNAs identified from GenBank. There is high sequence identity (average nucleotide homology >95%), but various coverage among the 20 prophage DNAs based on Supplementary Table S1. FIGURE 6 | The correlativity of drug resistance genes or mobile gene elements are determined with CARD, ResFinder, and Tn Number Registry databases among the four strains of K. pneumoniae and 16 prophage DNAs identified from GenBank. There are only 10 prophage DNAs harboring drug resistance genes, which are generally located on mobile gene elements such as the MDR region, In27-related region and novel Tn6556. Figure 5 illustrates the high identity of the partial sequence in the 20 prophage DNAs, indicating an average nucleotide identity (homology) of >95%, consistent with the shaded regions in Figures 1-3. However, the various coverages of the 20 prophage DNAs vividly suggest the polymorphism of evolution and change. Figure 6 reveals the relationship among the drug resistance genes or gene elements in the variable region, suggesting the everchanging characteristics of gene elements. These results suggest the high sequence identification and homology of prophage DNA and helped to characterize the evolutionary diversity of the prophage DNA examined in this study.

DISCUSSION
The analysis of four prophage DNA sequences and 16 similar prophage DNA sequences from GenBank indicated that they were all specifically inserted into the sap sites of K. pneumoniae chromosomes. It is well known that sapABC is a permease of the peptide transport system, which mediates the transport of phages or particles into K. pneumoniae cells; accordingly, the capsule biosynthesis of phage in K. pneumoniae is productive and regulated (Dorman et al., 2018). The sap ABC transporter promotes capsule production by increasing gene expression in the middle of the capsule (Dorman et al., 2018). Then, the sapABC system replicates and inserts phage DNA into the K. pneumoniae DNA, forming prophage DNA. Over time, after experiencing the above-mentioned massive genetic acquisitions and losses, including changes to the lysogenic transformation regions, DNA replication regions, and transcriptional regulatory regions of the prophage, among others, the bases of the backbone regions were gradually lost over the course of evolution. This would increase excessive insertions with or without drug resistance genes or gene elements such as IS1T, IS1R, IS26, ISKpn26, ISKpn28, Tn6556, In27-related, and the MDR region. These factors enable the host bacteria to reduce their metabolism and external pressures, promoting better adaptation to the environment, and revealing the evolutionary benefits of prophage DNAs. Over the course of prophage evolution, the int gene has always been present, suggesting that it is the most conserved core of the backbone region. Owing to the evolutionary selective pressures present during the process of integrating DNA into host bacteria, the prophage DNA inevitably undergoes gene mutations and deletions. Transfer of prophage DNA requires two reactions: first, excision of the prophage DNA from the host bacterium and second, integration of the prophage into a new host bacterium. The excision of the prophage involves a functional xis gene, but this is not absolute. If there is no Xis enzyme, the integration enzyme (Int) and the integral host factor (IHF) can complete the transfer of the prophage DNA.
Integrating a prophage into a new host requires a functional integrase (Int) followed by a specific binding site (attB) in the host, and any nucleotide changes in the core region of the binding site might result in failure of the prophage cell integration (Nash, 1981). Likewise, if the conjugate was integrated into a bacterial cell, the phage could express a protein that was toxic to the cell and kill all exconjugants. Therefore, the failure of the conjugate transfer experiment in this study is not surprising. It is generally accepted that DNA excision depends on phageencoded proteins, and the excision enzyme (Xis) and integrase (Int) are considered to be involved in the first step in prophage induction, allowing the prophage DNA to be excised and then replicated (Davidson, 2018). However, it has also been reported that several prophages of Staphylococcus aureus were found to have significantly delayed transcription of the xis gene compared with that of the genes encoding the proteins required for DNA replication and prophage-particle production. The result of this delay was the replication of prophage DNA in situ within the bacterial genome and subsequent encapsulation of the prophage DNA, which was still attached to the adjacent bacterial DNA (Chen et al., 2018;Davidson, 2018).
Based on the genes or genetic mobile elements investigated, some of prophages appeared functional and were even phenotypically expressed. Comparative analysis of the 20 prophage genomes from the sap site of the K. pneumoniae chromosome indicated that a positive repeat of 16 bp in length was observed at both ends of all prophages, and the sequences were identical, labeled as attL and attR (Figure 1, a small square flag at both ends of the sequence). Compared with the reference sequence 34618sap, the sequence of the backbone regions of the other prophages was gradually lost. However, the most stable core genes were int and umuCD (Figure 1). At least nine exogenous insertion regions including IS1R, IS1T, IS2, IS26, ISKpn26, ISKpn28, Tn6556, MDR, and In27related regions were determinate in this study (Table 1), in which Tn6556, MDR, and the In27-related regions were involved as resistance genes. Additionally, a novel composite transposon Tn6556 was discovered in the K. pneumoniae chromosome. Different insertion elements of a series of exogenous insertion regions including IS and Tn on the prophage had relatively specific insertion sites where ISKpn28 was located upstream of the hol gene, ISKpn26 was located upstream of the orf gene, and Tn6556 was located upstream of the hin gene. As the simplest transposition element, the shearing and transfer of IS was accomplished by its own transposase. Tn6556 was flanked by two repeating IS26s, and this repetitive IS mediated homologous recombination of Tn6556, and also resulted in genetic loses near the recombination site resulting in diversity in the prophage DNA.
Tn6556 containing In127 was discovered for the first time in this study (Tables 2, 3 and Figures 2-4, 6). The results of the antimicrobial susceptibility test shown in Table 1 indicated that the K. pneumoniae strains were highly resistant to various antimicrobial agents, suggesting that in addition to the phenotypic expression of the resistance genes or gene elements carried on the chromosome, one or more plasmids carrying resistance genes or gene elements might also be involved (Tables 1-3 and Figures 1-3, 6), and warrants further study. The results of Figure 2 summarizing the five sites of base insertion and nine types of base acquisition or loss in prophage DNA (Table 2 and Figures 2, 3, 4B,C) show in detail the molecular mechanism behind the evolution of the prophage DNA and the changes of AMR (Table 3 and Figure 6). AMR encoded in a bacterial chromosome might originate from new mutations or acquired AMR genes, which could be maintained by replication without selection pressure (Navon-Venezia et al., 2017). This process would facilitate the evolutionary adaptation of bacteria and has a potential impact on the monitoring and treatment of bacterial infections (Shen et al., 2019). Our results confirm that AMR genes are encoded in the chromosome of K. pneumoniae (Mathers et al., 2017), and they can be classified as "core AMR gene" or "acquired AMR gene" (Wyres and Holt, 2016). Furthermore, these results indicate that the incorporation of phage DNA has resulted in a diverse set of K. pneumoniae chromosomes, confirming their ever-changing roles in the evolutionary success of the bacterium. This study did not examine how the phages of the relevant gene locus are induced and transferred to other strains, and how these phages play a role in the transfer of AMR genes, and therefore these questions require further investigation in future studies.

CONCLUSION
The emerging diversity of prophages is the result of base acquisition or loss with a large number of variable regions being associated with insertions of different prophage DNA. This has resulted in various degrees of base loss in the backbone region near the insertion sites, conferring evolutionary diversity on prophage DNA.

DATA AVAILABILITY STATEMENT
The datasets generated for this study can be found in the GenBank.