Networking and Specificity-Changing DNA Methyltransferases in Helicobacter pylori

Epigenetic DNA base methylation plays important roles in gene expression regulation. We here describe a gene expression regulation network consisting of many DNA methyltransferases each frequently changing its target sequence-specificity. Our object Helicobacter pylori, a bacterium responsible for most incidence of stomach cancer, carries a large and variable repertoire of sequence-specific DNA methyltransferases. By creating a dozen of single-gene knockout strains for the methyltransferases, we revealed that they form a network controlling methylome, transcriptome and adaptive phenotype sets. The methyltransferases interact with each other in a hierarchical way, sometimes regulated positively by one methyltransferase but negatively with another. Motility, oxidative stress tolerance and DNA damage repair are likewise regulated by multiple methyltransferases. Their regulation sometimes involves translation start and stop codons suggesting coupling of methylation, transcription and translation. The methyltransferases frequently change their sequence-specificity through gene conversion of their target recognition domain and switch their target sets to remodel the network. The emerging picture of a metamorphosing gene regulation network, or firework, consisting of epigenetic systems ever-changing their specificity in search for adaptation, provides a new paradigm in understanding global gene regulation and adaptive evolution.


INTRODUCTION
The currently dominating model for adaptive evolution assuming selection from diverse genome sequences is derived from genetics and molecular biology of microorganisms and other forms of life, but its general validity became thoroughly testable only recently with the availability of many genome sequences within a species. Meanwhile studies of cell differentiation in multicellular organisms have revealed critical roles of epigenetics, here defined as information added to genome nucleotide sequences and heritable through genome replication. The diverse epigenomes, as opposed to the diverse genomes, may be regarded as the units of evolution (Figure 1A). There are indeed increasing lines of evidence for trans-generation epigenetic inheritance in plants and animals (Miska and Ferguson-Smith, 2016;Quadrana and Colot, 2016). In unicellular and equally dividing bacteria, a somatic cell can be regarded as a germ line cell, to the first approximation, which simplifies the issue. Sequencespecific DNA base methylation there affects gene expression (Sánchez-Romero et al., 2015) and its reversible changes can lead to different phenotype sets (De Ste Croix et al., 2017;Gorrell and Kwok, 2017). A DNA methyltransferase (MTase) is often paired with a restriction enzyme to form a restrictionmodification (RM) system (Roberts et al., 2003). They form a prokaryotic immune system attacking non-self DNA lacking proper epigenetic DNA methylation and also behave as selfish mobile elements (Vasu and Nagaraja, 2013).
Prokaryotes possess, on average, five DNA methyltransferase genes and three methylated sequence motifs (Blow et al., 2016;REBASE PacBio statistics, 2020). A large repertoire of DNA methylation systems, with twenty to thirty genes per genome, has been found in Helicobacter pylori Krebes et al., 2014). They colonize on children's stomach cells using their motility, persist there for decades, tolerating attacks with ROS (reactive oxygen species) by immune systems and repairing DNA damages, and may eventually cause ulcer and cancer (Backert and Yamaoka, 2016). Some of their RM systems are conserved, while some others are present only in few lineages ( Figure 1B; Vale et al., 2009;Uchiyama et al., 2016). Many are obtained through horizontal transfer, within the species or from a different species, and decay by mutation (Lin et al., 2001;Furuta et al., 2011;Furuta and Kobayashi, 2012b). In some RM types (Type I, Type III, Type IIG), target sequence recognition domains (TRDs) move within a gene and between genes, sometimes from various (eu)bacterial species beyond phylogenetic barriers (Furuta et al., 2011;Furuta and Kobayashi, 2012b). Because of these processes, the repertoire of methylation systems changes rapidly and apparently irreversibly during lineage diversification of H. pylori . Single-molecule real-time sequencing technology indeed revealed highly diverged methylomes even for closely related genomes . Each of the methylomes may have a unique gene expression pattern and phenotype set and might be regarded as the units of adaptive evolution ( Figure 1A; Furuta and Kobayashi, 2012a). Previous experiments have indeed found effects of individual DNA methylation genes on the transcriptome and phenotype in many bacterial species (Srikhanta et al., 2009(Srikhanta et al., , 2011Fang et al., 2012;Kumar et al., 2012;Furuta et al., 2014;Sánchez-Romero et al., 2015;Zhang et al., 2017;Nye et al., 2019).
In order to examine the above epigenetics-based adaptive evolution model, we here knocked out ten of specificitydeterminant genes of DNA methylation systems, examined the resulting methylomes and transcriptomes, and predicted and tested adaptation-related phenotype changes. We focused on Type III and Type I RM systems, which frequently change sequence specificity by domain movement (Furuta et al., 2011Furuta and Kobayashi, 2012b), and several uncharacterized Type II RM systems.
Our results revealed a huge gene regulation network that involves many, interacting and ever-changing DNA methylation systems as hub transcription factors ( Figure 1C). It controls gene expression and adaptive phenotype sets in a unique way.

Bacterial Strains, Culture Media and Plasmids
The strains, plasmids and their relevant characteristics are shown in Supplementary Table S3 of the supporting information. E. coli strain was cultured using Luria-Bertini (LB, Lennox) broth (10 g tryptone, 5 g Yeast extract, 5 g NaCl per 1L

DNA Manipulation and Mutant Construction
The oligonucleotides used in this study are listed in Supplementary Table S4. KOD FX neo (Toyobo, Ohtsu, Japan) was used as the DNA polymerase for PCR. E. coli HST08 chemical competent cells (Takara Bio, Kusatsu, Japan) and In-Fusion HD cloning kit (Takara Bio, Kusatsu, Japan) were used for conventional cloning. Gene-Elute TM plasmid DNA kit (Sigma-Aldrich, St. Louis, MO, United States) was used for plasmid purification.
For gene knockout experiments, a region of approximately 800 bp flanking an MTase gene or the specificity subunit gene was cloned into pUC119 or pUC19 together with a fragment containing a Km resistance gene (aphA) with a promoter derived from pHel3 (Heuermann and Haas, 1998 (Lin et al., 2001;Furuta et al., 2011;Furuta and Kobayashi, 2012b;Kojima et al., 2016 competent cells prepared in 300 µM sucrose by electroporation, and the transformants were selected on BB-FBS agar containing Km. After two rounds of single colony isolation, the clones were stocked. The absence of the original locus from these mutants was confirmed using PCR. To disrupt methyltransferase gene of a Type II RM system, R gene alone was replaced with the Cm gene first. Then, in the second recombination experiment, the Cm gene and the M gene were replaced together by the Km gene. In gene restoration experiments, the original methyltransferase gene (hpyAXM, hpyPVIIM) was PCRamplified from genomic DNA and joined with Cm resistance gene on pUC19 with their respective c.a. 800-bp upstream and downstream fragments. In parallel, a construct without the MTase gene was also made. These constructs were transferred into strain PIK14 ( hpyAXM) or PIK64 ( hpyPVIIM) (Supplementary Figure S4). The absence of the initially integrated Km allele from these mutants was confirmed by PCR and sequencing.

Methylome Decoding
Total DNA was extracted from a stationary-phase liquid culture of each H. pylori strain, PIK38, PIK39, PIK40, by lysozyme treatment and phenol-chloroform extraction. The DNA was further purified using Qiagen Genomic-tip 100-G column (Qiagen, Hilden, Germany) following the manufacture's protocol. The DNA was sheared to ∼20 kb using a g-Tube (Covaris, Woburn, MA, United States) and the libraries for sequencing were constructed with SMRTbell Template Prep Kit 1.0 (Pacific Biosciences of California, Menlo Park, CA, United States) following the standard instruction for 20-kb template preparation. In the process of size selection using BluePippin (Sage Science, Beverly, MA, United States), 10-50 kb molecules were collected with high-pass v3 program. Sequencing was performed using PacBio RS II (Pacific Biosciences of California, Menlo Park, CA, United States) with P5-C3 chemistry and 3-h movie. Data assembly was performed using the program HGAP (Chin et al., 2013) packaged in SMRT Analysis v2.2.0.
The re-sequenced strain P12 genome was annotated by MiGAP pipeline in National Institute of Genetics, Japan (Sugawara et al., 2009), and then the updated annotation was used for RNA-seq data analysis.
Methylomes of strain 26695, PIK14, PIK16, and PIK17 were decoded to confirm the loss of methylation at the expected sites. For this analysis, we used stationary phase cell culture and Genomic-tip 100-G column.

Transcriptome Analysis
Total RNA was extracted from H. pylori liquid culture at exponential phase (OD 600 nm = 0.40-0.45 in a 1-cm path cuvette) using a Pure-Link RNA mini kit (Lifetechnologies, Carlsbad, CA, United States). The rRNA was depleted from the total RNA using a RiboZero gram-negative kit (Illumina, San Diego, CA, United States) prior to cDNA synthesis. cDNA libraries of 200 to 400-bp fragment size were constructed using a TruSeq mRNA kit (Illumina, San Diego, CA, United States) for strain 26695 and its derivatives, or SureSelect Strand Specific RNA library kit (Agilent Technologies, Santa Clara, CA, United States) for P12 and its derivatives. Indexed cDNA libraries were pooled and sequenced in paired-end mode under a HiSeq 2500 platform with unrelated samples.
Reads of 100 bp were mapped on to the strain 26695 genome (RefseqID: NC_000915.1) or P12 genome (our re-sequenced data) using the BWA program, allowing 4% mismatches. This gave rise to 17 to 31 million mapped reads per sample for 26695 and its derivative samples, and 6.5 to 27 million mapped reads for P12 and its derivative samples. The counts of mapped reads per gene were obtained using HTSeq 0.6 (Anders et al., 2015). The read depth was calculated using BEDtools (Quinlan and Hall, 2010). Differential gene expression analysis was conducted for two sets of biological replicate data using Bioconductor package TCC (Sun et al., 2013). In TCC, we used a TMM normalization-edgeR iteration protocol. In this work, the genes with q.value (false discovery rate) < 0.01 and a.value (mean expression value in log 2 ) > 0.75 were considered to be differentially expressed genes (DEGs).
Pathway activity analysis (Lee et al., 2008) was performed using Bioconductor package GSVA (Hänzelmann et al., 2013). TMM-normalized read count data was used as input. 71 KEGG pathway categories that have at least five gene members in H. pylori genome were considered. Gene expression changes in members of representative KEGG categories were visualized using Pathview (Luo and Brouwer, 2013).
For stain P12, transcription unit data was not available. Therefore, we tentatively, defined regulatory regions of each coding sequence as start codon plus 149 bp upstream region (152 bp in total) for each coding sequence. Sequence motifs search for the genomic segments and sequence retrieval were conducted using standard functions implemented in R package Biostrings (Pages et al., 2003). We used Marlov Maximal order model as the number of expected motif count (Rocha et al., 1998). Motif frequency (Mr) was defined as the ratio of observed counts to expected counts.

qPCR
The qPCR was performed for validation of the transcriptome data (Supplementary Table S5). This was performed for total RNA before rRNA depletion, using a StepOnePlus real-time PCR system (Life Technologies), ReveTra Ace RT-PCR (Toyobo), and KOD SYBR qPCR kit (Toyobo) by the absolute quantitation method described in the manufacturers' protocols. Total RNA was first converted to cDNA, and then used as a qPCR template. A reference gene, which showed similar levels of expression in all the strains in the RNA-seq experiments, was used as a reference copy number for cDNA. This was HP1035 for 26695 and its derivatives' samples, lepB (HPP12_0582) for strain P12 and its mod gene KO samples, purD (HPP12_1183) for hpyPVIIM KO and paired P12 samples, and horF (HPP12_0684) for hpyPIM KO and paired P12 samples, respectively. The quantity of cDNA was determined on the basis of standard curves with R 2 > 0.99. The qPCR primers used are listed in Supplementary Table S4.

Growth
From the 50% glycerol stock, each strain was streaked on four TSA plates and incubated for 2 days. The colonies of one plate was pooled in 1 ml BB and then transferred to a Petri dish containing 10 ml BB-FBS and incubated at 37 • C in the presence of 10% CO 2 and 5% O 2 in a CO 2 incubator for 24 h. After incubation, the liquid culture was diluted to OD 600 nm = 0.1 in 30 ml fresh BB medium with 10% FBS and incubated in the CO 2 incubator with agitation (rpm-90) for 4 days, and OD 600 nm was recorded at 6, 12, 24, 30, 48, and 72 h (or 6, 12, 24, 36, and 48 h for strain 26695 data set) after inoculation. At the same time, serial dilution of each strain was made and spread on BB-HB agar. After 3 days of incubation, colonies were counted and cfu (colony forming unit) was calculated.

Oxidative Stress Resistance Test
Liquid culture was prepared in triplicate for each strain described above. They were then diluted in BB to OD 600 nm = 0.8. After dilution, 100 µl of the cell suspension was spread onto BB-HB agar with a cotton swab. A sterile paper disk (5 mm in diameter) was placed on the center of the plate. To assay peroxide sensitivity, 7.5 µl of a reactive oxygen species (ROS)-inducing agent [30% hydrogen peroxide (HP) (Wako Pure Chemical Ind. Ltd.) or 5% tert-butyl hydroperoxide (Wako Pure Chemical Ind. Ltd.) was placed on the center of the disk. After 3 days of incubation at 37 • C in the CO 2 incubator as described above, the diameter of the zone of inhibition (mm) was measured. The diameter of the disk was subtracted. To evaluate the significance for the difference between the control and the mutant, we used two-tailed Student's t-test or Welch t-test after testing equality of variance by F-test assuming normal distribution as we could have only small sample size in each experiment.

Mitomycin C Sensitivity Test
Liquid cultures in triplicate were prepared for each strain as described above and then diluted in BB to OD 600 nm = 0.8. After dilution, 100 µl of the cell suspension was spread onto BB-HB agar using a cotton swab. A sterile paper disk (5 mm in diameter) was placed on the center of the plate. To assay sensitivity against DNA damaging agent, 7.5 µl of mitomycin C (10 µg/ml) (Sigma-Aldrich, St. Louis, MO, United States) was placed on the center of the disk. After 3 days of incubation at 37 • C in the CO 2 incubator, the diameter of the zone of inhibition (mm) was measured as described above. The diameter of the disk was subtracted. The significance of the difference was evaluated using t-test as above.

Motility Assay
From the stock, each strain was streaked on three TSA plates and incubated Luria for 2 overnights Luria After incubation, 1 ml BB Luria was spread on the colonies and then they were scraped off with a sterile loop. Then 200 µl of the liquid culture was mixed with 10 ml BB-FBS Luria and incubated overnight at 70 rpm. Next day, the culture was diluted to OD 600 of 0.8 and 5 µl of the culture was spotted on the center of BB-FBS soft agar [0.35% agar (w/v)]. Optionally, 50 µg ml −1 triphenyl tetrazolium chloride (Sigma-Aldrich, St. Louis, MO, United Kingdom) was added to visualize pH shift around the cells. After 5 days of incubation, the diameter of bacterial swimming zone with central inoculation spot was measured. The significance of the difference was evaluated using t-test as described above.

Diverse DNA Methyltransferases and Their Target Motifs Shaping the Methylome
We knocked out each of the specificity-determinant genes of a total of 10 known/putative DNA methylation systems (Figure 2A) in two laboratory strains, P12 and 26695. Although both belong to hpEurope population (Backert and Yamaoka, 2016), they differ in the repertoire of RM systems ( Figure 1B). RM systems are classified into three (Roberts et al., 2003). Type I RM systems are composed of three subunits: a restriction (R) enzyme subunit, a methyltransferase (M) subunit, and a specificity (S) subunit usually with two target recognition domains (TRDs). In Type II RM systems, R enzyme and M enzyme separately recognize a sequence and M enzyme methylates both strands. In Type IIP RM systems among them, the recognition sequence is a palindrome. Type III RM systems are likewise composed of R and M proteins, but their TRD is present within the M protein and methylation takes place on one strand of a non-palindromic target sequence.
By decoding methylome of P12 and the knockouts, we detected modification of A or C in 15 motifs (Supplementary Table S1) [see also ]. A methylation occurred at 2,578/2,579 sites of the GACC motif in the chromosome, but its methylation was very much decreased in HPP12_1497 (Supplementary Table S1). We concluded that GACC is the recognition motif of the HPP12_1497 product for A methylation and propose to designate this Type III RM system as HpyPIX. The other two Type III RM loci in P12 examined (Figure 2A) are not related to any methylation activity detected (Supplementary Table S1). Methylation activity of locus 3 was not detected in Frontiers in Microbiology | www.frontiersin.org  the other strains examined Krebes et al., 2014;Lee et al., 2015). As this locus is conserved and fosters allelic TRD variation ( Figure 1B; Furuta and Kobayashi, 2012b;Kojima et al., 2016), it may have some unknown function. We further created P12 mutants knocked out for three Type II MTase genes (HpyPI, HpyPVII, HpyPVIII) and 26695 mutants for a Type II MTase gene (HpyAX), a Type III MTase gene (HpyAXI), and a Type I specificity gene (HpyAXIII) (Figures 1B, 2A). The 26695 knockout mutants lost methylation at the expected motifs (Krebes et al., 2014): HpyAX, TCGA; HpyAXIII, CTAN 8 TGT and ACAN 8 TAG; HpyAXI, GCAG (Supplementary Table S1).
An earlier work examined the distribution of methylation motifs along the genome . The genomic islands referred to as plasticity zones (Kersulyte et al., 2009), or Integrative Conjugative Elements (Fischer et al., 2010), apparently avoid occurrence of m4C, and m6A methylation . We now analyzed the distribution of each of the above methylation motifs in comparison with the expected frequency ( Figure 2B). Their average frequency turned out to vary greatly with GTAC and TCGA strongly avoided (Humbert and Salama, 2008;Zhang et al., 2017). The CATG motif of a conserved Type IIP RM system turned out to occur less in the genomic islands including cagPAI (cag pathogenicity island) (Supplementary Figure S1, left), but this is not the case for the other motifs ( Figure 2B). For GACC motif in particular, the sliding windows overlapping with PZ1 and PZ3 showed the highest abundance and those overlapping with PZ2 and the neighboring 23 rRNA gene showed a relatively high abundance (Supplementary Figure S1 right, Supplementary Table S2).

Gene Regulation Network Involving Many Sequence-Specific MTases as Hub Regulators
Gene regulation network of H. pylori was proposed to be built shallow with only few transcription factors (Danielli et al., 2010). However, our transcriptome analysis of the MTase knockouts ( Figure 2C, Supplementary Data Set S1) revealed a large network involving many MTases as hub regulators ( Figure 1C). The MTases affect expression of virulence factors such as urease-related genes, cagPAI genes and vacA among others (Figure 1E). The MTases varied widely in their effects on the transcriptome (Figure 2C). MTases for CATG (M.HpyPI), ATTAAT (M.HpyPVII), GACC (M.HpyPIX), TCGA (M.HpyAX) turned out to influence a large number of genes. Except for Type III MTase, M.HpyPIX, these are solitary Type II MTases. In contrast, the Type I system HpyAXIII (CTAN 8 TGT) and the Type III system HpyAXI (GCAG) have only minor influence. We could not detect any significant influence of the Type IIP RM system HpyPVIII (TCNNGA).
We found that the knockout of one MTase gene often affects expression of other MTases [ Figures 1C(ii-iv)]. We extracted several elementary patterns in their interaction [ Figure 1C Expression of Type II systems, HpyPI, HpyPVII, or HpyAX, was not affected by knockouts of Type I or Type III methylation systems, but instead these Type II MTases affected expression of Type I, Type IIG, or Type III RM systems [ Figures 1C(ii,iii)]. Because Type I, Type IIG and Type III RM systems change their TRDs over time ( Figure 1B, "Introduction"), this apparent hierarchy between two groups of RM systems could be of some biological significance. Some of the target MTase genes carry motif sequence copies of the controlling MTase in their coding or upstream regions (Figures 3ABD-F, Supplementary Figures S6A-D), which suggests gene expression control through local methylation. We also found that some MTases control expression of known transcription factors [ Figures 1C(I,iii)]. Inactivation of HPP12_1365 reduced expression of its downstream genes encoding a two-component signal transduction system, which likely explains the transcriptome change. GACC motif is abundant in the windows covering PZ1, PZ3, and a 23S rRNA gene along the genome (Supplementary Table S2, Supplementary Figure S1) as we mentioned above. Upon inactivation of hpyPXM, reduced expression was observed downstream of a 23S rRNA gene including the adjacent PZ2 region (Figure 4G). Parts of genomic islands (PZ2, PZ3) with high GACC frequency reduced expression in hpyPXM knockout (Supplementary Figure S1). Activation of mobile DNA transcription by methylation is the opposite to what is commonly observed in eukaryotes (Slotkin and Martienssen, 2007).
To address whether the MTases contribute to adaptive phenotype, we first determined the effect of the knockouts on the growth (Figure 2D). We detected a difference for HpyPI (CATG), HpyPVII (ATTAAT), HpyPIX (GACC), HpyAX (TCGA), for which we also detected a large transcriptome change ( Figure 2C). Among these, HpyPIX (GACC) knockout increased growth, whereas the other knockouts decreased (Figure 2D and Supplementary Figure S2).

Gm6ACC MTase Has Positive Effect on Motility and DNA Damage Tolerance
HpyPIX (GACC, Type III) was previously shown to switch ON/OFF by simple-repeat length changes and affect expression of 6 genes (Srikhanta et al., 2011). From our transcriptome data, we evaluated its gene regulation activities at the pathway level using KEGG onthology. Top 33 categories changed are shown in Figure 4A. Among those, more than half categories decreased activity. These include pathways associated with DNA replication and repair (ko03410, base excision repair (Supplementary Figure S3A); ko03420, nucleotide excision repair; ko03440, homologous recombination) and cell motility [ko02030, bacterial chemotaxis; ko02040, flagellar assembly (Supplementary Figure S3D)].
The knockout showed a larger clear zone around mitomycin C, a DNA damaging agent, on blood agar ( Figure 4F) (t = −4.8, df = 2.1, P = 0.035 in two-tailed Welch's t-test). In the swimming motility assay on soft agar, the migration zone was significantly smaller in the knockout (Figure 4E) (t = 13.5, df = 4.9, P = 4.3 × 10 −5 , in two-tailed Welch's t-test). These indicate that M.HpyPIX has a positive role in DNA damage resistance and in motility.
Eighty-one genes suggested to be involved in motility of H. pylori (Rust et al., 2008) are classified into four according to the sigma factor involved in their transcription ( Figure 4B). Most genes constituting this network reduced expression in the knockout except for 6 chemotaxis-related proteins and others ( Figure 4B). 11/12 class 2 genes and 4/4 class 3 were repressed. Differentially expressed genes were significantly overrepresented in these classes (P = 5.6 × 10 −3 for class 3, P = 5.4 × 10 −6 for class 2, and P = 1.5 × 10 −4 for intermediate in one-tailed Fisher's exact test). In the intermediate class, 15/23 genes were repressed with differentially expressed genes overrepresented (odds ratio = 5.2, P = 1.5 × 10 −4 in one-tailed Fisher's exact test). These responses are consistent with the strongly reduced expression of rpoN (σ54) for class 2 (1/3.5) and fliA (σ28) for class 3 (1/2.6). GACC motifs are present in the operon-like gene clusters containing fliA and rpoN (Figures 4C,D), so that their methylation may directly influence the expression. These results suggest that repression of fliA (for σ28) and rpoN (for σ54) expression is a major cause of the shutdown of the motility network by the M.HpyPIX knockout.

Cm6ATG at Translation Start Codon Affects Gene Expression
To address whether methylation locally affects transcription as known for several solitary DNA MTases (Sánchez-Romero et al., 2015) and a Type I MTase , we analyzed relation between occurrence of the methylation site and the transcript changes in detail for Cm6ATG ( hpyPIM), which includes a potential translation start codon ATG (Figure 5). When the group of genes with a CATG motif including start codon ATG and the group without one were compared, higher level of transcript changes by the knockout of Cm6ATG methyltransferase gene was detected in the former (P = 0.023, in one-tailed Wilcoxon's rank sum test) (Figure 5A, FIGURE 5 | Effects of M.HpyPI (CATG) knockout on transcriptome. (A) Genes with high CATG motif abundance in the regulatory region or genes starting with CATG motif tend to be affected. Transcript abundance change was compared between two groups. The upstream 149 bp from the coding region was defined as the regulatory region. Statistical significance of the difference was evaluated by one-tailed Wilcoxon's rank sum test; n.s.: P > 0.05. CDS, coding sequence. (B) Sequence conservation in the upstream region of CATG-starting genes. Conservation level was presented using Seqlogo (Schneider and Stephens, 1990). (C) Large transcript changes in hopM. (D) Large transcript changes in cytochrome c peroxidase gene. (E) (i) Large transcript changes in mccB and mccC genes for microcin on a plasmid. (ii) nucleotide sequence around mccB upstream region. CATG was highlighted in yellow. SD sequences are underlined. Putative terminator sequence (Zukher et al., 2014) were indicated by arrows. (F) Transcript abundance changes, upon M.HpyPVII knockout, in the bioB gene ending with three copies of ATTAAT. A blue box indicates a differentially expressed gene. A vertical line indicates a copy of a methylation motif. right). Therefore, such start codon methylation somehow promotes transcription.
The transcript fold change was also significantly higher in genes with higher motif frequency in the upstream regulatory region (P = 0.043, one-tailed Wilcoxon's rank sum test) (Figure 5A, left). We do not know whether this effect is due to the presence of an initiator peptide starting with CATG. There was no such effect detected for the CATG frequency within the coding sequence (Figure 5A, middle).
Upstream of the gene-starting CATG there are sequences characteristic of SD (Figure 5B, top panel and the middle panel as opposed to the bottom panel). When CDSs starting with CATG were separated into two groups: one carrying typical SD sequence "AAGG" within 15 bp from ATG (n = 207) and the other without "AAGG" (N = 176), the level of gene expression change (absolute value of m.value) was greater in the former group (p-value = 0.01429, one-tailed Wilcoxon rank sum test. Further upstream, the differentially expressed genes with CATG start show more of the features of transcriptional promoters for housekeeping σ 80 , consisting of an extended Pribnow box and a periodic AT-rich signal (Sharma et al., 2010), than those not differentially expressed genes with CATG start (Figure 5B). These observations suggest that some association between start codon methylation and nearby upstream transcription initiation underlies the transcript changes in the knockout.
The largest transcript abundance difference by the knockout among the genes with CATG start were found for hopM, encoding an outer membrane protein, and for cytochrome c peroxidase, an electron transfer protein (Figure 5A right, CD). They both carry a cluster of CATG sites around the start codon (some in dnaE, for DNA polymerase III, neighboring the cytochrome c peroxidase gene).
Among the genes not starting with CATG, the largest transcript differences by the knockout was observed for microcin (a bacteriocin) genes, mccB and mccC [ Figures 5A,E(i,ii)]. The mccABC operon carries a copy of CATG in the unstraslated region, the second CATG at the start codon for mccA, encoding a small peptide and not counted as a gene in our analysis, and the third CATG in the 5 side within mccB gene. Palindromic DNA sequences potentially forming a stem-loop structure on RNA leading transcription termination is found in this region [ Figure 5E(ii)]. There was no CATG in the corresponding region in the E. coli mccABC operon. Ribosome-controlled transcription termination is involved in translation of these genes (Zukher et al., 2014). These again suggest some association between methylation, transcription, and translation.
For hpyPAX (TCGA), hpyPVIIM (ATTAAT) and hpyPXM (GACC), no such difference between the groups was detected (Supplementary Figure S5). The highest transcript fold changes were observed in 'inactivated' MTase genes: GANTC-methylating MTase for hpyPVIIM (Supplementary Figure S5B) and a Type I M gene (HP0463) for hpyPAX (Supplementary Figure S5). This again indicates MTase-mediated control of expression of another MTase. Autorepression activity may be present in these target MTases in their intact forms as in some MTases (Karyagina et al., 1997;Mruk et al., 2011) and the gene inactivation may have led to loss of such auto-repression and to the large effects.
Because ATTAAT, the methylation motif of HpyPVIIM, contains TAA, a translation stop codon, we sought for genes with this stop codon among the differentially expressed genes between hpyPVIIM (ATTAAT) and its parent. We noticed one such case involving bioB (Figure 5F, Supplementary Data Set S1). This gene carries three copies of ATTAAT toward its 3 end and the last copy was annotated as the stop codon.

DISCUSSION
These results revealed a complex gene expression regulation network involving many DNA methyltransferases, as well as known transcription regulators, as hub regulators ( Figure 1C). The MTases regulate each other in several patterns so that methylation activity of one MTase influences various aspects of cell phenotypes likely increasing phenotypic variation, consistent with the epigenome-base adaptive evolution model. Many of these MTase hubs are present in only limited lineages ( Figure 1B) and/or change their sequence specificity (see section "Introduction"), frequently remodeling the network. The system may evolve irreversibly through these changes. Such irreversibility distinguishes the system from the known reversible phenomena involving simple-repeat length changes or site-specific recombination (Sánchez-Romero et al., 2015). The network may be worth the name of Firework. Genome comparison revealed various ways in which the DNA methyltransferases change their sequence specificity in H. pylori: (1) amino-acid substitution , (2) homologous recombination moving target recognition domains Kobayashi, 2012a, 2013;Furuta et al., 2014), and (3) changes in the number of central repeats in the S subunit of Type I system (Price et al., 1989;Andres et al., 2010). Although (2) and (3) appear more rapid than (1), there has been no attempt to measure these rates in the microevolution process or in the laboratory setting.
The Firework metamorphoses likely in search for adaptation. The phenotype affected by MTases includes motility, which is central to H. pylori's establishment and persistence of infection. Two MTases have opposing effects on motility, an accelerator and a brake [ Figure 1D(i)], likely for fine tuning. ROS from the host immune system damages DNA and proteins of H. pylori but also acts as a signal controlling redox status of enzymes and regulatory proteins (Ortiz de Orué Lucana et al., 2012). Two MTases are involved in ROS metabolism in opposite ways [ Figure 1D(ii)]. Their overlap in some physiological functions may allow loss of one of the MTase genes during long-term H. pylori evolution. Even in a small H. pylori population in Japan, we observed sporadic loss of hpyAXM-equivalent gene . Genes/apparent operons detected as differentially express genes in more than three MTase knockouts include groEL chaperone operon (4 knockouts, 3 activation, 1 repression), dnaK for a chaperon, vacA, HPP12_1177 (F0F1 ATP synthase subunit), HPP12_0834 (RNA-binding protein), mdaB (antioxidant protein), flaA (flagellin), ureB (urease), HPP12_0571 (membrane protein), babB (outer membrane protein), and mcc (microcin) operon on the plasmid pHPP12 (Supplementary Data Set S1). Some of these genes are involved in host interaction and virulence (vacA, flaA, ureB, babB, ureB) [Rust et al., 2008;Figures 1E(i,ii)] and in bacterial interaction (mcc) (Zukher et al., 2014). Effect of MTase on chaperon expression was reported earlier (Srikhanta et al., 2005;Vitoriano et al., 2013). The baseexcision repair category (KEGG id 03410) is repressed in three MTase knockouts (Figures 4A,H; Supplementary Figure S3A). We do not know whether this is related to the presence of a baseexcision type restriction enzyme in Helicobacter (Miyazono et al., 2014;Kojima and Kobayashi, 2015). These observations imply that the chaperones, the DNA repair systems, the motility system, the redox system, and the host interaction systems can sense the activity of MTases.
Reversible changes of gene expression, phase variation, may result from ON/OFF of DNA methyltransferase genes by simplerepeat length changes (Srikhanta et al., 2009) and from generation of a specificity subunit by site-specific recombination events in a restriction-modification system (Manso et al., 2014;Li et al., 2016). The unique features of the present system are that (1) DNA methyltransferases form a hierarchical network, (2) the specificity changes can take place in many of the hub DNA methyltransferases by movement of the domains (in addition to ON/OFF of the DNA methyltransferase genes) (3) and the overall change in the gene regulation network appears irreversible.
MTases regulate its expression and activity in various ways (Mruk and Kobayashi, 2014). One is auto-repression, which may explain increased expression of MTase genes (above). One MTase of H. pylori, M.HpyAXI (GCAG), changes its biochemical activity depending on pH (Banerjee and Rao, 2011). M.EcoRI transcript overlaps with an antisense RNA forming a bi-stable switch (Mruk et al., 2011). If an MTase can sense environmental signal, the Firework can be a simple system to generate phenotypic but heritable variations in a short time in vivo. To fully explore the potential for such "Lamarckian" evolution, further investigation is necessary on changes of genes and expression of MTases in response to environmental signals.
To our surprise, association of DNA methylation and translation initiation/termination in determining transcript abundance was suggested from multiple lines of observation with CATG, including a potential start codon ATG, and ATTAAT, including a stop codon TAA.
Our following observations suggest relation of DNA methylation at CATG at start codon and transcript abundance.
1. The genes with CATG start showed a higher level of transcript changes by the knockout of Cm6ATG methyltransferase gene (Figure 5A, right). 2. When CDSs starting with CATG were separated into two groups: one with SD and the other without one, the level of gene expression change was greater in the former. 3. The differentially expressed genes with CATG start show more of the features of transcriptional promoters ( Figure 5B).
4. The CATG-starting genes with the largest transcript differences by the knockout (hopM and cytochrome c peroxidase) also carried CATGs around the start codon ATG (Figures 5C,D). 5. A large transcript differences by the knockout was observed for mccBC operon (Figures 5A,E), which carries a CATG in the upstream untranslated region and another CATG at the start codon for mccA. Ribosome-controlled transcription termination is involved in translation of these genes (Zukher et al., 2014).
These observations suggest that methylation at or around start codon affects transcription. We also found a case (Figures 5F, Supplementary Data Set S1), where methylation at three copies of ATTAAT at 3 end (with the last copy annotated as the stop codon) decreases transcript. This suggests methylation at a potential translation stop codon may affect transcription.
These findings may be explained by the coupling of translation and transcription in bacteria (Figure 6; McGary and Nudler, 2013). When coupled to RNA polymerase (RNAP), the translating ribosome ensures transcriptional processivity by preventing RNAP backtracking. The trailing ribosome "pushes" RNAP, thereby modulating the rate of transcription by creating a synchrony of mRNA production and protein synthesis. The first 50 nucleotides of transcribed mRNA are particularly prone to stalling and backtracking. The rate of translation depends on the mRNA secondary structures. The ribosome behaves as an mRNA helicase, disrupting mRNA duplexes. In a Cryo-EM structure of RNA polymerase -ribosome 30S subunit complex  (Demo et al., 2017), the RNA exit tunnel of RNA polymerase aligns with the SD-binding site of 30S subunit. Ribosomal protein S1 forms a wall of the tunnel between RNAP and the 30S subunit, consistent with its role in directing mRNAs onto the ribosome. NusG, present in E. coli and H. pylori, can bind both RNA polymerase and ribosome to mediate the coupling (Krupp et al., 2019;Washburn et al., 2019). Some of these elements are close to the template DNA (Krupp et al., 2019) and may well be affected by DNA base methylation and proteins bound to the methylated DNA. Indeed, a palindromic DNA sequence generating a stemloop structure on mRNA that might lead to transcript release by Rho are found between mccA and mccB. Type I methylation on a long palindrome decreases transcript in an operon-like structure . Further informatic and experimental works are necessary to test this hypothesis.
While we were preparing this manuscript, one paper appeared (Srikhanta et al., 2017), which knocked out M.HpyPIX in P12 and confirmed its methylation motif as GACC and reported its effects on transcription and motility. One recent paper (Kumar et al., 2018) demonstrated effects of Tm4CTTC MTase on transcription, natural transformation and host interaction. Another paper (Estibariz et al., 2019) demonstrated that Gm5CGC methylation in the promoter affects transcription.

DATA AVAILABILITY STATEMENT
The datasets generated for this study can be found in the DDBJ Sequence Read Archive (https://www.ddbj.nig.ac.jp/dra/indexe.html) under accession numbers DRA005953, DRA004356, DRA003551, and DRA003688.

AUTHOR CONTRIBUTIONS
HY and IK conceptualized the study. HY, MF, YF, TS, TN, and SSh contributed to the formal analysis. HY, ER, MA, and TS investigated the study. YS, SSu, MH, AT, KS, and IK provided the resources. HY, MA, TS, and IK wrote the original draft. HY, MF, and IK reviewed, and edited the manuscript. IK supervised the study. All authors contributed to the article and approved the submitted version.

FUNDING SUPPLEMENTARY MATERIAL
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmicb. 2020.01628/full#supplementary-material FIGURE S1 | Gene expression changes in high GACC frequency regions in the hypoPXM knockout.         DATASET S1 | Differential gene expression analysis.