Bacterial DNA methyltransferase: A key to the epigenetic world with lessons learned from proteobacteria

Epigenetics modulates expression levels of various important genes in both prokaryotes and eukaryotes. These epigenetic traits are heritable without any change in genetic DNA sequences. DNA methylation is a universal mechanism of epigenetic regulation in all kingdoms of life. In bacteria, DNA methylation is the main form of epigenetic regulation and plays important roles in clinically relevant phenotypes, such as virulence, host colonization, sporulation, and biofilm formation. In this review, we survey bacterial epigenomic studies and focus on recent developments in the structure, function, and mechanism of several highly conserved bacterial DNA methyltransferases. These methyltransferases are relatively common in bacteria and participate in the regulation of gene expression and in the control of chromosomal DNA replication and repair. Recent advances in sequencing techniques capable of detecting methylation signals have enabled the characterization of genome-wide epigenetic regulation. With their involvement in critical cellular processes, these highly conserved DNA methyltransferases may emerge as promising targets for developing novel epigenetic inhibitors for biomedical applications.


Introduction
DNA methylation is the addition of methyl groups to DNA nucleotides by enzymes known as DNA methyltransferases (Adhikari and Curtis, 2016). This process is involved in regulating a wide range of cellular processes, including epigenetic regulation in bacteria. Epigenetics refers to changes in gene expression that are heritable without a change in the DNA sequence itself. Unlike eukaryotes, which employ complex epigenetic regulation mechanisms, bacteria achieve epigenetic control primarily through DNA methylation. Methylation superimposes secondary information on the primary DNA sequence, adding further directions to DNA transactions such as transcription, transposition, initiation of chromosome replication, and prevention of mutations by DNA repair (Marinus and Casadesús, 2009;Kumar et al., 2018;Estibariz et al., 2019;Tourancheau et al., 2021). DNA methylation is found throughout the prokaryotic kingdom, with S-adenosyl-L-methionine (SAM) as a common methyl group donor. However, DNA methyltransferases have been shown to be very diverse.
Based on the position to which the methyl group is transferred, DNA methyltransferases can be divided into two classes: exocyclic amino methyltransferases and endocyclic methyltransferases. Exocyclic amino methyltransferases transfer a methyl group to the N4 position of cytosine (N4-C) or the N6 position of adenine (N6-A), e.g., Dam and CcrM. Endocyclic methyltransferases methylate cytosine at the C5 position (C5-C), e.g., Dcm (Wion and Casadesús, 2006;Marinus and Casadesús, 2009;Kumar et al., 2018;Chen S. et al., 2022). Among these variants, C5-C is predominantly found in eukaryotes, whereas N4-C and N6-A are mainly found in bacteria (Figure 1).
There are two main categories of DNA methyltransferases: methyltransferases in Restriction-Modification systems ("R-M systems") and "solitary" or "orphan" methyltransferases. The R-M systems constitute DNA methyltransferases (MTases) and associated restriction enzymes (REases) (Bickle and Krüger, 1993). In a majority of bacteria, the R-M systems function like an immune response, protecting their own DNA while degrading foreign DNA (Lees and Gladstone, 2015). Host DNA is methylated by a DNA methyltransferase which protects against digestion from the cognate restriction endonuclease, whereas foreign DNA, such as invading phage DNA, is unmethylated and degraded by the endonuclease (Bickle and Krüger, 1993). However, protecting the integrity of its own genomic DNA is not the sole purpose of R-M systems. Studies have found that MTases from R-M systems are also involved in regulating gene expression (Lees and Gladstone, 2015;Seib et al., 2020).
Genes encoding DNA methyltransferases are widely present in the genomes of bacteria, indicating that DNA methyltransferases play an important role in bacteria. Distinct bacterial lineages and phenotypic heterogeneity are common in bacteria. The formation of bacterial subpopulations from the same lineage is often controlled by epigenetic mechanisms that generate inheritable phenotypic diversity without altering the DNA sequence (Casadesús and Low, 2013;Sánchez-Romero et al., 2015;Tan et al., 2016;de Ste Croix et al., 2017). However, how DNA methyltransferases are involved in many cellular processes remains unknown. Recent developments in single-molecule real-time (SMRT) sequencing and nanopore DNA sequencing technologies have made detecting methylated bases achievable. They undoubtedly provide us with an accessible tool for studying DNA methylation in bacteria and may open a new era in bacterial epigenomics by deciphering a wealth of information on bacterial genome methylation patterns and functional consequences (Clarke et al., 2009;Flusberg et al., 2010;Attar, 2016;Tourancheau et al., 2021;Chen J. et al., 2022). This review summarizes some of the properties of bacterial DNA methyltransferases and highlights recent developments in understanding the role of DNA methyltransferase in physiological processes, especially gene expression regulation in bacteria.

Two main DNA methylation systems
Restriction-modification systems
Genes encoding R-M systems are present in most bacterial and archaeal genomes. Their prevalence indicates how important R-M systems are to prokaryotes. The best-known functional role of R-M systems is to protect the host cell from invading foreign DNA (Murphy et al., 2013). Canonical R-M systems contain enzymes carrying out two activities: a restriction endonuclease, which binds to specific recognition sites and hydrolyzes DNA when the sequence is unmethylated, and a corresponding DNA methyltransferase, which methylates DNA at the same sites recognized by its cognate endonuclease. Some DNA methyltransferases, such as M.EcoGII, can methylate not only residues in DNA but also those in DNA:RNA-hybrid oligonucleotide duplexes (Murray et al., 2018).
Because of their ability to distinguish "self" from "non-self," the R-M systems are considered to provide a primitive immune system (Vasu and Nagaraja, 2013). Consequently, R-M systems are thought to be essential for host colonization by pathogenic bacteria (Ershova et al., 2015).
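The self versus non-self logic described above can be illustrated with a minimal sketch. It is a toy model under stated assumptions, not a description of any particular enzyme: the sequence is treated as a single strand, methyl marks are given as a set of 0-based positions, and a site counts as protected if any base inside it is methylated. The function names and example sequence are hypothetical; GAATTC is used only because it appears in this review as a recognition sequence.

```python
import re

def unmethylated_sites(seq, site, methylated_positions):
    """Return start indices of `site` occurrences that carry no methyl mark.

    Toy assumption: one strand only, and any methylated base inside the
    site protects the whole site.
    """
    pattern = re.compile("(?=" + re.escape(site) + ")")  # lookahead finds overlapping hits
    starts = [m.start() for m in pattern.finditer(seq)]
    return [
        s for s in starts
        if not any(s <= p < s + len(site) for p in methylated_positions)
    ]

def is_restricted(seq, site, methylated_positions):
    """DNA is cleaved (restricted) if at least one recognition site is unmethylated."""
    return bool(unmethylated_sites(seq, site, methylated_positions))

# Hypothetical example: the same sequence as "host" (methylated) and "phage" (unmethylated)
dna = "TTGAATTCAA"
host_marks = {4}      # methyl mark on the second adenine of GAATTC
phage_marks = set()   # invading DNA carries no marks

print(is_restricted(dna, "GAATTC", host_marks))   # host DNA is protected
print(is_restricted(dna, "GAATTC", phage_marks))  # foreign DNA is degraded
```

In this simplified picture, the only difference between host and foreign DNA is the methylation state of the recognition site, which is the point the text makes about R-M systems acting as a primitive immune system.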
R-M systems are classified into four main types based on factors such as the subunit composition of their REase and MTase complexes; ATP and cofactor requirements; recognition site structure; position of DNA methylation (cytosine can be methylated at either the endocyclic C5 or the exocyclic N4 position, and adenine at N6); and DNA-cleavage site (Murphy et al., 2013;Ershova et al., 2015). Typical Type I R-M systems consist of three kinds of subunit. The S-subunit, encoded by hsdS, is the specificity subunit that binds recognition sites. The other two subunits are encoded by hsdM, the methyltransferase (M), and by hsdR, the restriction endonuclease (R). The MTase in Type I systems can modify both strands of its substrate DNA. There are two target recognition domains (TRDs) in the S-subunit, and each interacts with one half of the bipartite recognition site (Murray, 2000;Loenen et al., 2014). Functional restriction activity requires a pentamer composed of R2M2S.
Type II R-M systems are the most prevalent and simplest. They generally function as two individual proteins. The REase cleaves the target DNA at defined positions within or close to its recognition site, and the MTase protects host DNA by methylation. They are also the best-investigated group of R-M systems because Type II REases are key enzymes in genetic engineering. Type III R-M systems comprise two subunits: Restriction (Res) and Modification (Mod) enzymes, encoded by the res and mod genes, respectively. Mod binds to and methylates substrate DNA, while Res functions as a DNA restriction endonuclease. Interestingly, Mod can function independently of Res, whereas Res has no activity without Mod (Dryden et al., 2001). Finally, Type IV R-M systems, distinctive from the other three types, hydrolyze modified DNA. Type IV systems have the methyltransferase and endonuclease activity combined in a single enzyme, which exclusively cleaves modified DNA (Vasu and Nagaraja, 2013).
In addition to their traditionally understood role, the R-M systems have been found to be involved in epigenetic regulation in some bacteria (Table 1). Several studies have reported that methyltransferases in Type III R-M systems have sequence features that are consistent with phase-variable expression (Srikhanta et al., 2005;Fox et al., 2007;Li and Zhang, 2019;Anton and Roberts, 2021). Phase variation is a strategy to generate phenotypic diversity in a bacterial population in the absence of selection. It involves reversible, high-frequency ON/OFF switching of gene expression. Phase variation is often mediated by highly mutagenic simple tandem DNA repeats, also known as simple sequence repeats (SSRs). The SSRs are often located either within the ORF of genes encoding variant proteins or in their promoter region. Recent research has identified phase-variably expressed DNA methyltransferases that act as epigenetic regulators in several pathogenic bacteria (Tan et al., 2016;Lyko, 2018;Han et al., 2022). Many virulence factor genes in bacteria display phase-variable expression, such as those encoding pili (Srikhanta et al., 2017), iron-binding proteins (Tauseef et al., 2011), lipopolysaccharide biosynthesis enzymes (Lüneberg et al., 1998), and outer-membrane proteins (Owen et al., 1996;Green et al., 2019). Phase variation results in genetically and phenotypically diverse populations, which is important in pathogenesis as it provides rapid adaptation to changes brought by the host environment and immune responses (Atack et al., 2018). The DNA methyltransferases that are involved in phase variation are themselves often subject to phase-variable expression (Atack et al., 2015;Tan et al., 2016). A methyltransferase that regulates phase-variable genes can therefore generate even more variation when its own expression is also phase variable.
The number of possible phenotypic states multiplies when the methyltransferase itself is phase variable. The existence of phase-variable methyltransferases raises the possibility that phase-variable R-M systems play a role in pathogenesis. For example, studies of the flagellin A (flaA) gene in Helicobacter pylori show that ModH5 (a Type III MTase) modulates flaA promoter activity in a methylation-dependent manner. The phase-variable switching of ModH5 expression thus plays a role in regulating H. pylori phenotypes (Srikhanta et al., 2017).
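The ON/OFF switching produced by simple sequence repeats within an ORF can be sketched with reading-frame arithmetic. This is a minimal illustration under assumptions that are not taken from any specific gene: the repeat unit length, the repeat counts, and the rule that the gene is ON when the repeat tract length is a multiple of three are all hypothetical choices, and which remainder corresponds to ON depends on the actual gene.

```python
def mod_expression_state(repeat_count, unit_length, on_remainder=0):
    """Return "ON" if the repeat tract keeps the downstream ORF in frame.

    Assumption: the gene is in frame when the tract length modulo 3 equals
    `on_remainder`; any other tract length causes a frameshift ("OFF").
    """
    tract_length = repeat_count * unit_length
    return "ON" if tract_length % 3 == on_remainder else "OFF"

def slip(repeat_count, delta):
    """Gain or lose repeat units, as a stand-in for a slippage event."""
    return max(0, repeat_count + delta)

# Hypothetical tetranucleotide repeat (unit length 4) starting with 6 copies
count = 6
for delta in (0, -1, +2, +1, +2):
    count = slip(count, delta)
    print(count, mod_expression_state(count, unit_length=4))
```

With a tetranucleotide unit, the state returns to ON only after a net change of three repeat units, whereas a trinucleotide unit would never shift the frame. The sketch captures only the frame logic, not switching frequencies or selection.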

"Solitary" or "orphan" methyltransferase
In addition to the R-M systems, some methyltransferases exist with no association with any restriction enzymes and are designated as "solitary" or "orphan" methyltransferases (Casadesús and Low, 2006). MTases from R-M systems are distinct from orphan MTases. An R-M MTase is paired with a cognate REase that cleaves unmethylated foreign DNA, whereas an orphan MTase has no such partner. In general, MTases in R-M systems are poorly conserved, whereas orphan MTases are as conserved within a genus as any average gene (Seshasayee et al., 2012). Despite the differences, orphan MTases may have been derived from R-M systems through loss of function in the REase. An R-M system in which the restriction endonuclease has lost its activity but the cognate DNA methyltransferase remains active is functionally equivalent to an orphan methyltransferase (Casadesús and Low, 2006). Some studies have found that orphan methyltransferases may have originated through horizontal gene transfer, with only the MTase part of an R-M system being transferred (Oliveira et al., 2014). Orphan MTases are present in diverse bacterial and archaeal phyla and show motif specificities and methylation patterns that are consistent with functions in gene regulation and DNA replication (Blow et al., 2016). In addition, it has been proposed that orphan MTases might have evolved from R-M systems to fight parasitism caused by selfish or rogue R-M systems. However, there is currently a lack of evidence to support this hypothesis (Murphy et al., 2013).
There are three well-studied conserved orphan methyltransferases: Dam, CcrM, and Dcm, as shown in Table 1. Among them, the Dam of γ Proteobacteria and the CcrM of Caulobacter crescentus are the most widely studied (Chen et al., 2014). Dam is an orphan MTase first found in Escherichia coli and is involved in methyl-directed mismatch DNA repair and regulation of chromosomal replication (Marinus and Morris, 1973;Urig et al., 2002;Marinus and Casadesús, 2009). Its homologs are present in various enteric bacteria, including Yersinia spp., Vibrio cholerae, Salmonella spp., Haemophilus influenzae, and other genera (Barras and Marinus, 1989;Torreblanca and Casadesús, 1996;Zaleski and Piekarowicz, 2004;Giacomodonato et al., 2009;Banas et al., 2011). Dam methylation can influence the expression of virulence factors and, thereby, the pathogenicity of some bacteria (Pucciarelli et al., 2002;Watson et al., 2004;Ershova et al., 2015). Modified live attenuated S. enterica serovar Typhimurium strains that harbor loss-of-function mutations in dam are capable of eliciting protection against a diversity of Salmonella and are well tolerated when applied as modified live vaccines in poultry, mice, calves, and sheep (Heithoff et al., 2015). Although Dam is not essential for most bacteria, it is necessary for the survival of V. cholerae (Torreblanca and Casadesús, 1996;Robinson et al., 2005;Taylor et al., 2005;Casadesús and Low, 2006;Val et al., 2014).
Dam is a 32 kDa monomeric protein that catalyzes the transfer of the methyl group from AdoMet to the N6 position of the adenine residue in 5′GATC3′ sequences. Dam has similar efficiency for both hemi- and unmethylated templates and has been shown to be involved in chromosome replication and segregation, DNA mismatch repair, regulation of transposition events, phase variation, and bacterial conjugation processes (Sánchez-Romero et al., 2015).
The cell cycle-regulated DNA MTase family (CcrM) constitutes a second important group of orphan methyltransferases (Skerker and Laub, 2004;Kozdon et al., 2013;Ershova et al., 2015). It plays an essential role in the life cycle of C. crescentus and is highly conserved in all α Proteobacteria, except for Rickettsiales and Magnetococcales (Brilli et al., 2010;Mouammine and Collier, 2018). In Caulobacter, CcrM is an essential cell component and plays a crucial role in cell cycle regulation (Casadesús and Low, 2006). However, it is not present for the entire life cycle and is only expressed right before cell division. Evidence suggests that CcrM participates in the cell-cycle regulation of C. crescentus by regulating the expression of cell-division genes (Zweiger et al., 1994;Stephens et al., 1996;Fioravanti et al., 2013). Interestingly, the culture conditions seem to determine the dependence on CcrM methyltransferase in Caulobacter (Gonzalez and Collier, 2013;Adhikari and Curtis, 2016). In some α Proteobacteria, such as Brucella abortus, C. crescentus, and Agrobacterium tumefaciens, CcrM methyltransferase is indispensable (Stephens et al., 1996;Robertson et al., 2000). However, it is not vital for cell growth in other α Proteobacteria, such as Brevundimonas subvibrioides (Robertson et al., 2000;Curtis and Brun, 2014). CcrM is a functional monomer but may form a dimer at physiological concentrations (Shier et al., 2001;Skerker and Laub, 2004;Kozdon et al., 2013;Casadesús, 2016;Horton et al., 2019). Unlike Dam, CcrM has a distinct preference for hemimethylated DNA substrates (Bheemanaik et al., 2006;Albu et al., 2012). It binds to DNA and methylates only one 5′GANTC3′ site (N represents any base) before the enzyme detaches from the DNA (Marinus and Casadesús, 2009).
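The target motifs named above, 5′GATC3′ for Dam and 5′GANTC3′ for CcrM, can be located in a sequence with a short scan. The following is a minimal sketch, not a tool used in any of the cited studies: the function names and the example sequence are hypothetical, only one strand is scanned, and the degenerate base N is expanded to any of A, C, G, or T.

```python
import re

# Recognition motifs as given in the text (N = any base)
MOTIFS = {"Dam": "GATC", "CcrM": "GANTC"}

def motif_regex(motif):
    """Compile a motif into a regex; the lookahead also reports overlapping sites."""
    return re.compile("(?=(" + motif.replace("N", "[ACGT]") + "))")

def find_sites(seq, motif):
    """Return 0-based start positions of every motif occurrence on the given strand."""
    return [m.start() for m in motif_regex(motif).finditer(seq.upper())]

# Made-up sequence for illustration
seq = "AAGATCTTGAATCGGGACTC"
for enzyme, motif in MOTIFS.items():
    print(enzyme, motif, find_sites(seq, motif))
```

In both motifs the target adenine is the second base, so the methylated position is the reported start plus one. A genome-wide analysis would also need to handle the complementary strand and ambiguous bases in the input.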
Dcm is another orphan MTase that is mainly found in enterobacteria such as E. coli. Dcm is a 51 kDa protein that methylates cytosine at a position that is rarely modified in bacteria but commonly modified in eukaryotes. As a C5-cytosine methyltransferase, Dcm methylates the second cytosine in 5′CCAGG3′ and 5′CCTGG3′ sites (Militello et al., 2014). Methylated 5′CAG3′ sequences are mutational hotspots. Deamination converts 5-methylcytosine to thymine, which is not removed by uracil-N-glycosylase, so GC-to-AT mutations are common at 5′CCAGG3′ sites. Dcm participates in several cellular processes, but it has been shown that Dcm is not necessary for the survival of E. coli (Baba et al., 2008).

Structure, function, and mechanism of DNA methyltransferases
Structure
Currently, three-dimensional crystal structures have been determined for 27 bacterial DNA MTases. Their target sequences are summarized in Table 2 (Roberts et al., 2015). Most of these DNA MTases belong to Type I or II R-M systems, and structures for orphan methyltransferases and Type III R-M systems are relatively rare. In general, MTases are bilobed structures folded into two domains: one is the catalytically active region responsible for the transfer of methyl groups, and the other is a smaller region responsible for recognizing methylation sites on DNA (Bergerat et al., 1991;Bheemanaik et al., 2006), Figure 2. There is a common structural core in the larger catalytic region, consisting of a six-stranded parallel β-sheet with a seventh strand inserted in an antiparallel fashion between the fifth and sixth strands (Bheemanaik et al., 2006). This larger catalytic region can be divided into two subdomains. One of the subdomains creates an S-adenosylmethionine (AdoMet) binding site, and the other subdomain is the binding site for the extrahelical target base. The small domains are very diverse in amino acid sequence, size, and structure because of selection for their DNA binding specificity (Bheemanaik et al., 2006). In all cases, the substrate to be methylated is bound or expected to bind in a pocket adjacent to the AdoMet binding site.
Figure 2. Representative three-dimensional structure of a DNA methyltransferase. AdoMet molecules are shown in yellow. These representative MTases all have a two-domain structure. (A) Crystal structure of RsrI with AdoMet (PDB code 1NW7). RsrI is a β-class N6-adenine MTase that recognizes the palindromic duplex DNA sequence GAATTC and methylates the second adenine on each strand (Thomas et al., 2003). (B) Crystal structure of Dam with AdoMet (PDB code 2ORE). It contains two domains: a seven-stranded catalytic domain that harbors the binding site for AdoHcy and a DNA binding domain consisting of a five-helix bundle and a β-hairpin loop that is conserved in the family of GATC-related MTase orthologs (Liebert et al., 2007). (C) Crystal structure of HhaI with AdoMet (PDB code 2HMY). It is an endocyclic MTase that methylates the cytosine at the C5 position (O'Gara et al., 1999). The structure of the binary complex of M.HhaI with AdoMet (PDB code 1HMY) was the first structure of any AdoMet-dependent MTase reported (Hong and Cheng, 2016). (D) Crystal structure of PvuII with AdoMet (PDB code 1BOO). The main feature of the common fold is a seven-stranded β-sheet (6↓7↑5↓4↓1↓2↓3↓) formed by five parallel β-strands and an antiparallel β-hairpin (shown as arrows). The AdoMet binding site is located at the carboxyl ends of strands β1 and β2, and the active site is formed by the carboxyl ends of strands β4 and β5 and the amino end of strand β7 (Woodcock et al., 2020).

Motif
The primary sequences of MTases share a set of conserved motifs (I-X) and a variable target-site-recognition domain located near the C-terminus (Lauster et al., 1989;Pósfai et al., 1989). Together, they are responsible for three basic functions: (i) AdoMet binding, (ii) sequence-specific DNA binding, and (iii) catalysis of methyl group transfer. Briefly, motif I accommodates the methionine moiety of AdoMet, which is conserved among all AdoMet-dependent enzymes as a main binding region with AdoMet. In fact, all DNA methyltransferases that use S-adenosylmethionine as methyl donor are remarkably conserved in motif I.
Motifs II and III are also involved in AdoMet binding, but are less conserved than motif I. Several conserved charged residues in motifs I-III have been shown to have substantial effects on AdoMet binding (Ahmad and Rao, 1996). In motif I, substituting certain glycine residues with aspartic acid or arginine residues abolishes AdoMet binding. However, charged residues are not the only determining factor in AdoMet binding. Hydrophobic side chains in motifs I-III have been shown to stabilize AdoMet binding as well (Roth et al., 1998).
Motif IV, which is critical for catalysis, is also known as the DPPY motif because it contains a consensus sequence of (S, N/D)PP(Y/W/F) (Bheemanaik et al., 2006;Horton et al., 2006, 2019). The prolyl dipeptide in motif IV is considered to play an important role in the transfer of the methyl group to the target base (Smith et al., 1990;Cheng and Roberts, 2001). This sequence forms an active pocket that can accommodate the target base. The DNA methyltransferase modifies the target base once it enters the active pocket through base flipping (Bheemanaik et al., 2006). Aromatic side chains in motif V have been found to interact with AdoMet (Schluckebier et al., 1995). Motifs VI, VII, and VIII form a DNA-binding cleft, whereas motifs I, IV, and X form a binding pocket for AdoMet's methionine moiety (Cheng and Roberts, 2001).
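The motif IV consensus can be turned into a simple pattern search over a protein sequence. The sketch below rests on one stated assumption: the consensus (S, N/D)PP(Y/W/F) is read as "S, N, or D, followed by PP, followed by Y, W, or F". The function name and the example sequences are made up for illustration, and a hit in a real protein would only be a candidate that still needs structural or alignment support.

```python
import re

# Assumed reading of the consensus (S, N/D)PP(Y/W/F): [S/N/D] P P [Y/W/F]
MOTIF_IV = re.compile(r"[SND]PP[YWF]")

def find_motif_iv(protein):
    """Return (0-based start, matched tetrapeptide) for each candidate motif IV hit."""
    return [(m.start(), m.group()) for m in MOTIF_IV.finditer(protein.upper())]

# Made-up sequences, not real methyltransferases
print(find_motif_iv("MKLDPPYGAA"))
print(find_motif_iv("AASPPFKKNPPW"))
print(find_motif_iv("AAAPPY"))
```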

Methyltransferase-DNA interactions
In general, the mechanism by which DNA methyltransferase recognizes methylation modification sites mainly depends on the surface between the enzyme and the major groove of the substrate double-stranded DNA. The enzyme interacts with deoxyribose and phosphate on the backbone through Van der Waals forces (Roberts and Cheng, 1998;Cheng and Roberts, 2001). Although there are a large number of conserved sequences and homologous structures between different DNA methyltransferases, the interactions between DNA recognition sites and methyltransferase show many variations. The DNA backbone is often severely distorted when a methyltransferase binds to the recognition site.
Methyltransferase Dam contains two domains: a seven-stranded catalytic domain (residues 1-56 and 145-270) harboring the binding sites for AdoMet/S-adenosyl-L-homocysteine (AdoHcy or SAH) and a DNA binding domain (residues 57-144) consisting of a five-helix bundle and a β-hairpin loop (residues 118-139) (Horton et al., 2006). Dam uses a mechanism called base flipping to methylate the target adenine. Side chains of the residues in the binding site directly interact with the phosphate groups of the DNA backbone. Dam has four conserved residues (R95, N126, N132, and R137) that interact with three consecutive phosphate groups flanking the fourth GATC base pair of the non-target strand. The methylation target, the adenine of the second base pair in GATC, is flipped out from the DNA helix. The specific interactions with the remaining bases of the site occur in the DNA major groove. The N-terminal K9 is responsible for recognizing the guanine of the first base pair. Contacts to the non-target strand in the second half of the GATC site are established by R124 with the fourth base pair and by L122 and P134 with the third base pair. The aromatic ring of Y119 intercalates into the DNA between the second and third base pairs, which is essential for base flipping to occur (Horton et al., 2006), Figure 3A. The dimeric CcrM methyltransferase uses a different mechanism to recognize the 5′GANTC3′ methylation site (Woodcock et al., 2017;Reich et al., 2018). Four loops of CcrM are involved in recognition of methylation sites. Almost all the phosphates on the non-target strand contact Loop-2B (residues 31-61), which recognizes the first three bases of the target sequence. P45 in Loop-2B is inserted between the second and third base pairs. Loop-45 (residues 119-133) inserts into the DNA double helix from the minor groove, where P125 and F126 recognize the fifth base in the target site and provide interactions with the two pyrimidines, T4 and C5. The shorter Loop-3C (residues 92-94) and Loop-6E (residues 172-194) supply additional interactions with thymine T4 (Tyr93 and His94), the target adenine A2 (Thr191 and Lys193), and the DNA backbone phosphate groups flanking guanine G1 (Arg179 and Lys187) (Horton et al., 2019), Figure 3B.
EcoP15I belongs to the Type III R-M family. EcoP15I, like CcrM, also causes the backbone of its target DNA fragment to twist after binding (Gupta et al., 2015). EcoP15I recognizes the 5′CAGCAG3′ sequence. It consists of two modification (Mod) subunits and one (or two) restriction (Res) subunits, resulting in a Mod2Res1 or Mod2Res2 complex. The Mod subunits are responsible for DNA recognition and methylation, whereas the Res subunits are responsible for ATP hydrolysis and cleavage. Cleavage only occurs if two recognition sites (5′CAGCAG3′) are in an inverted-repeat orientation, arranged either as "head-to-head" or "tail-to-tail" (Gupta et al., 2015), Figure 3C.
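The inverted-repeat requirement for cleavage can be checked on a sequence with a short scan. This is a minimal sketch under simplifying assumptions: a site on the opposite strand is detected as the reverse complement (CTGCTG) on the scanned strand, any pair of one forward and one reverse site counts as an inverted repeat, and distance limits, methylation state, and the distinction between head-to-head and tail-to-tail arrangements are ignored. The function names and example sequences are hypothetical.

```python
import re

SITE = "CAGCAG"      # EcoP15I recognition site on the scanned strand
SITE_RC = "CTGCTG"   # reverse complement: the same site on the opposite strand

def site_positions(seq):
    """Return (forward_starts, reverse_starts) as 0-based positions on the scanned strand."""
    seq = seq.upper()
    forward = [m.start() for m in re.finditer("(?=" + SITE + ")", seq)]
    reverse = [m.start() for m in re.finditer("(?=" + SITE_RC + ")", seq)]
    return forward, reverse

def has_inverted_pair(seq):
    """True if the sequence holds at least one site in each orientation."""
    forward, reverse = site_positions(seq)
    return bool(forward) and bool(reverse)

# Made-up substrates
print(has_inverted_pair("AACAGCAGTTTTCTGCTGAA"))  # two sites, opposite orientations
print(has_inverted_pair("AACAGCAGTTTTCAGCAGAA"))  # two sites, same orientation
print(has_inverted_pair("AACAGCAGTT"))            # single site
```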
C5-cytosine methyltransferases catalyze cytosine methylation through intermediates in which the DNA is drastically remodeled. The target cytosine is usually buried in the DNA double helix, which hinders the catalytic reaction. For the reaction to proceed, the DNA is distorted, and the target cytosine residue is extruded from the DNA helix and inserted into the active site pocket of the enzyme (Sankpal and Rao, 2002). This base flipping was found in all members of the monomeric type II bacterial methyltransferases, like M.HhaI and M.HaeIII (Matthews et al., 2016). The targeting mechanism is particularly intriguing in the case of M.HaeIII, a bacterial C5-cytosine methyltransferase that not only extrudes the substrate cytosine but also induces frameshifted base pairing and the formation of a large gap in the duplex DNA recognition site (Didovyk and Verdine, 2012). The target recognition domain (TRD) of M.HaeIII contacts most nucleotides in the recognition site directly, thereby stabilizing their intrahelical conformation. Then, residue Ile-221 is inserted between the second and third base pairs of the recognition sequence (5′GGCC3′), weakening the base stacking force between them and resulting in the extrusion of the target cytosine. Finally, with concomitant abandonment of pairing for the G and C, underwinding (caused by negative supercoiling) of the DNA double helix near the target site causes the target base to be flipped out of the DNA double helix, Figure 3D.
Similarly, in Dam, the aromatic ring of tyrosine at 119 can insert between the second and third base pairs of 5′GATC3′, which is considered to be a necessary condition for base flipping (Horton et al., 2006). It is worth noting that the angle of the base flip is different for different MTases. In Dam and CcrM, the target base flips about 90°, while in M.HhaI and EcoP15I, the target base flips about 180° (Gupta et al., 2015). In addition, all four nucleotides in the target site of CcrM can be flipped to different degrees at the same time (Horton et al., 2019). The mechanism needs further investigation.

Epigenetic regulation of gene transcription
DNA methylation in the promoter or regulatory regions can be used to regulate transcription (Casadesús and Low, 2006;Wion and Casadesús, 2006). One of the mechanisms involves competition for DNA binding. DNA MTases will not be able to bind to nascent DNA strands if they are pre-occupied by DNA-binding proteins, which will maintain their unmethylated state (Van der Woude et al., 1996;Casadesús and Low, 2006). However, DNA-binding proteins cannot bind to the DNA if the binding site is methylated. Therefore, when the sites of methylation overlap or are adjacent to the binding sites of DNA-binding proteins, there is competition for DNA binding. DNA-binding proteins can thus change the activity of DNA MTases, which, in turn, affects DNA methylation (Kot et al., 2020).
It has been established that DNA methylation influences the expression of bacterial genes, which, in turn, regulates the cell life cycle and virulence (Seib et al., 2020). Regulation of gene expression influences the ability of bacteria to adapt to and survive in their environment (Casadesús and Low, 2013). Dam and CcrM MTases are two examples for which there is a relatively clear understanding of their role in epigenetic regulation (Adhikari and Curtis, 2016). DNA methylation can also regulate gene expression after transcription, but the mechanism is still unclear (Campellone et al., 2007;López-Garrido and Casadesús, 2010). For example, the Dam expression level can change the composition of the O-antigen in Yersinia enterocolitica without changing the mRNA level of the O-antigen gene cluster. It is not clear whether the mRNA is modified or whether other mechanisms are involved. This process is still called epigenetic regulation and is often found in phase variation (Casadesús and Low, 2013).

Phase variation via the restriction modification systems
Over the past two decades, R-M systems have been found not only to protect the integrity of the genome but also to participate in gene expression regulation (Seib et al., 2011;Blakeway et al., 2014;Kwiatek et al., 2015;Anjum et al., 2016;Tan et al., 2016;Srikhanta et al., 2017). Most R-M systems involved in the regulation of gene expression belong to type I and III, with a few belonging to type II, Figure 4A.
Phase variation, the reversible generation of variants of surface antigens, is a survival strategy that is frequently found in pathogenic bacteria (Wion and Casadesús, 2006). Genes that can undergo phase  Figure 5. It is worth noting that the coding genes of these R-M systems contain simple sequence repeats (SSRs) that are easy to distinguish. These simple repetitive sequences can be a repetitive nucleotide or an inverted repetitive sequence (inverted sequences, IS) (Atack et al., 2018). Pathogens that have adapted to a host can change the number of simple repeats in the open reading frame or shuffle the inverted repeats to generate phase variation (Dupont et al., 2009;Atack et al., 2018). For example, in H. pylori, some DNA methyltransferase genes encoding type II R-M systems contain SSRs that are related to bacterial colonization and pathogenicity (Lin et al., 2001;Ando et al., 2010;Gauntlett et al., 2014). Two different kinds of phase variation have been observed in bacterial R-M systems. Most Type I R-M systems have been shown to carry out phase variation via homologous recombination, Figure 4B. Type I systems are encoded by hsdR, hsdM, and hsdS; 4.3% of hsdR, 2% of hsdM, and 7.9% of hsdS genes contain simple repetitive sequences that can regulate phase variation (Seib et al., 2020). Regulation of Type I systems can either increase or decrease the number of simple repeats in the coding sequences of hsdM and hsdS. Phase variation can also be regulated through the recombination of the inverted repeats in hsdS to generate phase mutations. The number of simple repeats can regulate the expression of hsdM, which leads to the ON or OFF state of the DNA MTase. The recombination of inverted repeats or the change in the number of simple repeats in the open reading frame of hsdS leads to the production of HsdS with different specificities. Ultimately, the site at which methylation occurs changes (Manso et al., 2014).
Figure 4. Overview of bacterial DNA methylation and phase variation. (A) Bacterial restriction-modification systems. There are three main types of restriction-modification (R-M) systems and orphan methyltransferases. The Type I R-M system consists of three components, which are encoded by hsdR, hsdM, and hsdS, respectively. DNA methylation is mediated by a trimeric M2S complex, whereas DNA is cleaved by a pentameric R2M2S complex. Type II R-M systems are encoded by two individual genes. A single Mod subunit mediates the DNA methylation, whereas the DNA cleavage is mediated by one or two REase subunits. Type III R-M systems use two Mod (M2) subunits for DNA methylation, and R2M2 complexes for DNA cleavage. (B) Phase variation. Phase variation of the Type I R-M system is mediated via inverted repeats (red at 5′ end, blue in center), which allow recombination between expressed (hsdS) and silent (hsdS′) specificity genes. Each hsdS gene contains two target recognition domains (TRDs). Phase variation of type III R-M systems is achieved via slipped strand mispairing (SSM) of simple sequence repeats (SSR, in red) in the open reading frame of the mod genes. By losing a repeat unit, the open reading frame shifts from expression of a functional Mod (Mod ON) to transcriptional termination (Mod OFF) (Seib et al., 2020).
In H. influenzae, the open reading frame of hsdM, which encodes the type I phase variation regulator, contains simple repeats of a 5′GACGA3′ sequence. Changing these repeats can regulate the expression of hsdM and the sensitivity of H. influenzae to phage infection (Zaleski and Piekarowicz, 2004;Atack et al., 2018). In Neisseria gonorrhoeae, hsdS, which encodes the Type I regulator of phase variation, contains two open-reading frames, hsdSgoAV1 and hsdSgoAV2. There are multiple G residues at the 3′ end of hsdSgoAV1 (polyG tract). Both hsdSgoAV1 and hsdSgoAV2 are expressed when hsdSgoAV1 contains seven G residues at its 3′ end, whereas only hsdSgoAV1 is expressed when there are six G residues. Therefore, the methylation and/or restriction sites can undergo phase variation (Adamczyk-Poplawska et al., 2011).
Another example is the Type I phase variation regulator SpnD39III in Pneumococcus, which contains an hsdS with two target recognition domains (TRDs) and two inverted repeats (IRs). The TRD and IR sequences can be shuffled, resulting in six different versions of hsdS (SpnIII39A-F) with different specificities (Manso et al., 2014). Expression of genes involved in capsule synthesis is down-regulated if SpnIII39B is produced. Pneumococcus with SpnIII39A shows changes in the expression of genes involved in stress response and nutrient acquisition. However, no significant changes in gene expression have been detected with SpnIII39C or D (Manso et al., 2014).
In Campylobacter jejuni, there is a type II R-M system, Cj0031, containing a polyG tract that could participate in regulating other genes. Campylobacter jejuni switches the expression of Cj0031 ON or OFF by changing the number of G residues. This switching directly determines whether the DNA methyltransferase of the system is present, which in turn affects the methylation status of 5′CCYGA3′ sites on its genomic DNA. Some studies have found that when cj0031 is deleted or shut off by phase variation, genes related to adhesion and invasion of host cells, such as capA, cadF, and flpAde, are down-regulated, whereas peb1A, a gene encoding a periplasmic-binding protein associated with ABC transporters, is up-regulated (Anjum et al., 2016).
In H. influenzae, there are multiple modA alleles encoding DNA methyltransferases that are involved in phase variation. These alleles include modA1, modA2, modA4, modA5, modA9, and modA10 (Atack et al., 2015). These alleles contain simple repetitive sequences, and their expression can be switched ON or OFF (Srikhanta et al., 2005;Atack et al., 2015), Figure 4B. For example, the modA1 open reading frame contains several 5′AGCC3′ SSRs. The number of these repeats determines whether the expression of modA1 is ON or OFF. The modA1 expression level can affect the expression levels of sixteen other genes (Srikhanta et al., 2005). The bacteria can form a robust biofilm when modA2 is expressed. The level of high molecular protein (HMP) in the cell decreases when the expression of modA4 is turned off, which enhances escape from macrophages (Atack et al., 2015).
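The ON/OFF logic of SSR-mediated switching can be illustrated with a minimal Python sketch. It assumes a hypothetical toy open reading frame: the flanking sequences and the names `build_orf` and `mod_state` are invented for illustration and are not taken from modA1. Because the 5′AGCC3′ unit is four nucleotides long, gaining or losing one unit shifts the downstream reading frame, and the sketch reports ON only when translation reaches the final stop codon without meeting an earlier one.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

# Hypothetical toy ORF parts (NOT the real modA1 sequence).
UPSTREAM = "ATGGCT"               # start codon plus one in-frame codon
REPEAT_UNIT = "AGCC"              # tetranucleotide SSR unit named in the text
DOWNSTREAM = "GATGTAACTGACCGTTAA"  # in frame only when the repeat tract is a multiple of 3 nt


def build_orf(n_repeats: int) -> str:
    """Assemble the toy ORF with the requested number of SSR units."""
    return UPSTREAM + REPEAT_UNIT * n_repeats + DOWNSTREAM


def mod_state(n_repeats: int) -> str:
    """Return 'ON' if the first in-frame stop codon is the final codon, else 'OFF'."""
    seq = build_orf(n_repeats)
    for i in range(0, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            # A stop before the end of the ORF means a truncated product.
            return "ON" if i + 3 == len(seq) else "OFF"
    return "OFF"  # frame shifted so far that the designed stop is never read


if __name__ == "__main__":
    for n in range(2, 7):
        print(n, mod_state(n))
```

In this toy construct, repeat counts that are multiples of three keep the downstream sequence in frame, so losing or gaining a single unit flips the state. Real phase-variable genes differ in where the repeat tract sits and in which repeat numbers are in frame.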
Single-molecule real-time (SMRT®) sequencing and methylome analysis (Figure 5 inset) have revealed a 5′GACC3′ ModH5 methylation site in the promoter of flaA of H. pylori, which encodes the main component of the flagellar filament (Srikhanta et al., 2017). It has been suggested that methylation at the promoter region of flaA could directly regulate its expression. Similarly, regulators of Type III phase variation have also been reported in the pathogenic bacteria Neisseria gonorrhoeae (Srikhanta et al., 2009), Neisseria meningitidis, Mannheimia haemolytica (Srikhanta et al., 2010), and Moraxella catarrhalis (Blakeway et al., 2018). SSRs in the open reading frame of mod have been found to produce different regulators of Type III phase variation that affect the expression of other genes.

Gene regulation by Dam and CcrM
The Dam-regulated pap operon is a classic example of regulation by positive feedback (Van der Woude et al., 1996). Pap expression is regulated in a complex manner that involves PapB, PapI, Lrp (leucine-responsive regulatory protein), and Dam. There are five promoters (P1-P5) in the regulatory region of the pap operon (Løbner-Olesen et al., 1992;Marinus and Casadesús, 2009). There are two main sets of Lrp binding sites, each containing a Dam methylation site, 5′GATC3′. The methylation sites are named GATC-I and GATC-II (van der Woude et al., 1992). Once Lrp binds to GATC-II, it prevents Dam from methylating GATC-II, leaving it unmethylated and blocking RNA polymerase σ70 from binding. Meanwhile, GATC-I, which is not bound by Lrp, is methylated, and transcription of the pap operon is turned off (Pap-Off state). Conversely, when Lrp binds to GATC-I, leaving it unmethylated while GATC-II is methylated, transcription of the pap operon is turned on (Pap-On state) (Nou et al., 1995), Figure 6. It has been shown that the binding of PapI to Lrp in vitro reduces the affinity of Lrp for the first set of binding sites by half but increases the affinity of Lrp for the second set of binding sites fourfold (Nou et al., 1995). Thus, when PapI expression reaches a certain level, it promotes the transfer of Lrp to the second set of binding sites, shifting from a Pap-On to a Pap-Off state. It has been reported that the frequency of Pap-On to Pap-Off shifting is about 100 times higher than that of Pap-Off to Pap-On shifting (Blyn et al., 1990). In addition to Lrp, the entire regulatory region of the pap operon can be bound by H-NS (a histone-like nucleoid structuring protein) (White-Ziegler et al., 1998), phosphorylated CpxR (CpxR-P) (Hernday et al., 2004), and RimJ (White-Ziegler et al., 2002), which affect the methylation status of the region.
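The relationship between Lrp occupancy, methylation pattern, and expression state described above can be written as a small truth table. The following Python sketch is only an illustration of that logic: the function names are hypothetical, and it assumes that Lrp fully shields the site it occupies while Dam methylates the other site.

```python
from typing import Dict

SITES = ("GATC-I", "GATC-II")


def methylation_pattern(lrp_bound_site: str) -> Dict[str, bool]:
    """Return {site: is_methylated}, assuming Lrp shields the site it occupies from Dam."""
    if lrp_bound_site not in SITES:
        raise ValueError(f"unknown site: {lrp_bound_site}")
    return {site: site != lrp_bound_site for site in SITES}


def pap_state(pattern: Dict[str, bool]) -> str:
    """Map a methylation pattern to the expression state given in the text."""
    if pattern["GATC-I"] and not pattern["GATC-II"]:
        return "Pap-Off"   # GATC-I methylated, GATC-II unmethylated
    if pattern["GATC-II"] and not pattern["GATC-I"]:
        return "Pap-On"    # GATC-II methylated, GATC-I unmethylated
    return "undetermined"  # patterns the text does not assign to a state


if __name__ == "__main__":
    for site in SITES:
        print(site, pap_state(methylation_pattern(site)))
```

Patterns in which both sites are methylated or both are unmethylated are returned as "undetermined" because the text assigns no state to them.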
Another example of gene regulation is opvAB, a cytoplasmic membrane protein gene that can shorten the lipopolysaccharide O-antigen of Salmonella enterica (Cota et al., 2012). The expression of opvAB is turned on when the lipopolysaccharide O-antigen needs to become shorter in order to escape infection by bacteriophages, which attach to the O-antigen. However, a shortened O-antigen can reduce the ability of the bacterium to infect its host (Cota et al., 2015). In response to those needs, the expression of opvAB can be turned off in the absence of phage to restore the length of the O-antigen. It is worth noting that the genomes of E. coli and other γ-Proteobacteria have more 5′GATC3′ sequences than expected by chance (Hénaut et al., 1996;Sobetzko et al., 2016). This strongly suggests that Dam methyltransferase may be involved in regulating transcription of many other genes. Some studies have found that in E. coli, dam mutations could affect the expression of many genes related to aerobic respiration, stress and SOS response, and amino acid and nucleotide metabolism (Oshima et al., 2002;Adhikari and Curtis, 2016). Studies have shown that Dam methyltransferase is involved in regulating transcription of sci1 (Brunet et al., 2011), flu (van der Woude and Henderson, 2008), gtr (Broadbent et al., 2010;Sánchez-Romero and Casadesús, 2020) and std (García-Pastor et al., 2019) in E. coli and carA, dgoR, holA, nanA, ssaN, STM1290 and STM3276 (Sánchez-Romero and Casadesús, 2020). However, there is no obvious relationship between Dam-involved gene transcription regulation and the presence of 5′GATC3′ sequences in the promoter of the regulated gene (Horton et al., 2015). Although the expression of many genes changes in dam mutants, only a few of those genes contain a 5′GATC3′ sequence in their promoters (Seshasayee, 2007). Several factors must be considered.
In addition to 5′GATC3′, Dam methyltransferase can also react with 5′GTYTA3′/5′TARAC3′. The regulatory regions of the pap, foo and clp genes contain these non-5′GATC3′ sites (Horton et al., 2015). After the deletion of dam in enterohemorrhagic E. coli (EHEC), the expression of the Tir protein and the virulence protein EspFU increased, but their mRNA levels did not change (Campellone et al., 2007). In Yersinia enterocolitica, overexpression of Dam methyltransferase changed the composition of O-antigen, but the mRNA level of the O-antigen gene cluster did not change (Fälker et al., 2007). Similarly, the involvement of Dam methyltransferase in regulating the expression of hilD seems to be post-transcriptional (López-Garrido and Casadesús, 2010). Therefore, regulation of gene expression involving Dam can occur after transcription, a process which probably does not involve 5′GATC3′ sites on the genomic DNA. The mechanism of post-transcriptional regulation by Dam or other DNA methyltransferases is still unclear.
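Motifs such as 5′GTYTA3′/5′TARAC3′ are written with IUPAC ambiguity codes (Y = C or T, R = A or G, N = any base), and the two forms are reverse complements of each other, so a site can be found by scanning both strands. The following Python sketch shows one way to locate candidate sites. The function names and the test sequence are hypothetical, and the sketch only matches sequence; it says nothing about whether a site is actually methylated.

```python
import re
from typing import List, Tuple

# Subset of IUPAC nucleotide codes used by the motifs in the text.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "N": "[ACGT]"}
COMPLEMENT = str.maketrans("ACGTRYN", "TGCAYRN")


def reverse_complement(motif: str) -> str:
    """Reverse complement of a motif, keeping ambiguity codes consistent."""
    return motif.translate(COMPLEMENT)[::-1]


def motif_regex(motif: str) -> "re.Pattern[str]":
    """Compile a lookahead pattern so that overlapping sites are all reported."""
    return re.compile("(?=(" + "".join(IUPAC[base] for base in motif) + "))")


def find_sites(seq: str, motif: str) -> List[Tuple[int, str]]:
    """Return sorted (0-based start on the given strand, strand) for both strands."""
    hits = {(m.start(), "+") for m in motif_regex(motif).finditer(seq)}
    rc = reverse_complement(motif)
    if rc != motif:  # palindromic motifs such as GATC need only one scan
        hits |= {(m.start(), "-") for m in motif_regex(rc).finditer(seq)}
    return sorted(hits)


if __name__ == "__main__":
    toy = "AAGATCTTGTCTATTTAGACAA"  # hypothetical sequence for illustration
    print(find_sites(toy, "GATC"))
    print(find_sites(toy, "GTYTA"))
```

Coordinates for minus-strand hits are reported as the start of the reverse-complement match on the given strand, which keeps all positions in a single coordinate system.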
The genome of C. crescentus contains 4,542 CcrM methylation sites (5′GANTC3′), of which 23% are located inside a cluster of genes (Kozdon et al., 2013). In addition, it has been reported that when CcrM in C. crescentus is deleted or overexpressed, 10% of the genes are misexpressed, and expression of 380 genes changes significantly (Gonzalez et al., 2014). The genes that were significantly affected by the level of CcrM contain at least one 5′GANTC3′ site in their respective promoters. It has been shown that CcrM participates in regulating the transcription of flaY, podJ, ftsN, mipZ and ctrA in C. crescentus (Adhikari and Curtis, 2016). The ctrA gene contains two promoters, P1 and P2, each containing a 5′GANTC3′ methylation site (Domian et al., 1999). However, the 5′GANTC3′ site in P2 does not seem to be involved in the regulation of ctrA (Reisenauer and Shapiro, 2002). CcrM regulates ctrA through a corresponding transcriptional regulator, GcrA (Reisenauer and Shapiro, 2002). During DNA replication, the replication fork passes through P1, which makes P1 hemimethylated. GcrA binds to the hemimethylated P1 and activates it, driving transcription of ctrA and the production of a small quantity of CtrA protein (Holtzendorff et al., 2004). CtrA binds to a specific site on the promoter P2, leading to a rapid increase in the level of CtrA in the cell (Domian et al., 1999). The vast majority of CtrA comes from the P2 promoter, which demonstrates that the P1 promoter is only active in the hemimethylated state (Reisenauer and Shapiro, 2002). The timing and level of CtrA expression are critical to the cell cycle of C. crescentus. They coordinate the growth and division of pre-divisional cells. CtrA also regulates the expression of more than 90 genes (Laub et al., 2002). One of the genes that CtrA regulates is ccrM. The surge of CtrA leads to an increase of the CcrM level in the cell. The promoter of ccrM contains two 5′GANTC3′ methylation sites. When chromosomal DNA is replicated, two hemimethylated 5′GANTC3′ methylation sites are formed in the pre-divisional cell, resulting in the surge of CtrA and the increased expression of the ccrM gene (Stephens et al., 1996;Reisenauer et al., 1999).
Figure 5. Overview of the function of bacterial DNA methylation and formation of bacterial phenotypically heterogeneous subgroups. DNA methyltransferase changes the expression of the pap operon by epigenetic regulation and forms phenotypically heterogeneous populations (long pili vs. short pili). The DNA methylation defense mechanism can be used to protect the host genome from invasion by foreign DNA. Recent developments in single-molecule real-time (SMRT®) sequencing technology enable real-time sequencing of a large library of DNA fragments without PCR (dashed line inset; Attar, 2016;Chen J. et al., 2022). SMRT sequencing measures polymerase kinetics during the sequencing process, and thus detects DNA modification.

DNA methyltransferase and chromosome replication
Escherichia coli chromosome replication is controlled primarily at the level of initiation, and the frequency of initiation determines the rate of cell division (Bogan and Helmstetter, 1997). One important element in the regulation of the timing of initiation is the methylation status of the nascent DNA strand. Escherichia coli uses three different mechanisms to control the initiation of DNA replication: controlling the expression of dnaA (Campbell and Kleckner, 1990), sequestration of oriC, the replication origin, by SeqA protein (Bogan and Helmstetter, 1997), and controlling the activity of DnaA (Bramhill and Kornberg, 1988). Dam methyltransferase plays an important role in the first two mechanisms, Figure 7. After the genomic DNA is replicated, it changes from the original fully methylated state to a hemimethylated state. Compared with other 5′GATC3′ sites, the hemimethylated state of eight 5′GATC3′ sequences in the promoter region of dnaA and eleven 5′GATC3′ sequences in the oriC region of the chromosome can persist for about 10 min (Campbell and Kleckner, 1990). Usually, SeqA binds to the hemimethylated promoter region of dnaA before Dam methyltransferase, preventing Dam from acting on those hemimethylated sites and extending their hemimethylated status (Løbner-Olesen et al., 2003). SeqA binding to the hemimethylated 5′GATC3′ sites of the dnaA promoter region effectively inhibits the transcription of dnaA and keeps the DnaA concentration in the cell at a low level, thereby regulating the initiation of chromosome replication (Campbell and Kleckner, 1990). Further, SeqA also binds to the hemimethylated 5′GATC3′ sequences of oriC, preventing DnaA from binding to the origin of replication and preventing the initiation of DNA replication until cell division (Bogan and Helmstetter, 1997). Dam methyltransferase is also necessary for the initiation of chromosome replication in V. cholerae (Val et al., 2014).
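The sequence of events described for E. coli (full methylation, hemimethylation after replication, SeqA sequestration, and delayed remethylation by Dam) can be sketched as a small state model. The Python code below is a simplified illustration under stated assumptions: the class and function names are hypothetical, SeqA binding is restricted to the oriC and dnaA promoter regions named in the text, and timing (the roughly 10 min delay) is not modelled.

```python
from dataclasses import dataclass


@dataclass
class GatcSite:
    """One 5'GATC3' site, tracked as template strand plus nascent strand."""
    name: str
    old_strand_methylated: bool = True
    new_strand_methylated: bool = True
    seqa_bound: bool = False

    @property
    def state(self) -> str:
        count = int(self.old_strand_methylated) + int(self.new_strand_methylated)
        return {2: "full", 1: "hemi", 0: "unmethylated"}[count]


def replicate(site: GatcSite) -> GatcSite:
    """Semi-conservative replication: the nascent strand starts unmethylated."""
    return GatcSite(site.name, site.old_strand_methylated, False)


def seqa_bind(site: GatcSite, sequestered=("oriC", "dnaA promoter")) -> None:
    """Assumption: SeqA occupies hemimethylated sites only in the named regions."""
    site.seqa_bound = site.state == "hemi" and site.name in sequestered


def seqa_release(site: GatcSite) -> None:
    site.seqa_bound = False


def dam_methylate(site: GatcSite) -> None:
    """Dam completes methylation unless SeqA blocks access to the site."""
    if not site.seqa_bound:
        site.old_strand_methylated = True
        site.new_strand_methylated = True


if __name__ == "__main__":
    sites = [replicate(GatcSite("oriC")), replicate(GatcSite("elsewhere"))]
    for s in sites:
        seqa_bind(s)
        dam_methylate(s)
        print(s.name, s.state)
```

In this sketch the sequestered origin site stays hemimethylated after Dam acts, while the unsequestered site returns to the fully methylated state, and releasing SeqA lets Dam finish the job.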
Recent studies have found that E. coli Dam methyltransferase can also prevent abnormal oriC-independent chromosome replication, also known as constitutive stable DNA replication (cSDR) (Raghunathan et al., 2019). There are three phases in the progression of the cell cycle in C. crescentus: G1, S and G2. CcrM activity has recently been linked to GcrA, a master regulator of the cell cycle that controls the expression of several genes during S-phase (Mohapatra et al., 2014). In C. crescentus, three global regulators coordinate the entire cell cycle: DnaA, GcrA and CtrA (Adhikari and Curtis, 2016). DNA methylation by CcrM is involved in the regulatory activity of both GcrA and CtrA, which then affects the initiation of chromosome replication and asymmetric division in pre-divisional cells. Reducing the activity of CtrA could make the pre-division length of cells longer (Reisenauer and Shapiro, 2002). On the other hand, phosphorylated CtrA could bind to the origin of replication (Cori) on the chromosome to prevent the initiation of replication (Mohapatra et al., 2014;Mouammine and Collier, 2018). This also means that CcrM methyltransferase could only indirectly participate in regulating the cell cycle and chromosome replication in C. crescentus (Mouammine and Collier, 2018).

DNA methylation stabilizes bacterial evolution
DNA methyltransferases play a conservative role, maintaining the bacterial genome during bacterial evolution. The R-M systems methylate native DNA and cleave foreign DNA, thereby effectively protecting the DNA of their own cell. Studies have suggested that the R-M systems exhibit selfish behaviors and can be used to stabilize plasmids by post-segregational killing (PSK), even when the R-M systems encoded by the plasmids persist in their hosts (Naito et al., 1995;Kobayashi, 2001;Oliveira et al., 2014). Further, bacterial R-M systems can distinguish self DNA from foreign DNA by their methylation modifications, thereby preventing horizontal gene transfer (HGT) (Ershova et al., 2015). The flow of genetic information between bacterial cells by HGT drives bacterial evolution, and the R-M systems are key moderators of this process. In addition, Dam methyltransferase can maintain the bacterial genome by inhibiting transposable elements (Tomcsanyi and Berg, 1989) and transposable phage (Murphy et al., 2008).
The R-M systems also maintain DNA mismatch repair (MMR), which is a highly conserved biological pathway that plays a key role in maintaining genomic stability. The specificity of MMR is primarily for base-base mismatches and insertion/deletion mispairs generated during DNA replication and recombination (Li, 2008). During DNA repair, the methyl-directed mismatch repair protein MutH, which belongs to a family of type-II restriction endonucleases, recognizes hemi-methylated DNA sites and removes the nonmethylated daughter DNA strand, ensuring that the methylated parental strand will be used as the template for repair-associated DNA synthesis (Casadesús and Low, 2006).

Discussion
DNA methyltransferases have been studied for decades now. Partly due to their broad roles and wide varieties, there are still many open questions regarding their function, mechanism and applications.
Due to the nature of DNA methyltransferases (and the R-M systems), it has been difficult to use genetic methods to study whether and how they participate in regulating gene expression. The effects are often global and difficult to pinpoint. However, since the discovery and identification of regulators of phase variation, it has been proposed that many R-M systems share similar functions. Moreover, some bacteria contain as many as 50 R-M systems. It is difficult to accept that all 50 R-M systems only function to protect the integrity of the bacterial genome (Nobusato et al., 2000). Although many examples of regulators of phase variation involved in regulating gene expression have been found, the mechanisms remain to be elucidated. The origin and evolution of the R-M systems also raise many questions. Closely related strains have different R-M systems. Sometimes, distantly related species have similar R-M systems, which suggests that the R-M systems undergo horizontal transfer (Oliveira et al., 2014). It is possible that DNA methyltransferases promote bacterial evolution, while also stabilizing it. Studies have observed that bacteria can form subgroups, that is, cells with the same genotype but different phenotypes (Casadesús and Low, 2013). The formation of subgroups could be used as an adaptation strategy for bacteria to escape their host's immune system and unfavorable environments, or as a bet-hedging strategy. This risk prevention could improve survival when the environment changes, thus promoting evolution (Beaumont et al., 2009).
Figure 7. Dam methyltransferase is involved in the initiation of DNA replication. There are multiple GATC sites in oriC and in the dnaA promoter of E. coli. DnaA-binding sites are shown as green diagonally striped rectangles. DnaA binds to its binding sites in oriC and starts melting oriC to initiate replication. Then, SeqA binds to the newly hemimethylated DNA and excludes DnaA from binding, preventing re-initiation. In addition, SeqA binds to the hemimethylated dnaA promoter to prevent the expression of dnaA.
The roles of the R-M systems in providing immunity against horizontal gene transfer (HGT) and in stabilizing mobile genetic elements (MGEs) have been much debated. Some studies demonstrate that R-M systems can be inserted into mobile genetic elements (such as plasmids and prophages), and then be horizontally transferred together with the mobile genetic elements (Furuta et al., 2010). However, it is unclear which factors are involved in the process and whether there are any internal or external triggering factors. It is possible that there is a deep interplay of R-M systems with mobile genetic elements and horizontal transfer (Oliveira et al., 2014).
Although crystal structures of several bacterial DNA methyltransferases have been determined, the target DNA binding region, structure and mechanism remain to be elucidated. In addition, the number of crystal structures obtained is only a tiny portion of the total number of DNA methyltransferases, considering the diversity of these enzymes. High-resolution crystal structures are difficult to obtain, which may reflect certain characteristics of DNA methyltransferases (Kennaway et al., 2009, 2012;Loenen et al., 2014). It is still a mystery how DNA methyltransferases find their methylation sites accurately, given the enormous number of modification sites and the complex 4-dimensional structure present in genomic DNA. The debate on processive vs. distributive modes of methylation is ongoing.
How do DNA methyltransferases select target bases, and what is the driving force for target base inversion? As mentioned before, some DNA methyltransferases are processive enzymes, and some are distributive enzymes. It is still unclear how these two modes are differentiated. Is this because of differences in the structure of the enzymes? Understanding the mechanism of DNA methylation may provide opportunities for designing drugs such as methyltransferase inhibitors.
DNA methyltransferases are involved, often critically, in various cellular processes. Because mammals do not methylate DNA at adenine, bacterial MTases that target adenine, such as Dam and CcrM, represent excellent candidates for antibacterial targets. Initial success has been reported with compounds selectively targeting bacterial MTases (Mashhoon et al., 2006). It has also been reported that the survival of E. coli under the pressure of antibiotics is severely impaired when Dam methylation sites are blocked from modification. Therefore, inhibitors of Dam methyltransferase are likely to improve the efficacy of antibiotics, even if those inhibitors are not antibacterial by themselves (Cohen et al., 2016).
Moreover, DNA methylation modification is associated with various human diseases. Global DNA hypomethylation at CpG islands coupled with local hypermethylation is a hallmark of breast cancer (Shukla et al., 2010;Du et al., 2019). Therefore, methods that can analyze DNA methyltransferase activity with high accuracy and sensitivity may be able to screen for and identify certain diseases. Many new methods for detecting DNA methyltransferase activity have been developed, including electrochemiluminescent, colorimetric, chemiluminescent, fluorescent, electrochemical, and photoelectrochemical (PEC) methods (Yin et al., 2014;Chen et al., 2018;Hou et al., 2019).
Recent improvements in sequencing and other methods have facilitated bacterial epigenomic data collection. Methylome data of more than 2,470 bacteria and archaea have been obtained through single-molecule real-time sequencing and other technologies (Fang et al., 2012;Blow et al., 2016;Oliveira et al., 2020). Given the ubiquity of DNA methyltransferases in bacteria, it is likely that DNA methyltransferases are involved in many more cellular processes than we currently know. It was long thought that the epigenetic control of gene expression was the task of orphan methyltransferases. It was not until the emergence of a large number of regulators of phase variation that we realized that this is not the case (Seib et al., 2020). The DNA methyltransferase of a regulator of phase variation can regulate gene expression through epigenetic mechanisms and affect bacterial virulence (Kumar et al., 2018;Estibariz et al., 2019). It is not surprising that the effects of DNA methyltransferases are so far-reaching. After a pathogen infects its host, DNA methyltransferases of the pathogen are very likely to modify the host's genome (Chernov et al., 2015). The phenomenon of pathogens modifying the host genome has been reported, but its importance remains to be understood (Niller and Minarovits, 2016;Pereira et al., 2016;Kot et al., 2020).
Bacterial genomes contain numerous DNA methyltransferase modification sites. For example, the genome of C. crescentus contains 4,542 CcrM methylation sites, but only 23% of these are located inside ORFs. The function and significance of methylation sites outside ORFs are not yet fully understood (Kozdon et al., 2013).
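A figure such as the fraction of 5′GANTC3′ sites lying inside ORFs can be computed from a genome sequence and an ORF annotation. The Python sketch below illustrates the calculation on a toy sequence. The function names, the toy genome, and the ORF coordinates are hypothetical, ORFs are assumed to be non-overlapping half-open intervals, and a site is counted as inside an ORF when its first base falls within it.

```python
import re
from bisect import bisect_right
from typing import List, Tuple

# GANTC is its own reverse complement, so scanning one strand finds every site.
GANTC = re.compile(r"(?=GA[ACGT]TC)")


def gantc_positions(genome: str) -> List[int]:
    """0-based start positions of all (possibly overlapping) GANTC sites."""
    return [m.start() for m in GANTC.finditer(genome)]


def fraction_in_orfs(positions: List[int], orfs: List[Tuple[int, int]]) -> float:
    """Fraction of sites whose start lies inside a (start, end) half-open ORF."""
    if not positions:
        return 0.0
    orfs = sorted(orfs)
    starts = [start for start, _ in orfs]
    inside = 0
    for pos in positions:
        i = bisect_right(starts, pos) - 1  # last ORF starting at or before pos
        if i >= 0 and pos < orfs[i][1]:
            inside += 1
    return inside / len(positions)


if __name__ == "__main__":
    toy_genome = "GAATCAAAAGACTCTTTTGAGTCCCCCGATTC"  # hypothetical sequence
    toy_orfs = [(5, 15)]                             # hypothetical annotation
    sites = gantc_positions(toy_genome)
    print(sites, fraction_in_orfs(sites, toy_orfs))
```

The binary search keeps the lookup efficient when thousands of sites are checked against thousands of ORFs. Overlapping ORFs on opposite strands would need an interval tree or a merged interval list instead.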
It has been proposed that epigenetic memory systems have numerous potential applications in synthetic biology, including life biosensors, death switches or induction systems for industrial protein production. The large variety of bacterial DNA methyltransferases potentially allows for massive multiplexing of signal storage and logical operations depending on more than one input signal. A synthetic epigenetic memory system was designed by using engineered DNA-methylation-sensitive zinc finger proteins to repress a memory operon comprising ccrM and a reporter gene (Maier et al., 2017). Development on this frontier will undoubtedly establish new strategies for future bio-chip development.