Relevance of GC content to the conservation of DNA polymerase III/mismatch repair system in Gram-positive bacteria

The mechanism of DNA replication is one of the driving forces of genome evolution. Bacterial DNA polymerase III, the primary complex of DNA replication, consists of PolC and DnaE. PolC is conserved in Gram-positive bacteria, especially in the Firmicutes with low GC content, whereas DnaE is widely conserved in most Gram-negative and Gram-positive bacteria. PolC contains two domains, the 3′-5′ exonuclease domain and the polymerase domain, while DnaE possesses only the polymerase domain. Accordingly, DnaE does not have the proofreading function; in Escherichia coli, another enzyme, DnaQ, performs this function. In most bacteria, the fidelity of DNA replication is maintained by a 3′-5′ exonuclease and a mismatch repair (MMR) system. However, we found that most Actinobacteria (a group of Gram-positive bacteria with high GC content) appear to have lost the MMR system, and their chromosomes may be replicated by DnaE-type DNA polymerase III with a DnaQ-like 3′-5′ exonuclease. We tested the mutation bias of Bacillus subtilis, which belongs to the Firmicutes, and found that the wild-type strain is AT-biased while the mutS-deletant strain is remarkably GC-biased. If we presume that DnaE tends to make mistakes that increase GC content, these results can be explained by the mutS deletion (i.e., deletion of the MMR system). Thus, we propose that GC content is regulated by DNA polymerase and the MMR system, and that the absence of polC genes, which participate in the MMR system, may be the reason for the increase of GC content in Gram-positive bacteria such as Actinobacteria.


INTRODUCTION
Genetic mutations in organisms have various external causes, such as UV or other radiation and an oxidative environment. However, the main source of DNA mutation is replication errors made by the organism's own DNA polymerase. Although several DNA polymerases are conserved in the bacterial cell, the major DNA replication enzyme, DNA polymerase III, belongs to the Family C-type (McHenry, 2011). There are two types of Family C-type replication enzymes: PolC, which conserves the proofreading apparatus of the 3′-5′ exonuclease domain in addition to the polymerase domain; and DnaE, which possesses the replication enzyme, the α subunit. PolC and DnaE were originally Family A/B-type DNA polymerases but became Family C-type as the result of a mutation in the 3′-5′ exonuclease domain (Huang et al., 1997). The Family C-type polymerase is currently considered to be PolC, whereas DnaE and DnaQ are derived from the separation of its two domains (Huang et al., 1997).
Mutations produced during DNA replication are corrected by DNA repair enzymes or DNA replication proofreading enzymes, many of which are thought to be widely conserved as domain units, regardless of the species (Aravind et al., 1999). This fact supports the hypothesis that the α subunit and the 3′-5′ exonuclease segregated during the evolutionary history of Family C-type DNA replication enzymes.
Change in the GC content of an organism is a useful index in biological classification, especially in prokaryotic classification. A bioinformatic investigation suggested that changes in GC content and polymorphisms of bacterial PolC and DnaE have a co-evolutionary relationship (Wu et al., 2012). According to the phylogenetic classification of several proteins, bacteria and archaea diverged around 4 billion years ago, and bacteria further split into two large groups 2.5 billion years ago when oxygen concentrations on earth rose: one includes Proteobacteria and Spirochaetes, the other Actinobacteria, Cyanobacteria, and Firmicutes (Battistuzzi et al., 2004). This increase in oxygen concentration supposedly caused an explosive differentiation of bacterial species. However, there are still many unanswered questions and too few concrete answers to explain the kind of events that occurred. Actinobacteria are one of the bacterial groups considered to have diverged during this period of oxidation on earth. These bacteria possess a peptidoglycan layer and their genomes have a high GC content. On the other hand, Bacillus subtilis is Gram-positive and belongs to the Firmicutes, with a genome of low GC content (43%) compared to Actinobacteria. An in vitro reconstitution experiment has shown that DnaE and PolC are both essential for B. subtilis genome replication, with the former replicating the lagging strand and the latter the leading strand (Dervyn et al., 2001; Le Chatelier et al., 2004).
mutS is a representative DNA mismatch repair (MMR) gene that is conserved not only among bacteria but also in eukaryotes. In recent years, studies of B. subtilis have led to the proposal that MutS binds to DnaN (the clamp of the DNA replication complex), repairs mutations specifically on the non-methylated DNA strand, and dissociates the DNA replication complex from DNA while replication errors are being repaired (Lenhart et al., 2013).
We propose that the mutation factor is the interval between mutation causation by the DNA replication enzyme and restoration by the mutation repair enzyme. In this study, we examined the conservation of the DNA replication enzyme together with that of several DNA repair enzymes, and we performed an in vivo mutational analysis using B. subtilis.

PHYLOGENETIC CLASSIFICATION OF DNA POLYMERASE III AND MutS/L/T
To study the conservation of DNA polymerase III, MutS, MutL, and MutT in bacteria, we searched with the amino acid sequences encoded by the Escherichia coli genes dnaE (NP_414726), mutS (NP_417213), mutL (NP_418591), and mutT (NP_414641) by using DELTA-BLAST (Boratyn et al., 2012). The list of the investigated bacteria is shown in Supplemental Table 1. The acquired amino acid sequences of DNA polymerase III were analysed using MEGA5 (Tamura et al., 2011). All sequences were aligned by Clustal W. Gaps included in the alignment were deleted at the last alignment step. Subsequently, a phylogenetic tree was created by the neighbor-joining method based on the alignment file, and its reliability was assessed by the bootstrap method with 500 replicates.
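The gap deletion and bootstrap resampling described above can be illustrated with a minimal Python sketch. This is not the MEGA5 procedure itself: the function names (`drop_gap_columns`, `p_distance`, `bootstrap_columns`), the toy sequences, and the use of a simple p-distance are assumptions made for illustration, since the text does not state which distance model was used.

```python
import random

def drop_gap_columns(aligned):
    """Remove every alignment column that contains a gap in any sequence."""
    lengths = {len(seq) for seq in aligned.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    names = list(aligned)
    columns = zip(*(aligned[name] for name in names))
    kept = [col for col in columns if "-" not in col]
    return {name: "".join(col[i] for col in kept) for i, name in enumerate(names)}

def p_distance(a, b):
    """Proportion of differing sites between two gap-free aligned sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)

def bootstrap_columns(aligned, rng):
    """One bootstrap replicate: resample alignment columns with replacement."""
    length = len(next(iter(aligned.values())))
    picks = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in picks) for name, seq in aligned.items()}

# Hypothetical toy alignment (not data from this study).
toy = {"s1": "MKV-LA", "s2": "MKVQLA", "s3": "MRV-LG"}
clean = drop_gap_columns(toy)
replicate = bootstrap_columns(clean, random.Random(0))
```

In a full analysis, a distance matrix from each of the 500 replicates would be passed to a neighbor-joining routine, and the fraction of replicate trees containing a given clade would be reported as its bootstrap support.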

RIFAMPICIN MUTAGENESIS ASSAY USING B. subtilis
Two B. subtilis strains, strain 168 trpC2 [hereafter referred to as the wild type (WT) strain] and strain 168 trpC2 mutS::spec^r, were used for mutation analysis. A frozen stock of each strain was precultured on LB plates and cultured in LB liquid medium with rotation (48 rpm) at 37 °C for 8 or 24 h. Each culture was diluted to 10⁻⁵ and 10⁻⁶ and plated on LB plates, while the non-diluted cultures of the mutS strain and WT strain were plated on LB plates containing 5 μg/ml of rifampicin. Colony forming units (CFU) were calculated from the number of colonies after 24 h of incubation. The rate of mutation was defined as rifampicin-resistant CFU on LB plates with rifampicin per CFU on LB plates. This test was performed three times for each strain.
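The definition of the mutation rate given above can be written out as a short calculation. The following Python sketch is only an illustration: the plated volume of 0.1 ml, the colony counts, and the function names are hypothetical and are not taken from this study.

```python
def cfu_per_ml(colonies, dilution, plated_volume_ml):
    """Viable count of the undiluted culture.

    dilution is the dilution factor of the plated sample,
    e.g. 1e-5 for a 10^-5 dilution and 1.0 for an undiluted culture.
    """
    return colonies / (dilution * plated_volume_ml)

def mutation_frequency(rif_colonies, total_colonies, dilution,
                       plated_volume_ml=0.1):  # assumed volume, not from the source
    """Rifampicin-resistant CFU per total CFU.

    rif_colonies: colonies on LB + rifampicin (undiluted culture plated).
    total_colonies: colonies on plain LB at the given dilution.
    """
    resistant = cfu_per_ml(rif_colonies, 1.0, plated_volume_ml)
    total = cfu_per_ml(total_colonies, dilution, plated_volume_ml)
    return resistant / total

# Hypothetical counts: 12 resistant colonies, 150 colonies at a 10^-6 dilution.
freq = mutation_frequency(12, 150, 1e-6)
print(f"{freq:.1e}")
```

Because the same volume is assumed for both plates, it cancels out of the ratio; it is kept as a parameter only so that the intermediate CFU/ml values are meaningful.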

POINT MUTATION ANALYSIS IN THE rpoB REGION I
For the point mutation analysis, the two strains described above (WT and 168 trpC2 mutS::spec^r) were also used. Twenty colonies from the WT strain and 30 colonies from the mutS strain acquired from the rifampicin mutagenesis assay were selected randomly and the sequence of the rpoB region I was confirmed. The primers for PCR amplification and sequence analysis were rpoB +1157 (5′-gctacttcttcaacctgctgc-3′) for the forward and rpoB +1673 rev (5′-gttaccttccctgtttcagggtc-3′) for the reverse. Sequence analysis was performed by MACROGEN JAPAN (http://www.macrogen-japan.co.jp/).

EXTENSIVE LOSS OF MutS/L IN ACTINOBACTERIA
To confirm the correlation between the conservation of the MMR enzyme MutL/S/T and that of the bacterial DNA replication enzyme DNA polymerase III, we verified the conservation of bacterial DNA polymerase III and MutL/S/T by using a BLAST search. The genome analysis for all bacterial species investigated (Supplemental Table 1) has been completed or is currently in progress.
It is noteworthy that neither MutS nor MutL, both of which are MMR enzymes, was detected in Actinobacteria by BLAST search, whereas MutT, which decomposes 8-oxoguanine, was found not only in Actinobacteria but also in many other bacteria. Furthermore, we could not detect MutS, MutL, or MutT in Mycoplasma mobile strain 163K and Spiroplasma melliferum strain KC3 (Table 1).

STRAINS WITH EXTREMELY BIASED GC CONTENT OF DNA POSSESS THE SAME TYPE OF DnaE
Because MutS/L is not conserved in Actinobacteria, we speculated that the increase in GC content is due to amino acid sequence differences in the DNA replication enzyme domain of these bacteria. The aligned amino acid sequence of the DNA polymerase III α subunit domain from all bacteria was analysed. The algorithm of Clustal W was used for alignment, and phylogenetic analysis was performed by the neighbor-joining method based on the results of the alignment data (Figures 1A,B). From the results of phylogenetic analysis, the phylum Deinococcus-Thermus, which is known to have as high a GC content as Actinobacteria, and Clostridium thermocellum, which shows GC content as low as 39%, were included in the DnaE clade along with Actinobacteria, suggesting a relationship between the type of DNA polymerase and the instability of GC content.

THE MUTATION FREQUENCY OF THE MutS STRAIN IS HIGH IN B. subtilis 168 AND INDUCES MUTATIONAL BIAS OF BASE CHANGE(S) TO AT-GC
We deleted mutS of the Gram-positive B. subtilis 168 strain and measured the mutation frequency. As a result, the mutS strain showed a 100-fold higher frequency of mutation than the WT strain (Figure 2).
From the former experiment, we picked 20 rifampicin-resistant WT colonies and 30 rifampicin-resistant mutS colonies. Since rifampicin is known to act on region I of the RNA polymerase β subunit gene (rpoB) (Campbell et al., 2001), we performed sequence analysis in this region to investigate the frequency of point mutations (Figure 3A). Ninety percent of mutations in the WT strain and all mutations in the mutS strain were transition mutations (Figure 3B). The identification of these point mutations showed that the ratios of each base change were 11, 56, and 33% for the C-A, C-T, and A-G substitutions, respectively, in the WT strain, whereas they were 14% for the C-T substitution and 86% for A-G in the mutS strain (Figure 3C). The classification of these mutations is as follows: GC content-increasing (AT-GC); GC content-decreasing (GC-AT); and GC content-unchanging (AT=TA, GC=CG). Accordingly, the AT-increasing (GC-AT) mutation in the WT strain was 67% of the total mutations. In contrast, the AT-GC mutation was 86% in the mutS strain (Figure 3D). Therefore, it appears that the GC content increased as a result of the mutS deletion.
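The grouping of base changes into GC-increasing, GC-decreasing, and GC-unchanging classes can be checked with a small Python sketch. The percentages are those reported above; the function and variable names are introduced here only for illustration, and each substitution is written as (original base, new base) on one strand.

```python
PURINES = {"A", "G"}
GC_BASES = {"G", "C"}

def is_transition(ref, alt):
    """True for purine-purine or pyrimidine-pyrimidine changes."""
    return (ref in PURINES) == (alt in PURINES)

def gc_effect(ref, alt):
    """Classify a substitution by its effect on GC content."""
    if ref in GC_BASES and alt not in GC_BASES:
        return "GC-AT"      # GC content decreases
    if ref not in GC_BASES and alt in GC_BASES:
        return "AT-GC"      # GC content increases
    return "unchanged"      # AT=TA or GC=CG

def summarize(spectrum):
    """Sum the percentage of mutations in each GC-effect class."""
    totals = {"AT-GC": 0, "GC-AT": 0, "unchanged": 0}
    for (ref, alt), percent in spectrum.items():
        totals[gc_effect(ref, alt)] += percent
    return totals

# Percentages of each base change as reported in the text.
wt_spectrum = {("C", "A"): 11, ("C", "T"): 56, ("A", "G"): 33}
muts_spectrum = {("C", "T"): 14, ("A", "G"): 86}

wt_summary = summarize(wt_spectrum)
muts_summary = summarize(muts_spectrum)
```

With these inputs, the C-A and C-T changes in the WT strain fall together into the GC-AT class (11 + 56 = 67%), while the A-G changes account for the 86% AT-GC class in the mutS strain.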

DISCUSSION
The results of the mutation experiment with the B. subtilis 168 strain show that MutS suppresses the AT-GC mutation, based on evidence that the deletion of mutS induces a mutational bias toward AT-GC. In addition, many GC-AT mutations are observed in the B. subtilis 168 WT strain, suggesting the existence of other mechanisms that increase the AT content in this strain. MutS forms a complex with a clamp called DnaN during DNA replication in B. subtilis (Lenhart et al., 2013). While PolC replicates the leading strand of B. subtilis and has a proofreading function, DnaE, which synthesizes the lagging strand, does not possess this function. If the MutS/L complex takes a central role in MMR of the lagging strand, it is possible that the mutations in the MutS-deleted strain are located mainly in the lagging strand. In that case, DnaE as a replication enzyme would tend to create the AT-GC mutation, while MutS would repress this mutational bias.
Actinobacteria possess only the DnaE-type DNA polymerase (which does not contain 3′-5′ exonuclease activity) and do not conserve MutS/L. Furthermore, since Actinobacteria conserve MutT, it is not likely that an increase of 8-oxoguanine (8-OG) raises the GC content. Actinobacteria and Deinococcus, both with high GC content, are thought to have branched about 2.5 billion years ago, when the oxidation of the earth's atmosphere by cyanobacteria supposedly occurred (Battistuzzi et al., 2004). Many organisms existing at this time would have experienced this external oxidation event, and those intolerant of this change would not have survived. It is difficult to directly correlate the mutation induced by 8-OG, which should occur in any organism growing in an aerobic environment, with an increase in GC content. Basically, for a cell in a constant environment, the main force that mutates DNA is the replication error rate of the DNA polymerase. Thus, the loss of PolC and MutS/L might have induced the increase of GC content during the speciation of Actinobacteria.
Based on the results from the DNA polymerase classification, bacteria with extremely high or low GC contents such as Actinobacteria, Deinococcus-Thermus, and Clostridium were classified into the same clade (Figure 1B). Bacteria have experienced the appearance of both polC and dnaE genes during their evolutionary history. Assuming that these enzymes may originally have been the driving force pulling the GC content from both sides, this might suggest the existence of a balance between Tenericutes or Firmicutes and replicate DNA by using the PolC-type DNA polymerase. Since their GC contents are extremely low (23–40%), it has been suggested that the error bias of the DNA replication enzyme and lack of a mechanism to repress mutational errors has induced this extreme decrease. It remains unknown, however, how these bacteria maintain such low GC content, or the type of mechanism that exists to correct replication mistakes.
Since there is proximity between the DnaE-type DNA polymerases of the thermophiles Deinococcus-Thermus and C. thermocellum and one species of Actinobacteria, there seems to be no relationship between the increase and decrease of GC content and the type of DNA polymerase. However, it is possible that DNA polymerase is associated with extreme increases or decreases of GC content. While increased GC content in Deinococcus-Thermus and Actinobacteria is due to a balance between DNA polymerase and repair, C. thermocellum possesses not only DnaE-type but also PolC-type DNA polymerases, and the balance between these multiple polymerases and DNA repair lowered the GC content. Furthermore, the mutation analysis of B. subtilis strain 168, which replicates DNA by PolC and DnaE and in which deletion of mutS increases the GC content, indicates that MutS itself functions to repress increasing GC content. Since the function of MutS is the recognition of mismatches in MMR, it is unlikely that it has the potency to introduce mutations itself; therefore, its function to repress increasing GC content is thought to be a passive reaction. We first thought that the change of GC content was the result of the fitting of DnaE to the genomic mutation, which occurs due to the loss of MutS; however, this does not explain the change of GC content in Deinococcus-Thermus and C. thermocellum, which still maintain MutS/L. Therefore, we suggest that the increase of the GC content in Actinobacteria is induced by the type of its DNA replication enzyme and the loss of MutL/S. However, this analogy does not apply to examples such as M. mobile 163K and S. melliferum KC3, perhaps because the conditions necessary for the proof have not been established.

FIGURE 3 | Point mutation analysis in the rpoB region I. (A) Rifampicin-resistant regions of the RNA polymerase β subunit. Red marks indicate the clusters where rifampicin-resistant mutations have been identified. Since highly conserved residues in these regions directly interact with rifampicin, we investigated region I of the rpoB gene. (B) Comparison of the rpoB mutation bias between the B. subtilis 168 WT strain and the mutS strain focusing on mutation types. Red and yellow indicate transition mutation and transversion mutation, respectively. (C) Comparison of rpoB mutation bias between the B. subtilis 168 WT strain and the mutS strain focusing on a nucleotide change. For example, "A:G" means A to G change. (D) Comparison of rpoB mutation bias between the B. subtilis 168 WT strain and the mutS strain focusing on the GC bias. "AT-TA" and "GC-CG" indicate the transversion mutation without change of the GC bias. "AT-GC" and "GC-AT" indicate the transition and transversion mutation changing the GC bias.