A Quantum-Mechanical Looking Behind the Scene of the Classic G·C Nucleobase Pairs Tautomerization

For the first time, a comprehensive quantum-mechanical investigation of the physico-chemical mechanism of the tautomeric wobblization of the four biologically important G·C nucleobase pairs, with the participation of monomers in rare, in particular mutagenic, tautomeric forms (marked with an asterisk), has been carried out at the MP2/6-311++G(2df,pd)//B3LYP/6-311++G(d,p) level of theory. These novel tautomeric transformations (wobblization, or shifting of the bases within the pair) are intrinsic properties of the G·C nucleobase pairs. The results obtained go far beyond existing notions. It was shown that Löwdin's G*·C*(WC) base pair does not tautomerize according to the wobblization mechanism. Tautomeric wobblization of the G*·C*(rWC) (relative Gibbs free energy ΔG = 0.00/relative electronic energy ΔE = 0.00 kcal·mol−1) and G*t·C*(H) (ΔG = −0.19/ΔE = 0.29 kcal·mol−1) base pairs is preceded by stages of base pair tautomerization via single proton transfer (SPT). Here "r" denotes the reverse configuration of the base pair, "WC" the classic Watson-Crick configuration, "H" the Hoogsteen configuration, and "t" the trans position of the O6H hydroxyl group. It was established that the G*t·C*(rH) (ΔG = 2.21/ΔE = 2.81 kcal·mol−1) base pair can wobble through two different pathways via the traditional one-stage mechanism through TSs that are tight G+·C− ion pairs stabilized by only two intermolecular H-bonds. It was found that the G·C base pair is most likely incorporated into the DNA/RNA double helix with parallel strands in the G*·C*(rWC), G·C*(rwwc), and G*·C(rwwc) tautomeric forms ("w" denotes the wobble configuration of the pair), which are in rapid tautomeric equilibrium with each other. It was proven that the G*·C*(rWC) nucleobase pair is also in rapid tautomeric equilibrium with the eight tautomeric forms of the so-called Levitt base pair. A few cases of tautomerization of the nucleobase pairs via double proton transfer (DPT) with the participation of the C8H group of guanine were also revealed. The biological role of the obtained results is also discussed.


INTRODUCTION
Shortly after James Watson and Francis Crick established the spatial organization of the DNA molecule (Watson and Crick, 1953a,b), the tautomeric hypothesis was formulated (Watson and Crick, 1953b; Crick and Watson, 1954). It considers the transition of the nucleotide bases from the main (canonical) into the rare (mutagenic) tautomeric form to be the main source of spontaneous point mutations. Since then, tautomerism has remained an active topic up to the present day (Löwdin, 1963, 1966; Topal and Fresco, 1976; Florian et al., 1994; Gorb et al., 2004; Brovarets' et al., 2014; Godbeer et al., 2015; Turaeva and Brown-Kennerly, 2015).
However, until recently it was considered that only a few unusual tautomers existed for the G·C Watson-Crick nucleobase pair (Pous et al., 2008; Alvey et al., 2014; Brovarets' and Hovorun, 2014a; Nikolova et al., 2014; Poltev et al., 2016; Szabat and Kierzek, 2017; Brovarets' et al., 2019a,b; Srivastava, 2019). In particular, tautomerization via the double proton transfer (DPT) has been carefully investigated in the reverse Löwdin G*·C*(rWC), Hoogsteen G*′·C*(H), and reverse Hoogsteen G*′·C*(rH) base pairs (Brovarets' et al., 2019b), leading to the novel structures:
A great contribution to the further development of the tautomeric hypothesis was made by Per-Olov Löwdin (Löwdin, 1963, 1966) and by Topal and Fresco (Topal and Fresco, 1976; Brovarets' et al., 2014). Per-Olov Löwdin expressed the revolutionary, non-trivial opinion that the ability of the nucleotide bases to transform into the rare tautomeric form is provided by the electronic structure of the canonical DNA base pairs, and he qualitatively substantiated this assumption from the standpoint of quantum mechanics. Subsequently, Topal and Fresco elaborated this approach in more detail, using simple and visual models, and extended it to explain the limited accuracy of codon-anticodon recognition (Topal and Fresco, 1976; Brovarets' et al., 2014).
By utilizing modern quantum-mechanical (QM) methods, the mechanisms of the mutagenic tautomerization of pairs of nucleotide bases, which were revealed to be active players in spontaneous point mutagenesis, were investigated in detail (Brovarets' and Hovorun, 2018). It was established in which cases Löwdin's approach is adequate and in which cases it should be reconsidered and supplemented.
It was suggested that the mutagenic tautomerization of the DNA base pairs, in particular the classic Watson-Crick pairs, is accompanied by the mutual shifting (wobblization) of the bases relative to each other into the minor or major DNA groove during intrapair sequential proton transfer (Brovarets' and Hovorun, 2015a; Brovarets' et al., 2019a). This valuable finding enables researchers to figure out how incorrect DNA base pairs, whose architecture differs from the Watson-Crick configuration, can acquire the enzymatically competent conformation that guarantees their successful chemical incorporation into DNA, the main carrier of genetic information, by the high-fidelity DNA polymerase. Notably, even though these theoretical approaches have been applied to quite basic model objects, they correctly reflect the real state of affairs at the macromolecular level, since they have been experimentally confirmed for macromolecular objects.
In this research, the objects of investigation have been extended beyond the Watson-Crick (WC) nucleobase pair to the other biologically important G·C nucleobase pairs: reverse Watson-Crick G·C(rWC), Hoogsteen G·C(H), and reverse Hoogsteen G·C(rH). Also, it was established exactly why the classic A·T(WC) DNA base pair was selected for the construction of the genetic material (Brovarets' and Hovorun, 2009, 2015a,b,c,d,e,f; Brovarets' et al., 2018a). A novel mechanism of the mutagenic tautomerization of the biologically important A·T DNA base pairs through the quasi-orthogonal transition state and also through the protonated amino group (Brovarets' et al., 2018b,c,d,e,f) was revealed for the first time. Based on these data, an assumption was expressed about their possible biological role.
At the same time, investigating the mechanisms of the mutagenic tautomerization of pairs of nucleotide bases has proved to be quite a complicated issue, which may not be evident at first glance. Thus, the tautomerization mechanisms of the biologically important G·C nucleobase pairs, in which the monomers are in rare, in particular mutagenic, tautomeric forms, continue to puzzle researchers (Brovarets' et al., 2019a,b).
It is still not possible to formulate simple physico-chemical rules that would predict the course of these biologically important processes. Evidently, despite the enormous theoretical and experimental efforts of researchers, the available material remains insufficient for a final generalization.
This work aims to deepen the existing ideas about the microstructural mechanisms of the tautomerization of the biologically important pairs of nucleobases using the example of the G·C base pair (Brovarets' and Hovorun, 2014a), for which both monomers are in the rare tautomeric form.
Such a task is fully justified: we have found a few surprising tautomerizations, which significantly expand the existing ideas on tautomerization mechanisms and their biological applications. They are outlined and discussed in more detail below.

Density Functional Theory Calculations of the Geometry and Vibrational Frequencies
Equilibrium geometries of the investigated nucleobase pairs and the transition states (TSs) of their mutual tautomeric transformations, as well as their harmonic vibrational frequencies, have been calculated at the B3LYP/6-311++G(d,p) level of QM theory (Hariharan and Pople, 1973; Krishnan et al., 1980; Lee et al., 1988; Parr and Yang, 1989; Tirado-Rives and Jorgensen, 2008), using the Gaussian'09 program package (Frisch et al., 2010). The applied level of theory has proved successful for calculations of similar systems (Brovarets' and Hovorun, 2010a,b, 2015g; Matta, 2010; Brovarets' et al., 2015). A scaling factor of 0.9668 has been applied in the present work to correct the harmonic frequencies of all complexes and TSs of their tautomeric transitions (Palafox, 2014; Brovarets' and Hovorun, 2015g; Brovarets' et al., 2015; El-Sayed et al., 2015). We have confirmed the local minima and TSs, localized by the synchronous transit-guided quasi-Newton method (Peng et al., 1996), on the potential energy landscape by the absence or presence, respectively, of an imaginary frequency in the vibrational spectra of the complexes. We applied standard TS theory to estimate the activation barriers of the tautomerization reactions (Atkins, 1998).
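The frequency scaling and the minimum/TS check described above can be illustrated with a minimal Python sketch. The function names and the example frequencies are hypothetical and serve only as an illustration; the convention that imaginary modes are reported as negative wavenumbers, and the choice to scale every mode uniformly, are assumptions of this sketch rather than details stated in the text.

```python
# Minimal sketch: scale harmonic frequencies and classify a stationary point.
# Assumption: imaginary modes are listed as negative wavenumbers (cm^-1).

SCALE_FACTOR = 0.9668  # scaling factor quoted in the text


def scale_frequencies(freqs_cm1, factor=SCALE_FACTOR):
    """Return the harmonic frequencies multiplied by the scaling factor."""
    return [f * factor for f in freqs_cm1]


def classify_stationary_point(freqs_cm1):
    """Classify a stationary point by its number of imaginary frequencies."""
    n_imaginary = sum(1 for f in freqs_cm1 if f < 0.0)
    if n_imaginary == 0:
        return "minimum"
    if n_imaginary == 1:
        return "transition state"
    return "higher-order saddle point"


if __name__ == "__main__":
    # Hypothetical frequencies, not taken from the paper.
    example = [-1250.0, 45.3, 1710.8, 3520.1]
    print(classify_stationary_point(example))
    print(scale_frequencies(example))
```

In this sketch a structure with no imaginary mode is reported as a minimum and a structure with exactly one as a TS, which mirrors the criterion stated in the text.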
All calculations have been carried out in the continuum with ε = 1, which adequately reflects the processes occurring in real biological systems without depriving the bases of their structural and functional properties within DNA/RNA, and which satisfactorily models the substantially hydrophobic recognition pocket of the DNA-polymerase machinery as a part of the replisome (Bayley, 1951; Dewar and Storch, 1985; Petrushka et al., 1986; García-Moreno et al., 1997; Mertz and Krishtalik, 2000; Brovarets' and Hovorun, 2014a,b).
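As a rough illustration of how an activation barrier relates to a rate constant within standard TS theory, the sketch below uses the textbook Eyring equation, k = (k_B·T/h)·exp(−ΔG‡/RT). The temperature of 298.15 K, the function name, the example barrier, and the omission of any tunnelling correction are assumptions made for illustration only; they are not specified in this section.

```python
import math

# Physical constants (exact SI values; R expressed in kcal/(mol*K)).
K_BOLTZMANN = 1.380649e-23   # J/K
PLANCK = 6.62607015e-34      # J*s
GAS_CONSTANT = 1.98720425864083e-3  # kcal/(mol*K)


def eyring_rate(barrier_kcal_mol, temperature=298.15):
    """Textbook Eyring rate constant (s^-1) for a Gibbs barrier in kcal/mol.

    Assumption: transmission coefficient of 1, no tunnelling correction.
    """
    prefactor = K_BOLTZMANN * temperature / PLANCK
    return prefactor * math.exp(-barrier_kcal_mol / (GAS_CONSTANT * temperature))


if __name__ == "__main__":
    # Hypothetical barrier, not taken from the paper.
    print(f"k = {eyring_rate(10.0):.3e} s^-1")
```

A lower barrier gives an exponentially larger rate constant, which is why small differences in the computed barriers matter for judging whether a tautomeric equilibrium is rapid.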

Single Point Energy Calculations
Geometry optimizations were followed by single point electronic energy calculations at the MP2/6-311++G(2df,pd) level of theory (Frisch et al., 1990; Kendall et al., 1992).
The Gibbs free energy G for all structures was obtained in the following way:

$$G = E_{el} + E_{corr},$$

where $E_{el}$ is the electronic energy and $E_{corr}$ is the thermal correction.
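The sum of the electronic energy and the thermal correction described above, and the conversion to relative values in kcal·mol−1, can be sketched as follows. The structure names and all numerical energies are hypothetical placeholders, and the assumption that both inputs are given in Hartree is made for this sketch only.

```python
# Minimal sketch: Gibbs free energy as electronic energy plus thermal
# correction, reported relative to a chosen reference structure.

HARTREE_TO_KCAL_MOL = 627.509474  # standard conversion factor


def gibbs_free_energy(e_el, e_corr):
    """Gibbs free energy in Hartree: electronic energy + thermal correction."""
    return e_el + e_corr


def relative_gibbs_kcal(structures, reference):
    """Relative Gibbs free energies (kcal/mol) with respect to `reference`.

    `structures` maps a label to a tuple (e_el, e_corr), both in Hartree.
    """
    g_ref = gibbs_free_energy(*structures[reference])
    return {
        name: (gibbs_free_energy(*energies) - g_ref) * HARTREE_TO_KCAL_MOL
        for name, energies in structures.items()
    }


if __name__ == "__main__":
    # Hypothetical energies, not taken from the paper.
    data = {
        "pair_A": (-936.500000, 0.180000),
        "pair_B": (-936.498000, 0.179500),
    }
    for name, dg in relative_gibbs_kcal(data, "pair_A").items():
        print(f"{name}: dG = {dg:.2f} kcal/mol")
```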

Evaluation of the Interaction Energies
Electronic interaction energies $E_{int}$ have been calculated at the MP2/6-311++G(2df,pd) level of theory as the difference between the total energy of the base pair and the energies of the monomers, which have been corrected for the basis set superposition error (BSSE) (Boys and Bernardi, 1970; Gutowski et al., 1986) through the counterpoise procedure (Sordo et al., 1988; Sordo, 2001).
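The counterpoise-corrected interaction energy described above can be sketched in a few lines. This sketch assumes the usual Boys-Bernardi scheme, in which each monomer energy is evaluated in the geometry it has within the pair and in the full basis set of the pair; the function names and the numerical energies are hypothetical.

```python
# Minimal sketch: counterpoise-corrected interaction energy and the BSSE.
# All input energies are in Hartree; results are in kcal/mol.

HARTREE_TO_KCAL_MOL = 627.509474  # standard conversion factor


def interaction_energy_cp(e_pair, e_mono_a_pair_basis, e_mono_b_pair_basis):
    """Interaction energy with monomers computed in the full pair basis."""
    delta = e_pair - e_mono_a_pair_basis - e_mono_b_pair_basis
    return delta * HARTREE_TO_KCAL_MOL


def bsse(e_a_own, e_b_own, e_a_pair_basis, e_b_pair_basis):
    """BSSE estimate: monomer energy lowering caused by the partner's basis."""
    delta = (e_a_own - e_a_pair_basis) + (e_b_own - e_b_pair_basis)
    return delta * HARTREE_TO_KCAL_MOL


if __name__ == "__main__":
    # Hypothetical energies, not taken from the paper.
    e_int = interaction_energy_cp(-936.500, -541.730, -394.725)
    print(f"E_int (counterpoise-corrected) = {e_int:.2f} kcal/mol")
```

A negative value corresponds to a stabilizing interaction between the bases; the BSSE estimate is positive whenever the monomers are stabilized by the partner's basis functions.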

OBTAINED RESULTS AND THEIR DISCUSSION
Based on the obtained data, let us first formulate the basic results, which have been obtained for the first time and are most closely connected to structural biology and molecular biophysics (Figures 1, 2, Table 1).
Before discussing the investigated material, let us first turn to the novel mechanisms of the G*·C*(rWC) tautomerization, which complement the results of the previous work (Brovarets' et al., 2019a).
1. In the G*·C*(rWC) base pair, an unusual DPT tautomerization was observed with the participation of the protons at the N3(C) and N2(G) atoms (Figure 1, part I): G*·C*(rWC)↔G*tN2·C(rWC). This process is unusual, since the transfer of the proton from the C* to the G* base along the intermolecular (C)N3H...N1(G) H-bond provokes the rotation of the amino group of the G base into the trans-position relative to the C2=N3 double bond. As a result, a significantly non-planar TS G*·C*(rWC)↔G*tN2·C(rWC) of the tautomerization reaction, which proceeds through an asynchronous mechanism, is formed, along with the significantly non-planar product of the tautomerization, the G*tN2·C(rWC) base pair, which is stabilized by three intermolecular H-bonds: (G)O6H...O2(C), (G)N1H...N3(C), and (C)N4H...N2(G). Its characteristic structural features are significant non-planarity and out-of-plane deformation of the purine ring and of the O6H, N1H, and N2H atomic groups, with trans-orientation relative to the neighboring C2N3 bond.
2. Further, it was found that Löwdin's G*·C*(WC) DNA base pair, which is formed from the classic G·C(WC) DNA base pair through the DPT and is stabilized by three intermolecular H-bonds, (G)O6H...N4(C), (C)N3H...N1(G), and (G)N2H...O2(C) (Brovarets' and Hovorun, 2014a), does not tautomerize in the wobble manner.
In this case, all localized transition states of tautomerization in this manner and its pathways are the same as for the wobble mutagenic tautomerization of the G·C(WC) DNA base pair, which has been investigated and described earlier (Brovarets' and Hovorun, 2015a). In other words, in order to tautomerize in the wobble manner, Löwdin's G*·C*(WC) DNA base pair should revert to the classic G·C(WC) configuration (Brovarets' and Hovorun, 2015a, 2018). This striking fact allows us to claim that the functional role of the tautomeric G·C(WC)→G*·C*(WC) transition consists in the removal of the steric obstacles for the conformational G·C(WC)→G*·C*(rWC) transition (Brovarets' et al., 2019a) and is not directly related to the origin of spontaneous point mutations (transitions and transversions), as was suggested earlier (see Brovarets' and Hovorun, 2014a, and references therein for more details).
This conformational transition, in turn, guarantees the integration of the G·C(WC) nucleobase pair into DNA/RNA with parallel strands.
3. In contrast to the previously considered so-called correct and incorrect DNA base pairs (Brovarets' and Hovorun, 2009, 2015a,b,c,d,e,f; Brovarets' and Hovorun, 2018), the process of the tautomeric wobblization in the investigated G*·C*(rWC) (Figure 1, parts II and III), and G*t·C*(H) (Figure 1 (Brovarets' and Hovorun, 2014a). This dynamically unstable intermediate is associated with a local minimum on the potential (electronic) energy surface (PES). This situation is observed for the first time. Until now, the commonly accepted idea that mutagenic tautomerization of the classic DNA base pairs is assisted by an intermediate corresponding to a local minimum on the PES had not been confirmed.
The first process of tautomeric wobblization of the G*·C*(rWC) base pair (Figure 1, part II), G*·C*(rWC)↔G·C*O2(rWC)↔G·C*(rwWC)↔G*·C(rwWC), is most likely tightly connected with the incorporation of the G·C(WC) base pair into DNA/RNA with parallel strands (Watson and Crick, 1953b).
Another tautomerization process (Figure 1, part III), G*·C*(rWC)↔G*N2·C*(rwwc)↔G*·C*O2(rwwc), which proceeds through the unique TS G+·C−(rWC)↔G*·C*O2(rwwc) path with the (G)N1-H-O2(C) covalent bridge, most probably concerns the mechanisms of maintaining the RNA spatial architecture through the incorporation of the Levitt base pair, which is unstable in the main tautomeric state (Crick and Watson, 1954; Levitt, 1969). This suggestion is based on the established structural mechanism of the tautomeric interconversion of the G*·C*(rWC) pair into the eight stable planar tautomeric forms of the Levitt base pair (Watson and Crick, 1953a) (Figure 1, parts IV-VI): G*t·C*O2(rwWC), G*·C*O2(rwWC), G*t·C*tO2(rwWC), G*·C*tO2(rwWC), G*N2·C*(rwWC), G*tN2·C*(rwWC), G*N2·C*t(rwWC), and G*tN2·C*t(rwWC) (Figure 2, Table 1). In principle, it allows us to understand the dynamics of the formation of the Levitt base pair, which has not been considered before in the literature. It would be interesting to investigate in the future how the tautomers of the Levitt base pair are stabilized in RNA by H-bonds and the surrounding environment (Oliva et al., 2007).

4. A quite interesting situation is observed for the tautomeric wobblization of the G*t·C*(H) base pair (Figure 1, part VII): G*t·C*(H)↔G*N7·C(H)↔G*t·C*O2(wH)↔G*N7·C*(wH). The transition of the C*O2 tautomer of cytosine (C) within the G*t·C*O2(wH) base pair from the cis-orientation of the N4H imino group into the trans-orientation through its inversion leads to a decrease in energy during the tautomerization (Figure 1, part VIII). This decrease in energy occurs because the affinity of the C*tO2 tautomer for the "complementary" G*t tautomer is higher than that of the C*O2 tautomer. It more than compensates for the increase in the internal energy of the C*O2 tautomer upon its C*O2→C*tO2 tautomerization. In another pathway of the tautomeric wobblization of the G*t·C*(H) base pair (Figure 1, parts VII, VIII), the decrease in energy in the course of the process is achieved by the conformational transition of the G*t tautomer within the G*t·C(wH) complex into the low-energy mutagenic tautomeric form G*N7, which is a zwitterion.
5. Here, the G*t·C(wH)↔G*tN7·C*(wH) DPT tautomerization process does not actually occur, since its barrier is negative under normal conditions (Figure 1, part IX): G*t·C(wH)↔G*·C(wH)↔G*N7·C*(wH)↔G*tN7·C*(wH). The same situation is also observed for the G*t·C*(H)↔G*N7·C(H)↔G*t·C(wH)↔G*tN7·C*(wH) DPT tautomerization (Figure 1, part X).
6. Tautomeric wobblization of the G*t·C*(rH) base pair (Figure 1, part XI) occurs through two traditional pathways without any preparatory SPT stages, through TSs that are covalently bonded tight G+·C− ion pairs in the reverse Hoogsteen conformation supported by only two H-bonds: G*t·C*(rH)↔G*t·C*O2(rwH)↔G*tN7·C*(rwH) and G*t·C*(rH)↔G*t·C*(rwH)↔G*N7·C*(rwH).
The transition of the G*t tautomer within the G*t·C*O2(rwH) complex into the G* mutagenic tautomer through the orthogonal TS decreases the energy of the further process of tautomerization.
7. In addition to the previously revealed processes, DPT tautomerization was also observed with the participation of the proton at the C8 carbon atom of G, which leads to dynamically stable, but short-lived, complexes involving the ylidic forms of the G base (Figure 1, parts IX-XI).
8. Finally, there are three more observed puzzles that deserve more attention. Several G·C base pairs were observed in which both bases were in the rare tautomeric form and whose stabilization energy significantly exceeded the analogous values for the classic G·C(WC) DNA base pair.
Despite the structural softness of the heterocycles of the G and C bases with respect to out-of-plane deformational bending (Hovorun et al., 1999), no deviation from planarity was revealed in the investigated processes of tautomerization of the base pairs.
The obtained data convincingly show that among all possible tautomeric wobblizations of the G*·C*(rWC), G*t·C*(H), and G*t·C*(rH) DNA base pairs, which possess reverse Watson-Crick, Hoogsteen, and reverse Hoogsteen configurations and both monomers of which are in the rare tautomeric form, there is not a single non-dissociative transition that would restore the tautomeric status of both the G*/G*t and C* bases to the canonical G and C bases, respectively. This fact, together with the results obtained in our previous work (Brovarets' et al., 2019a), clearly shows why the Watson-Crick DNA base pairs were chosen for the building of genetic material (Brovarets' et al., 2018a).

CONCLUSION
In summary, having investigated the tautomeric wobblization of the biologically important G·C(WC), G*·C*(WC), G*·C*(rWC), G*t·C*(H), and G*t·C*(rH) nucleobase pairs, we have extended the existing ideas about the microstructural mechanisms of these processes, as well as about their functional roles. Thus, it was established that the G·C base pair is most likely incorporated into the DNA/RNA double helix with parallel strands in the form of the G*·C*O2(rWC), G·C*(rwWC), and G*·C(rwWC) tautomers, which are in rapid tautomeric equilibrium with each other.
For the first time, we have formulated rules defining these biologically important processes.

DATA AVAILABILITY STATEMENT
All datasets generated for this study are included in the article/supplementary material.

AUTHOR CONTRIBUTIONS
OB: idea formulation, setting of the task, calculation of the data, building of the graphs, data extrapolation, preparing, and proofreading of the draft of the manuscript. AM: idea formulation, calculation of the data, building of the graphs, preparing, and proofreading of the draft of the manuscript. DH: idea formulation, preparing, and proofreading of the draft of the manuscript. All authors contributed to the article and approved the submitted version.