Structural effects of inosine substitution in telomeric DNA quadruplex

The telomeric DNA, a distal region of the eukaryotic chromosome containing the guanine-rich repetitive sequence (TTAGGG)n, has been shown to adopt higher-order structures, specifically G-quadruplexes (G4s). Previous studies have implicated G4s in tumor inhibition through chromosome maintenance and manipulation of the expression of oncogenes featuring G-rich promoter regions. Besides higher-order structures, several regulatory roles are attributed to DNA epigenetic markers. In this work, we investigated how the structural dynamics of a G-quadruplex formed by the telomeric sequence are affected by inosine, a prevalent modified nucleotide. We used the standard (TTAGGG)n telomere repeats with guanosine mutated to inosine at each G position. The sequences (GGG)4, (IGG)4, (GIG)4, (GGI)4, (IGI)4, (IIG)4, (GII)4, and (III)4, bridged by TTA linkers, were studied using biophysical experiments and molecular modeling. The effects of metal cations on quadruplex folding were explored in both Na+ and K+ containing buffers using CD and UV-melting studies. Our results show that an antiparallel quadruplex topology forms with the native sequence (GGG)4 and the terminally modified DNAs (IGG)4 and (GGI)4 in both Na+ and K+ containing buffers. Specifically, a hybrid quadruplex was observed for (GGG)4 in K+ buffer. Among the other modified sequences, (GIG)4, (IGI)4 and (GII)4 show parallel features, while (IIG)4 and (III)4 show no detectable conformation in the presence of either Na+ or K+. Our studies indicate that the terminal lesions (IGG)4 and (GGI)4 may induce certain unknown conformations. The folding dynamics become undetectable in the presence of more than one inosine substitution, except for (IGI)4, in both buffer ions. In addition, both UV melting and CD melting studies implied that in most cases the K+ cation confers more thermodynamic stability than Na+.
Collectively, our conformational studies revealed the diverse structural polymorphisms of G4 with position dependent G-to-I mutations in different ion conditions.


Introduction
The postulated Watson-Crick model of DNA has revolutionized the study of genetics and underpinned the existing understanding of heritability in living organisms at a molecular level. The double helical structure of DNA encodes genetic information via complementarity with G:C and A:T pairs. This eminent double helical framework provided insights into both the accessibility and packaging of genetic materials. The precise sequence of bases dictates the instructional properties for downstream transcription and translation (Travers and Muskhelishvili, 2015). However, the discovery of non-helical DNA motifs and DNA modifications, and their critical roles in regulating gene expression (Liyanage et al., 2014), implies that DNA plays more than just the passive role of housing the genetic repertoire. While epigenetic markers control chromatin remodeling and transcription, telomeres at the distal region of the eukaryotic chromosome are one of the key factors in cell senescence, where high levels of telomerase activity are associated with tumorigenicity (Harley et al., 1990; Bodnar et al., 1998; Hahn et al., 1999; Mergny et al., 2002).
The telomeric regions consist of guanine-rich tandem repeats of (TTAGGG)n that can adopt a quadruplex structure with thymine and adenine in the connecting loops (Wellinger and Sen, 1997; Bryan, 2020). These guanine-rich motifs are also found in other regions of biological significance such as the c-myc promoter (Siddiqui-Jain et al., 2002; Seenisamy et al., 2004), the hypoxia inducible factor 1 alpha (HIF-1α) promoter (De Armond et al., 2005), the human insulin gene (Catasti et al., 1996) and disease-implicated repeats including those of fragile X syndrome (Fry and Loeb, 1994; Weisman-Shomer et al., 2000; Fojtik et al., 2004). Likewise, mis-regulation of quadruplex-associated proteins can lead to severe disorders such as Bloom (Sun et al., 1998) and Werner syndromes (Fry and Loeb, 1999). Moreover, in normal cells, the telomeric tandem repeats are shortened with cell divisions, which eventually leads to cell apoptosis, in contrast to immortal cancer cells, whose highly active telomerase can reverse cell senescence by repeat extension (Carrino et al., 2021). Since the G-quadruplex inhibits telomerase activity (Wang et al., 2011; Carrino et al., 2021), understanding the mechanism of quadruplex formation from linearized telomeric DNA induced by small molecule binding is fundamental in anticancer drug design (Neidle, 2010).
Deoxyinosine (dI) is a natural DNA nucleotide usually generated from the deamination of deoxyadenosine upon exposure to nitrosative compounds in the environment. The dI residue is read as dG by the replication machinery (Alseth et al., 2014). Owing to the lack of the N2-amino group in inosine and the similar electrostatic potentials of inosine and guanosine, the G-to-I mutation has commonly been used to study inosine-induced Hoogsteen base pairing, as well as the effects of ligand binding to both duplex and quadruplex DNAs (Nikolova et al., 2014; Zachary et al., 2019). Inspired by such prospects in cancer intervention, in this work we performed biophysical and computational studies to explore the conformational features of the telomeric tandem quadruplexes (TTAGGG)4 modified with inosines at different positions (Figure 1). The folding landscape of quadruplex formation in the presence of Na+ and K+ was also investigated and compared through CD, UV-melting and computational modeling studies.

Materials and methods
The single-stranded telomeric DNA and its inosine-modified analogs were purchased from IDT (Integrated DNA Technologies). The sequence mimics a small section of the TTAGGG repeat stretch, and its analogs are generated by inosine substitution at G1, G2 and G3. DNA samples were reconstituted in nuclease-free water to a final concentration of 1 mM.
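As an illustration of the sequence set, the following minimal Python sketch enumerates the eight G/I tract variants. It assumes each oligonucleotide is written literally as (TTA + tract) repeated four times, following the (TTAGGG)4 notation used in the text; the exact 5'/3' ends of the purchased oligonucleotides are not stated in this section, and the function names are hypothetical.

```python
from itertools import product

REPEATS = 4      # four tandem repeats, as in (TTAGGG)4
LINKER = "TTA"   # loop nucleotides written before each three-base tract (assumed order)


def build_sequence(tract: str, repeats: int = REPEATS, linker: str = LINKER) -> str:
    """Return (linker + tract) repeated `repeats` times, e.g. TTAGIG x 4."""
    if len(tract) != 3 or set(tract) - {"G", "I"}:
        raise ValueError("tract must be three letters drawn from G and I")
    return (linker + tract) * repeats


def all_variants() -> dict:
    """Map labels such as '(GIG)4' to sequences for all eight G/I tracts."""
    variants = {}
    for letters in product("GI", repeat=3):
        tract = "".join(letters)
        variants[f"({tract}){REPEATS}"] = build_sequence(tract)
    return variants


if __name__ == "__main__":
    for label, seq in all_variants().items():
        print(label, seq)
```

Here "I" simply stands for inosine in the one-letter string; vendor-specific ordering codes for deoxyinosine are not reproduced.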

Circular dichroism (CD) spectroscopy studies
CD and CD melting experiments for G4 were performed using a JASCO J-815 CD Spectropolarimeter. Standard CD measurements were carried out at room temperature (25 °C). CD melting used a temperature gradient spanning 15 °C to 85 °C at a rate of 0.5 °C/min. A 3 min delay was incorporated for every 5 °C increment. Spectra were recorded in a cell of 1 mm path length as an average of 3 scans from 350 nm to 200 nm at each temperature interval. Samples were prepared at 10 µM DNA concentration in either Na+ or K+ buffer (the same buffers as in the UV melting study). Samples were annealed by heating at 95 °C, slowly cooled to room temperature over 2 h, and incubated overnight at 4 °C prior to measurement. The resulting spectra were background-subtracted using a buffer blank.

UV Thermo-melting (Tm) studies
All thermal stability experiments for G4 were performed on a Cary UV-Vis Multicell Peltier spectrophotometer (Agilent Technologies). DNA quadruplexes were prepared in either Na+ or K+ buffer containing 10 mM Na2HPO4, 10 mM NaH2PO4, and 100 mM NaCl (for Na+ buffer) or 100 mM KCl (for K+ buffer) at pH 7, in a total volume of 600 µL and at a final concentration of 1.5 µM. Samples were first annealed at 95 °C for 5 min, slowly cooled to room temperature over 2 h, and incubated at 4 °C overnight prior to measurement. The thermal denaturation curve was determined by recording the absorbance at 290 nm as a function of temperature in a 1 cm quartz cuvette. Sigmoidal fitting with an R2 of 0.99 was used as a statistical standard to quantify the goodness of the fits (Supplementary Figures S1A, S1B). Each ultraviolet melting curve was measured from 10 °C to 85 °C at a rate of 0.5 °C/min. All measurements were taken four times or with two complete cycles of heating and cooling. Meltwin 3.5 software was then used to obtain the thermodynamic parameters by analyzing the fitted melting curve of the DNA quadruplex. The melting temperature of the quadruplex was identified by the maximum in the first derivative of the best-fit curves (Supplementary Figures S2A-S2F).
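The first-derivative criterion for the melting temperature can be sketched as follows. This is a minimal illustration on synthetic data, not the Meltwin procedure: the sigmoid form, its parameters and the function names are assumptions, and the derivative is taken by central finite differences on a sampled curve rather than on the authors' best-fit curves.

```python
import math


def sigmoid(temp_c, low, high, tm, width):
    """Synthetic melting curve: absorbance falls from `high` to `low` around `tm`."""
    return low + (high - low) / (1.0 + math.exp((temp_c - tm) / width))


def first_derivative(temps, values):
    """Central finite differences at interior points; returns (temps, dA/dT)."""
    mids, slopes = [], []
    for i in range(1, len(temps) - 1):
        slopes.append((values[i + 1] - values[i - 1]) / (temps[i + 1] - temps[i - 1]))
        mids.append(temps[i])
    return mids, slopes


def melting_temperature(temps, values):
    """Temperature at which |dA/dT| is largest (works for rising or falling curves)."""
    mids, slopes = first_derivative(temps, values)
    idx = max(range(len(slopes)), key=lambda i: abs(slopes[i]))
    return mids[idx]


if __name__ == "__main__":
    # 10 to 85 degrees C in 0.5 degree steps; the curve parameters are illustrative only.
    temps = [10 + 0.5 * i for i in range(151)]
    curve = [sigmoid(t, 0.2, 1.0, 61.4, 3.0) for t in temps]
    print(melting_temperature(temps, curve))
```

Because the estimate is read off a sampled grid, its resolution is limited by the temperature step; fitting before differentiating, as described above, avoids amplifying measurement noise.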

Computational modeling
3D molecular models were generated for the native and inosine-modified sequences to gain structural insights and supplement our experimental observations. Experimentally determined structures of the telomeric repeat sequence in parallel (PDB ID: 1KF1) (Parkinson et al., 2002), antiparallel (PDB ID: 143D) (Wang and Patel, 1993) and hybrid conformations (PDB ID: 5LQG) (Galer et al., 2016) were used as reference structures for the native construct. G-to-I mutations were then performed in MOE (Molecular Operating Environment) [https://www.chemcomp.com/Products.htm] to create 3D structures for each of the inosine-modified analogs. The structures were energy minimized to obtain the final structures with the inosines accommodated. The structures were then visualized in PyMOL (The PyMOL Molecular Graphics System, Version 2.0, Schrödinger, LLC; https://pymol.org/2/). For hydrogen bond analysis, the donor-acceptor distance cutoff and the acceptor-donor-hydrogen angle cutoff were set at 3.3 Å and 30°, respectively.
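The geometric hydrogen-bond criterion stated above can be written out as a short Python sketch. Only the two cutoff values come from the text; the function names, the inclusive comparisons and the example coordinates are assumptions, and the sketch does not reproduce how MOE or PyMOL evaluate hydrogen bonds internally.

```python
import math

DIST_CUTOFF = 3.3    # Angstrom, donor-acceptor distance cutoff from the text
ANGLE_CUTOFF = 30.0  # degrees, acceptor-donor-hydrogen angle cutoff from the text


def _sub(a, b):
    """Component-wise difference a - b of two 3D points."""
    return tuple(x - y for x, y in zip(a, b))


def _norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in v))


def adh_angle(acceptor, donor, hydrogen):
    """Angle in degrees at the donor between donor->acceptor and donor->hydrogen."""
    da = _sub(acceptor, donor)
    dh = _sub(hydrogen, donor)
    cos_t = sum(x * y for x, y in zip(da, dh)) / (_norm(da) * _norm(dh))
    cos_t = max(-1.0, min(1.0, cos_t))  # guard against rounding just outside [-1, 1]
    return math.degrees(math.acos(cos_t))


def is_hydrogen_bond(donor, hydrogen, acceptor,
                     dist_cutoff=DIST_CUTOFF, angle_cutoff=ANGLE_CUTOFF):
    """True when both the distance and the angle criteria are satisfied."""
    distance = _norm(_sub(acceptor, donor))
    return distance <= dist_cutoff and adh_angle(acceptor, donor, hydrogen) <= angle_cutoff


if __name__ == "__main__":
    # Hypothetical coordinates (Angstrom): a nearly linear donor-H...acceptor arrangement.
    print(is_hydrogen_bond((0.0, 0.0, 0.0), (1.0, 0.1, 0.0), (2.9, 0.0, 0.0)))
```

Applied to every donor/acceptor pair within a quartet, a count of this kind is one way to tabulate the hydrogen bonds retained or lost after a G-to-I substitution.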

Results and discussion
Circular dichroism spectrum of telomere and its analogs in Na+/K+ buffer
To understand how the inosine substitutions and the buffer ions affect the conformation of these telomeric repeats, we first investigated the G-quadruplex (GQ) configurations (parallel, antiparallel, and hybrid) using spectroscopic methods. CD has been extensively used to examine the structural conformation of G-quadruplexes, with signature absorbance bands occurring between 200 and 300 nm that are associated with parallel or antiparallel sugar-phosphate backbones (Gray et al., 2008). Numerous studies have reported that the typical antiparallel CD signature shows a positive band at 295 nm and a negative band at 265 nm. In contrast, a parallel CD signature exhibits a positive band at 260 nm and a negative band at 240 nm (Balagurumoorthy et al., 1992; Guo et al., 1993; Xu et al., 2006).
The CD spectra for the unmodified human telomeric repeats and their inosine-substituted analogs in the presence of 100 mM Na+ or K+ monovalent cations are shown in Figure 2. The CD spectra vary significantly from sequence to sequence and also between the two cations. In Na+, (GGG)4 exhibits bands at +295 nm, −264 nm and +213 nm, consistent with an antiparallel GQ, but in K+ it exhibits a hybrid topology with bands at +285 nm and −236 nm (Figures 2A,B). (GGI)4 and (IGG)4 show very similar behavior in Na+ and K+ buffers. Although the spectra exhibit several bands, for (GGI)4 at +297 nm, −276 nm, +256 nm, −236 nm and +216 nm, and for (IGG)4 at +292 nm, −274 nm, +254 nm, −236 nm and +215 nm, the positive band around 295 nm and the slightly shifted negative band at 275 nm are consistent with an antiparallel GQ topology. On the other hand, (GIG)4 exhibited bands at +268 nm, −240 nm and +216 nm, suggesting a parallel topology in both cation buffers.
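The band-position rule of thumb quoted above (antiparallel: positive near 295 nm and negative near 265 nm; parallel: positive near 260 nm and negative near 240 nm) can be expressed as a small Python heuristic. This is a sketch under stated assumptions, not the authors' assignment procedure: the 10 nm tolerance, the labels and the function names are hypothetical, and band intensities are ignored.

```python
# Reference band positions (nm) taken from the rule of thumb in the text.
ANTIPARALLEL = {"positive": 295.0, "negative": 265.0}
PARALLEL = {"positive": 260.0, "negative": 240.0}


def _has_band_near(bands, target, tol):
    """True if any band position lies within `tol` nm of `target`."""
    return any(abs(b - target) <= tol for b in bands)


def _matches(positive, negative, signature, tol):
    """True if both the positive and the negative signature bands are present."""
    return (_has_band_near(positive, signature["positive"], tol)
            and _has_band_near(negative, signature["negative"], tol))


def classify_topology(positive, negative, tol=10.0):
    """Classify from band positions alone; `tol` (nm) is an assumed tolerance."""
    anti = _matches(positive, negative, ANTIPARALLEL, tol)
    para = _matches(positive, negative, PARALLEL, tol)
    if anti and para:
        return "ambiguous"
    if anti:
        return "antiparallel"
    if para:
        return "parallel"
    return "unclassified"


if __name__ == "__main__":
    # Band positions reported in the text for (GGG)4 in Na+ and for (GIG)4.
    print(classify_topology([295, 213], [264]))
    print(classify_topology([268, 216], [240]))
```

Positions alone cannot settle spectra with several overlapping bands, such as those reported for (GGI)4 and (IGG)4, and the sketch has no hybrid category; in those cases the full spectral shape and band intensities are needed.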

UV Thermo-melting (Tm) studies of inosine G4 strands in Na+/K+ buffer
We next investigated the effect of G-to-I substitutions on the stability of potentially formed guanine tetrads by UV thermal melting analysis in the presence of Na+ and K+ metal cations. Normalized optical melting curve profiles are shown in Figure 3 for each strand. Among all the examined sequences, native (GGG)4 exhibited a sigmoidal melting curve with a negative slope in both Na+ and K+ buffers. K+ ions impart more stability, as indicated by a higher melting temperature (Figure 3; Table 1). Interestingly, the terminal inosine-modified DNAs (IGG)4 and (GGI)4 displayed aberrant sigmoidal fitting curves in Na+ and K+ buffers (Figures 3A,B). In our CD data, these sequences present an antiparallel GQ topology. We speculate that I1 and I3 substitutions only partially weaken the G-quartets, allowing for a transition stage with a hybrid conformation before the structure finally melts. Hagen and coworkers studied a 24 nt RNA GQ with inosine substituted at flanking positions and reported a two-step unfolding upon melting: the GI tetrad initiated the unfolding, accompanied by the loss of one K+ ion, followed by unfolding of the two-stacked G4 at higher temperature with the release of the remaining K+ ions. Moreover, our results agree with those of Hagen and coworkers in that inosine induced a ~7 °C-8 °C decrease in Tm, compared to 10 °C, owing to the loss of one H-bond per inosine addition in K+ buffer (Hagen et al., 2021). Although the sequences differ in RNA, GQ formation is the same as in DNA.
(GIG)4 showed better stability, with a slightly higher melting temperature in K+ than in Na+ buffer (Table 1). This is consistent with the literature, in which K+ ions stabilize quadruplex structure more effectively than the smaller Na+ ions (Largy et al., 2016; Zaccaria et al., 2016).
Our CD data show a parallel configuration for (GIG)4, in which the inosines form a weaker quartet and are sandwiched between the two G-tetrads, leading to a stronger quadruplex (refer to the modeling section for details; Figure 5, Supplementary Figures S6-S8). A study reported by Tanaka and coworkers with 37 nt human telomeric DNA showed that central G-to-I substitution at various positions produced GQs with different loop arrangements and that the shorter loop is thermodynamically more stable than the longer loop. Furthermore, GIG could also be incorporated within a single loop (Atsushi Tanaka and Tetsuro, 2014). A similar scheme with longer (five to seven) human telomeric repeats, used in NMR studies by Yue and coworkers, showed that selectively modifying the central guanosine with inosine resulted in a (3 + 1) form 2 GQ with one extended double-chain-reversal or propeller loop and two edgewise loops. The group further demonstrated that the propeller loop could harbor one or more GGGTTA repeats. This is consistent with our results that (GIG)4 contained stable parallel TTA propeller loops in the four-repeat human telomeric sequence. Moreover, the Tm value detected for our native (GGG)4 in K+ is 61.4 °C, which is comparable to the Tm of 62.0 °C reported by Yue's group (Yue et al., 2011).

FIGURE 2 CD spectra of the human telomeric repeat and its inosine-substituted analogs in 100 mM Na+ or K+ containing buffer. Native sequence compared with one, two or three inosine substitutions in Na+ (A,C) and K+ (B,D) buffers.
Sequences with more than one inosine modification exhibited non-sigmoidal behavior except (IGI)4, which shows a positive sigmoidal melting curve in both buffer ions. This data further supported our assumption that the formation and the

CD melting data for the telomere and its analogs in Na+/K+ buffer
While UV Tm data inform stability via melting temperature assessment, they give little information regarding the stability of topological conformations during the melting process. To further understand the conformational stability associated with unfolding of these telomeric repeat sequences, we obtained the CD melting spectra over a temperature range of 15 °C-85 °C. The spectral profiles as a function of temperature are shown as a heatmap in Figure 4, where the color represents the standardized CD spectral intensity: a purple band indicates a negative peak and a red band indicates a positive peak. This representation provides a clear view of transitions between topologies. The classical representation of the CD spectral profiles is shown in Supplementary Figure S5A for Na+ and Supplementary Figure S5B for K+ for reference.
For the native (GGG)4 in 100 mM Na+, the spectra show a transition from antiparallel to parallel GQ topology, as the (+295 nm, −264 nm) band pair shifts to (+262 nm, −240 nm). This transition happens in the 50 °C-60 °C range, which is consistent with the observed melting temperature in our UV melting data (Table 1). On the other hand, in 100 mM K+, (GGG)4 delineated two stages of transformation, characterized by a hybrid conformation over the temperature range from 15 °C to 60 °C and an antiparallel conformation over 65 °C-70 °C, with a transition temperature around 60 °C as well, consistent with our UV melting data (Table 1). This melting temperature is higher than that of (GGG)4 in 100 mM Na+. In K+, (GGG)4 eventually folds into a parallel topology between 75 °C and 85 °C (Figure 4D).
Interestingly, for (IGG)4 and (GGI)4 in Na+, the antiparallel state is less prominent and the transition from antiparallel to parallel GQ topology happens at a much lower temperature than for the native sequence, between 20 °C and 25 °C for (IGG)4 (Figure 4B) and between 30 °C and 35 °C for (GGI)4 (Figure 4C). The parallel conformation is still strong even at higher temperatures, as indicated by the (+260 nm, −245 nm) bands. In K+, while both (IGG)4 and (GGI)4 follow the same trend as in Na+ buffer, the transition temperature is higher, between 35 °C and 40 °C for (IGG)4 (Figure 4E) and between 45 °C and 50 °C for (GGI)4 (Figure 4F). (GIG)4 tends to maintain a stable parallel topology over the entire temperature range of 15 °C-85 °C (Supplementary Figures S3A, S3D) in both Na+ and K+ buffers. Among the sequences with multiple inosine substitutions, (IGI)4 and (GII)4 showed a stable parallel topology over the temperature range of 15 °C-85 °C (Supplementary Figures S3B, S3C and S3E, S3F), while (IIG)4 and (III)4 showed no clear conformation (Supplementary Figures S4A-D), which is consistent with the UV melting data, where no clear structural transitions were observed in either Na+ or K+ buffer.

In silico modeling studies of inosine substituted G-quadruplexes
In light of the various experimental results, we employed in silico modeling studies to explore the structures that these sequences could adopt. Specifically, we used experimentally determined structures of telomeric repeats as references and generated inosine-substituted structures for parallel, antiparallel and hybrid topologies (Supplementary Figures S6-S8). The hybrid conformation is a G-quartet dimer from previous NMR studies in the presence of K+ ions at neutral pH (Brahmachari, 1994). In the parallel form, the substituted inosines in the GGG repeat sequence, at any of the three positions, end up as part of the same G-quartet. However, in the antiparallel topology, G-to-I mutations at G1 and G3 are distributed over two quartets, while a mutation at G2 leads to an "I-quartet". In the hybrid topology with the two-quartet conformation, inosine substitutions also result in the inosines being either distributed between the quartets or located in the connecting loops or the stabilizing triplets at the ends (Figures 5A-C). Within the G-quartets of the quadruplex, a G-to-I mutation leads to a loss of one hydrogen bond (as shown in Figures 5D,E).
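The distribution of inosines over the quartets described above can be reproduced with a simple counting model. The Python sketch below is an idealization, not a structural calculation: it assumes three-layer quadruplexes, all four strands running in the same direction for the parallel form, and alternating strand directions (two up, two down) for the antiparallel form. It ignores loops, glycosidic conformations and the two-quartet hybrid model, and the function names are hypothetical.

```python
def quartet_layers(tract, topology, repeats=4):
    """Return the bases in each quartet layer for an idealized strand arrangement."""
    if topology == "parallel":
        directions = [+1] * repeats  # all strands run the same way
    elif topology == "antiparallel":
        # Assumed arrangement: strand direction alternates, two up and two down.
        directions = [+1 if i % 2 == 0 else -1 for i in range(repeats)]
    else:
        raise ValueError("topology must be 'parallel' or 'antiparallel'")
    last = len(tract) - 1
    layers = []
    for k in range(len(tract)):
        # An 'up' strand contributes position k; a 'down' strand contributes the mirror position.
        layers.append("".join(tract[k] if d > 0 else tract[last - k] for d in directions))
    return layers


def inosines_per_layer(tract, topology):
    """Count inosines in each quartet layer."""
    return [layer.count("I") for layer in quartet_layers(tract, topology)]


if __name__ == "__main__":
    for tract in ("IGG", "GIG", "GGI"):
        for topology in ("parallel", "antiparallel"):
            print(tract, topology, inosines_per_layer(tract, topology))
```

In this model a single substitution always yields one all-inosine layer in the parallel arrangement, whereas in the antiparallel arrangement a terminal substitution is split between the two outer layers and only the central substitution gives a full I-quartet, in line with the description in the text.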
In our experiments we observed that the native quadruplex forms an antiparallel topology in the presence of Na+ ions and a hybrid topology in the presence of K+ ions, in agreement with the previously determined structures (Brahmachari, 1994; Zhang et al., 2003; Rujan et al., 2005; Antonacci et al., 2007; Tucker et al., 2018). Interestingly, the CD melting data show that (IGG)4 and (GGI)4 transition from antiparallel to parallel quadruplexes at higher temperatures in both cation buffers. The structure models of these sequences suggest that the single I-quartet in parallel quadruplexes is better accommodated than the mixed IG-quartets in antiparallel structures. Extending this same rationale, while (GIG)4 starts off in and continues to maintain a parallel topology at higher temperatures, our modeling study suggests that it has the potential to be stabilized as an antiparallel quadruplex as well. Both (GII)4 and (IGI)4 also maintain their parallel topologies at higher temperatures, showing that multiple I-quartets are well tolerated in the structure as long as there is one intact G-quartet, while (III)4 does not show any CD spectra consistent with quadruplex formation, suggesting that all-I quartets cannot hold the structure in either parallel or antiparallel topologies. An exception is the (IIG)4 strand, which behaves like (III)4 instead of (GII)4 even though (GII)4 and (IIG)4 are both expected to form symmetric structures. The loop nucleotides adjacent to the I-quartet in (IIG)4 are adenosines (As), while those adjacent to the I-quartet in (GII)4 are thymines (Ts). This difference could be a contributing factor in the formation or melting of these structures.

Conclusion
Overall, our study suggests that inosine substitutions in the telomere tandem repeats of the (TTAGGG)n sequence could alter local base pairings within the core of the quadruplex and affect its overall conformation and stability in a site-dependent fashion with different ions, specifically contingent on the number of Gs per quartet. We demonstrated that by replacing only the terminal guanine G1 or G3 in the repeats, i.e., (IGG)4 or (GGI)4, quadruplex formation is well retained, although the structure is less stable and goes through a potential conformational change during melting. The replacement of the central guanine G2 with inosine, as in (GIG)4, does not contribute to quadruplex formation. Two or three inosines disfavor quadruplex formation. Sequences with a central inosine and two or more inosines in the repeats, (GII)4, (IGI)4, (IIG)4 and (III)4, do not show any specific quadruplex fingerprints in our CD data. A simple model is that terminal guanine lesions can be compensated by the other two G-quartets and stabilized by Na+ or K+ ions, while this is not the case with inosine substitution at the central guanine or at more than one guanine.
Furthermore, it has been previously reported that both extracellular and intracellular Na+ and K+ cations can facilitate hydrophobic stacking of two quartets by reducing the electronic repulsion induced by central oxygen atoms. Therefore, the polarity and the relative orientations of stacking quartets can modulate spectral outcomes (Debmalya et al., 2016). The biological significance of these structures is linked to their stability and the extent of their interaction with telomerase (Blackburn et al., 1997). Based on our UV-melting results, specifically for the native and central inosine-modified sequences, the structures formed in K+ generally have higher stability than the ones in Na+, which is consistent with previous literature (Largy et al., 2016; Zaccaria et al., 2016). However, this is not the case when two terminal Gs are mutated to I, which might be critical considering that the stability of G-quadruplexes can also influence their interactions with telomerase. Less stable quadruplexes allow telomerase to reform Watson-Crick pairing with the RNA template, leading to further destabilization (Wang et al., 2011; Carrino et al., 2021). With an expanded base pairing preference (inosines can base pair with T, C and A), inosine can contribute significantly to this structural destabilization and consequently to duplex formation. In summary, our current study demonstrates that G-quadruplexes can be reorganized, destabilized or structurally transformed by inosine substitutions, all of which could affect the binding affinity of small-molecule drugs targeting G4.

FIGURE 1 Structures of dI and dG (A). Diagrammatic representation of the telomeric repeat sequence forming a G-quadruplex with antiparallel, parallel, or hybrid topology (B) and the inosine-substituted oligonucleotides used in the folding studies (C).

FIGURE 3 UV melting profiles of telomeric DNA and its analogs. Absorbance at 290 nm as a function of temperature in either 100 mM Na+ (A) or K+ (B) ion conditions.

FIGURE 4 Comparison of CD melting spectra of telomeric DNA and its terminal modified analogs (GGG)4, (IGG)4, (GGI)4. Topological configuration of DNAs reported at wavelengths of 200 nm-300 nm, with mean residue ellipticity as a function of temperature in the presence of 100 mM Na+ (A-C) and 100 mM K+ (D-F). The intensity of ellipticity has units of degree × cm² × dmol⁻¹.

TABLE 1
Results from CD spectroscopy and UV melting experiments for the native telomeric DNA and its inosine modified analogs used in this study.