Investigations Into Chemically Stabilized Four-Letter DNA for DNA-Encoded Chemistry

DNA-encoded libraries are a prime technology for target-based small-molecule screening. Native DNA, used as the genetic barcode of each compound, is chemically vulnerable under many reaction conditions. DNA barcodes composed of pyrimidine nucleobases, 7-deazaadenine, and 7-deaza-8-azaguanine have been investigated, both experimentally and computationally, for their suitability for encoded chemistry. These four-letter barcodes were readily ligated by T4 ligation and amplified by Taq polymerase, and the resulting amplicons were correctly sequenced. Chemical stability profiling showed superior stability compared to native DNA, though higher susceptibility to depurination than a three-letter code based on pyrimidine DNA and 7-deazaadenine.


Table of Contents
General methods and materials
Computational details
Calculation of tautomer populations
Synthesis of chemically stabilized nucleoside phosphoramidites A-C
NMR spectra
Chemical stability screening of DNA barcodes
Representative procedures
HPLC traces and MALDI-MS spectra

Tables
Table S1 - Calculated ΔG and populations for selected tautomeric forms of guanine derivatives
Table S2 - Results (in kcal mol⁻¹ for energies) of EC-RISM and vacuum calculations
Table S3 - Results (in kcal mol⁻¹) of PCM (MP2/6-311+G(d,p)) and TI calculations
Table S4 - Stability of chemically modified oligonucleotides 5 and 6
Table S5 - Sequences of DNA oligonucleotides I-IV/IV'
Table S6 - Sanger sequencing results
Table S7 - Overview of diverse chemical reactions on CPG-bound stabilized barcode

Computational details
[...] MP2/6-311+G(d,p)/EC-RISM [7,8] for computing the solvation free energy relative to MP2/6-311+G(d,p) in the gas phase, and to rigid-body thermodynamic integration (TI) calculations in order to provide an alternative, molecular dynamics-based approach to the solvation free energy. EC-RISM calculations were performed using the computational setup developed during the SAMPL6 blind prediction challenge [9] (140³ grid points with 0.3 Å spacing, PSE-2 closure, [10] modified SPC/E water model, GAFF force field (version 1.7) [11,12] with Lorentz−Berthelot mixing rules for Lennard-Jones (LJ) interactions, and exact periodicity-corrected solute-solvent electrostatics) on the MP2/6-311+G(d,p) level of theory in Gaussian 09 rev. E.01. [13] For the TI calculations, 4167 SPC/E [14] water molecules were placed in a 50³ Å³ cube around the molecule using packmol 1.1.2.023. [15] The NAMD 2.11 [16] software was used for the simulations together with AM1-BCC charges, GAFF 1.7 [11,12] parameters for LJ interactions, and a timestep of 2.0 fs. Each setup was minimized, followed by 0.4 ns equilibration.
The TI coupling parameter λ was scaled equidistantly in steps of 0.1 between 0 and 1, first for the LJ terms using soft-core scaling and afterwards linearly, using the same step size, for the electrostatic interactions, followed by a hysteresis estimation in the reverse order. For each λ step, the system was equilibrated for 60 ps and then simulated for 0.4 ns. Langevin temperature and pressure control was used to set the temperature to 298.15 K and the pressure to 1 bar. A smooth cutoff switching scheme for LJ interactions between 10 and 12 Å and a 4th-order particle mesh Ewald algorithm (1.0 Å grid spacing) for the electrostatic interactions were employed. The water geometry was constrained using the SETTLE algorithm as implemented in NAMD.
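The λ schedule described above can be illustrated with a minimal Python sketch. It is not the script used for this work: the function names are hypothetical, and it assumes that both end points (λ = 0 and λ = 1) are simulated, that the reverse pass simply mirrors the forward pass, and that the per-window averages of ∂U/∂λ are integrated with the trapezoidal rule. No simulation data are included; the averages must be supplied by the user.

```python
def lambda_grid(step=0.1):
    """Equidistant coupling-parameter values from 0 to 1 (end points included; an assumption)."""
    n = round(1.0 / step)
    return [round(i * step, 10) for i in range(n + 1)]


def schedule(step=0.1):
    """Forward pass (LJ terms first, then electrostatics) followed by the mirrored reverse pass."""
    grid = lambda_grid(step)
    forward = [("LJ", lam) for lam in grid] + [("elec", lam) for lam in grid]
    return forward + forward[::-1]


def trapezoid(lams, dudl_means):
    """Trapezoidal estimate of the integral of <dU/dlambda> over the lambda grid."""
    return sum(
        0.5 * (dudl_means[i] + dudl_means[i + 1]) * (lams[i + 1] - lams[i])
        for i in range(len(lams) - 1)
    )


def hysteresis(forward_dg, reverse_dg):
    """Absolute difference between forward and reverse free energy estimates."""
    return abs(forward_dg - reverse_dg)
```

Under these assumptions, `schedule()` yields 11 windows per interaction type and direction. Calling `trapezoid(lambda_grid(), means)` once for the LJ stage and once for the electrostatic stage, and summing the two, gives one free energy estimate per direction.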

Calculation of tautomer populations
The strategy used for the calculation of the tautomer populations of guanine and its derivatives closely followed Refs. [1,2], on the basis of the thermodynamic cycle shown in Figure 4 of Ref. [2]. There are multiple routes to calculate the reaction free energies: two "direct" routes, where only the free energy differences of the species in solution are considered (PCM and EC-RISM), and three "indirect" routes, where the solvation free energies per species are calculated (PCM, EC-RISM, TI) and the cycle is completed with the gas-phase reaction free energies taken from CCSD(T) calculations including thermal corrections on the B3LYP/6-311+G(d,p) level. The results for the guanine, 7-deazaguanine, 8-aza-7-deazaguanine, and 8-azaguanine tautomers are presented in Table S1, all referenced to the Watson-Crick tautomer. Since all five approaches revealed similar trends, the average reaction free energies and resulting tautomer populations over all methods were calculated.
The individual free energy components are given in Tables S2 and S3; the structures are provided in machine-readable format in the accompanying zip file. As also mentioned in the main text, the uncertainties provided in Refs. [1,2] were erroneously reported to be too small by a factor of 5^(1/2) = 2.236. The uncertainties reported here are correct, and the corrected values are also given for the reference calculations [2] on canonical guanine I. This correction has no impact on the energetic rankings or the discussion of tautomer relevance.
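The conversion from relative free energies to tautomer populations can be sketched as follows. This is a minimal illustration and not the authors' code: it assumes Boltzmann weighting at 298.15 K, assumes that populations are derived from the method-averaged free energies, and uses hypothetical function names. The input values must come from Table S1; none are reproduced here.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def average_dg(dg_by_method):
    """Average the relative free energies (kcal/mol) of each tautomer over all methods."""
    methods = list(dg_by_method.values())
    return {t: sum(m[t] for m in methods) / len(methods) for t in methods[0]}


def populations(dg, temperature=298.15):
    """Boltzmann populations from free energies relative to a reference tautomer."""
    rt = R_KCAL * temperature
    weights = {t: math.exp(-g / rt) for t, g in dg.items()}
    total = sum(weights.values())
    return {t: w / total for t, w in weights.items()}


def corrected_uncertainty(reported):
    """Rescale an uncertainty that was reported too small by a factor of sqrt(5)."""
    return reported * math.sqrt(5)
```

Here `dg_by_method` maps a method label (for example `"EC-RISM direct"`) to a dictionary of tautomer labels and free energies, with the Watson-Crick tautomer set to 0.0.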
Step 1: A solution of 7-deaza-8-aza-2'-deoxyguanosine (300 mg, 1.12 mmol, 1.0 eq.) in dry methanol (6 mL) and DMF-DMA (DMF dimethyl acetal, 1.2 mL) was stirred at 50 °C for 2.5 hours. The reaction mixture was then concentrated in vacuo, and the crude material was co-evaporated twice each with dry methanol (3 mL) and diethyl ether (3 mL), dried under vacuum, and immediately used in the next step without further purification.
Step 2: To the solution of N⁶-DMF-2'-deoxy-7-deaza-8-azaguanosine (355 mg, 1.10 mmol, [...] The solution was filtered through a syringe filter and diluted with CH2Cl2 (10 mL). The organic phase was washed with saturated aq. NaHCO3 (2 x 20 mL) and brine (20 mL), then dried over anhydrous Na2SO4, filtered, and concentrated in vacuo. The product C was obtained as a colorless foam and as a diastereoisomeric mixture (448 mg, 97% yield). It was used without further purification for solid-phase oligonucleotide synthesis.
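The amounts quoted in Steps 1 and 2 can be cross-checked with a short calculation. The sketch below is illustrative only. The molecular formulas are assumptions not stated in the text: C10H13N5O4 for the nucleoside (an isomer of 2'-deoxyguanosine) and C13H18N6O4 for its dimethylformamidine-protected derivative. The atomic masses are rounded standard values.

```python
# Rounded standard atomic masses in g/mol
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}

# Assumed molecular formulas (not given in the text)
NUCLEOSIDE = {"C": 10, "H": 13, "N": 5, "O": 4}
DMF_PROTECTED = {"C": 13, "H": 18, "N": 6, "O": 4}


def molar_mass(formula):
    """Molar mass in g/mol from an element-count dictionary."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


def amount_mmol(mass_mg, formula):
    """Amount of substance in mmol for a mass given in mg."""
    return mass_mg / molar_mass(formula)


step1 = amount_mmol(300, NUCLEOSIDE)      # starting material of Step 1
step2 = amount_mmol(355, DMF_PROTECTED)   # starting material of Step 2
print(f"{step1:.2f} mmol, {step2:.2f} mmol")
```

With the assumed formulas, both values round to the amounts stated in the procedure (1.12 mmol and 1.10 mmol).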

Synthesis of
HPLC traces and MALDI-MS spectra of metal ion screens

CPG-oligonucleotide Analytical data
10mer T7De8a-dGC

Imaging of the gels was performed using the Bio-Rad Gel Doc™ XR system.

Purification of DNA by ethanol precipitation
After the first and second ligation, the DNA was precipitated by adding 1/10 volume of [...] The DNA samples were dissolved in ddH2O.

Purification of DNA by gel extraction
After the third ligation, the DNA samples were gel-extracted using the "QIAquick Gel Extraction Kit" (Qiagen) according to the manufacturer's protocol.

PCR amplification
Following the third ligation, fully encoded DNA was amplified by PCR. For this, 5 μL of gel-extracted DNA, 5 U of Taq DNA polymerase (Thermo Fisher Scientific), 1x Taq Buffer [...]
Afterwards, the CPG-bound deprotected oligonucleotide was washed three times each with 200 µL of DMF, MeOH, ACN, and CH2Cl2 and then dried in vacuo.
Step 2: CPG-bound oligonucleotide, carboxylic acid, and HATU were dried in vacuo for 15 min. Stock solutions of all reactants in dry DMF were prepared before the reaction was [...]

Ugi four-component reaction on CPG-bound oligonucleotides (RP-05) [18]
Prior to use, the CPG-bound oligonucleotide aldehyde conjugate was dried in vacuo for 15 min.

Zn(II)-mediated aza-Diels-Alder reaction on CPG-bound oligonucleotides (RP-13) [19]
CPG-bound oligonucleotide 13 and ZnCl2 were dried in vacuo for 15 min. The catalyst Yb(PFO)3 was prepared according to a published procedure. [31] Prior to the reaction, the hydrazine was extracted with dilute NH3 solution and CH2Cl2, dried over