The Rapid Methylation of T-DNAs Upon Agrobacterium Inoculation in Plant Leaves

Agrobacterium tumefaciens has been foundational in the development of transgenic plants for both agricultural biotechnology and plant molecular research. However, the transformation efficiency and level of transgene expression obtained for any given construct can be highly variable. These inefficiencies often require screening of many lines to find one with consistent and heritable transgene expression. Transcriptional gene silencing is known to affect transgene expression, and is associated with DNA methylation, especially of cytosines in symmetric CG and CHG contexts. While the specificity, heritability and silencing-associated effects of DNA methylation of transgene sequences have been analyzed in many stably transformed plants, the methylation status of transgene sequences in the T-DNA during the transformation process has not been well-studied. Here we used agro-infiltration of the eGFP reporter gene in Nicotiana benthamiana leaves driven by either an AtEF1α-A4 or a CaMV-35S promoter to study early T-DNA methylation patterns of these promoter sequences. The T-DNA was examined by amplicon sequencing following sodium bisulfite treatment using three different sequencing platforms: Sanger sequencing, Ion Torrent PGM, and the Illumina MiSeq. Rapid DNA methylation was detectable in each promoter region just 2–3 days post-infiltration and the levels continued to rapidly accumulate over the first week, then steadily up to 21 days later. Cytosines in an asymmetric context (CHH) were the most heavily and rapidly methylated. This suggests that early T-DNA methylation may be important in determining the epigenetic and transcriptional fate of integrated transgenes. The Illumina MiSeq platform was the most sensitive and robust way of detecting and following the methylation profiles of the T-DNA promoters. The utility of the methods was then used to show a subtle but significant difference in promoter methylation during intron-mediated enhancement. In addition, the method was able to detect an increase in promoter methylation when the eGFP reporter gene was targeted by siRNAs generated by co-infiltration of a hairpin RNAi construct.


INTRODUCTION
Transformation efficiency using Agrobacterium can be dependent on many variables including the Agrobacterium strain, the host plant species, and the age and tissue type being used. Additional factors influencing the efficiency of T-DNA transfer include, but are not limited to, the optical density of the Agrobacterium solution, the concentration of the vir gene inducer, acetosyringone, time in co-cultivation media, infiltration/inoculation conditions including addition of surfactants and tissue culture conditions (Clough and Bent, 1998;Jin et al., 2005;Shrawat and Lörz, 2006;Lizamore and Winefield, 2014). The optimization efforts in many different crop species such as the cereals; rice, maize, wheat, barley, sugarcane, sorghum and rye, reviewed by Yang et al. (2000), citrus hybrids Cervera et al. (1998), apple James et al. (1993), and tomato Sharma et al. (2009) demonstrate the challenges faced for transformation efficiency. Once transformed, transgenic expression itself can often be variable through influences such as the T-DNA copy number, T-DNA integration site within the genome, the regulatory sequences used to drive transgene expression and transgene silencing (Butaye et al., 2005).
Variation in transgene expression based on its integration site within the plant genome is often referred to as the 'position effect.' It had previously been suggested that T-DNA integration favors transcriptionally active euchromatin regions over transcriptionally inactive heterochromatin (reviewed in Kim and Gelvin, 2007). However, other studies suggest transgene integration may well be random in higher eukaryotes, resulting in integration within both euchromatin and heterochromatin without bias (Forsbach et al., 2003;Kim and Gelvin, 2007;Singer S.D et al., 2012). In addition to the positional effect's theory, transgene expression can be influenced when integrated near endogenous regulatory elements. These elements can include enhancers or repressors which can influence transgene expression resulting in variable expression patterns. To minimize the effects of variation in expression, matrixassociated regions (MARs) have been included in transgenic constructs. These function by insulating the transgene from condensing chromatin and maintain transcription regardless of the integration site (Singer S.D et al., 2012). When transgenes flanked by a MAR were transformed in tobacco, poplar and maize, variation in transgene expression and spatial patterns between individual transformants was still evident, overall the MAR provided limited success in improving transformation efficiency (van Leeuwen et al., 2000(van Leeuwen et al., , 2001Brouwer et al., 2002;Li et al., 2008).
As well as these documented variables that affect transgene expression and transformation efficiency, it is also possible that methylation of the T-DNA during the early stages of plant transformation may play a role. Certainly, in transformed plants, T-DNA methylation has long been shown to be present and to impair T-DNA expression (Gelvin et al., 1983;Amasino et al., 1984). The early methylation of selectable marker and transgene-of-interest promoters may confer temporary or longterm transcriptional inhibition, resulting in reduced amenability to selection and gene expression.
In plants, cytosine methylation occurs at CG, CHG, and CHH sites (where H = A, C, or T). In Arabidopsis, de novo methylation at CG, CHG, and CHH sites can be established by the DOMAINS REARRANGED METHYLTRANSFERASE 2 (DRM2) (Cao and Jacobsen, 2002). Maintenance of CG methylation is carried out by METHYLTRANSFERASE 1 (MET1) (Saze et al., 2003). De novo methylation and maintenance in the CHG context is carried out by the co-ordinated action of DRM2 and CHROMOMETHYLASE 3 (CMT3) (Cao et al., 2003). Methylation maintenance in the CHH context is either re-established by DRM2 or CMT3 via an RNA directed DNA methylation (RdDM) trigger or is carried out independently of the RdDM trigger by CMT2 within transposons (Chan et al., 2005;Law and Jacobsen, 2010;Matzke and Mosher, 2014).
Transgenes can be silenced via post-transcriptional gene silencing (PTGS) and via transcriptional gene silencing (TGS). In PTGS, the targeted transcript is degraded by RNA interference (RNAi) machinery (Matzke and Mosher, 2014) and in many cases subsequently invokes TGS (Sijen et al., 2001;Nocarova et al., 2010). TGS inhibits the transcription of target genes and is strongly associated with promoter methylation. Briefly, the pathway in plants is as follows: the plant specific RNA Pol IV first produces the ssRNA from the transgene. Through the action of RDR6, this ssRNA is made into dsRNA. The ribonuclease III activity of DICER-LIKE 3 (DCL3) then processes the dsRNA into 24 nt siRNAs which are loaded into ARGONAUTE 4 or 6 (AGO4/6). Interaction in a sequence specific manner between the loaded AGO4/6, ssRNA RNA Pol V transcripts and DRM2 results in de novo methylation of the transgenic DNA template (Matzke and Mosher, 2014).
Including certain intron sequences in transgene constructs, in or near the 5 UTR, is known to improve expression, termed intron-mediated enhancement (IME) (Mascarenhas et al., 1990). IME is thought to be due to an increase in RNA Pol II binding, the recruitment of transcription factors, or a combination of both (Morello and Breviario, 2008;Laxa et al., 2016). The orientation of the intron, position within the transcribed region, proximity to the transcription start site (TSS) and specific sequence motifs may all contribute to IME (Rose et al., 2008;Laxa, 2017). But little is known about the role promoter methylation may play in the phenomena.
Here, we investigate the methylation of T-DNA in the early stages following transient Agrobacterium infiltration of Nicotiana benthamiana leaves. The status of two transgene promoters, CaMV-35S and AtEF1α-A4 (with and without its 5 UTR intron) was followed over the first few days and weeks post-infiltration.

Constructs Used
A 35S-eGFP vector was generated by cloning a cassette containing the 35S promoter followed by eGFP and the OCS terminator into pORE 03 (Coutu et al., 2007) via the HindIII and EcoRI sites (Supplementary Figure S1). A hairpin construct, pSSU-eGFP-F-hp targeting fragment 'F' of eGFP was used to enhance PTGS in the time course (McHale et al., 2013).
The 35S promoter was excised via the HindIII and XhoI sites from the 35S-eGFP vector to maintain the same vector backbone. To generate the 5 UTR intron containing construct, the 1417 nt sequence upstream of the AtEF1α-A4 (AT5G60390.1) gene to the TSS was isolated using the primers EF1a-4_Prom_GA_F and EF1a-4_Prom_+Int_GA_R. To generate the 5 UTR intronless version, a reverse primer EF1a-4_Prom_-Int+CT_GA_R was used in combination with EF1a-4_Prom_GA_F. Polymerase chain reaction (PCR) fragments were introduced into the digested vector via Gibson Assembly (NEB) and transformed into electrocompetent DH5α using standard molecular techniques. Primer sequences are presented in Supplementary Table S1.

Transformations and Preparation of Agrobacterium tumefaciens GV3101 (MP90)
Sequence verified plasmids were transformed into electrocompetent Agrobacterium tumefaciens GV3101 (MP90) using standard molecular techniques (Wise et al., 2006). A single colony was then grown in 5 mL of LB culture supplemented with the appropriate antibiotic selection for 16 h at 28 • C with shaking at 180 rpm. Then 100 µM acetosyringone (PhytoTechnology Laboratories) was added and cells returned to incubate at 28 • C with shaking at 180 rpm for 90 min. Cells were then washed twice in 5 mL of infiltration buffer (10 mM MgCl 2 , 10 mM MES hydrate, pH 5.7) supplemented with 100 µM acetosyringone. Cultures were then adjusted to an OD 600 of 0.2 for agro-infiltrations.

Agro-Infiltrations
For infiltrations, the Agrobacterium solutions were mixed in equal volume ratios prior to infiltration. The mixtures were agro-infiltrated via a 1 mL syringe onto the underside of N. benthamiana LAB (Bally et al., 2015) leaves at 4 weeks of age. The growth conditions are as described in Philips et al. (2017). For the sequencing platform comparison experiment, a single leaf was infiltrated. For the base-pair resolution experiment, three plants with three leaves each were infiltrated per time point. The plants were allowed to grow for a further 21 days for 35S-eGFP experiments and 7 days for AtEF1α-A4 ± 5 UTR intron experiments. These infiltrated leaves were collected at various time points, imaged, uniform patches cut and stored at −80 • C for DNA and RNA extractions.

Imaging
Infiltrated leaves were visualized using a Dark Reader Hand Lamp HL32T (Clare Chemical). Images were captured using a Canon EOS 550D DLSR camera using an EF-S 60 mm lens attached with a Hoya HMC O(G) amber filter blocking blue light transmission (Supplementary Figure S2). Captured images were processed using ImageJ v1.50i to determine the mean gray value indicative of eGFP fluorescence intensity as described in Wydro et al. (2006).

DNA Extractions, Precipitation, and Bisulfite Conversion
Approximately 100-200 mg of infiltrated leaf were ground under liquid nitrogen and used in DNA extractions. The NucleoSpin Plant II (Macherey-Nagel) and the DNeasy Plant Mini Kit (Qiagen) extraction kit were used for the sequencing platform comparison and the final experiments, respectively. DNA was precipitated by 1 / 10 v/v sodium acetate and 2.5x v/v ethanol and resuspend in elution buffer to a concentration of ∼110 ng/µL. Between 1250 and 1750 ng of this DNA was converted using the EpiTect Fast DNA Bisulfite Kit (Qiagen) with the 60 • C incubation extended to 20 min to achieve maximum conversion. The bisulfite converted DNA was then diluted with molecular grade water to ∼5 ng/µL and used as a template for PCR.

Sanger Sequencing
Polymerase chain reaction amplifications were carried out as follows; an initial denaturation step at 95 • C for 3 min, followed 40 cycles comprising of 95 • C for 15 s, 55 • C for 15 s, and 72 • C for 15 s, followed by a final extension for 1 min at 72 • C. Reactions contained 1x KAPA2G Robust HotStart ReadyMix (KAPA), 0.5 µM each of dBS F4.1 and dBS R7.1 primers (Supplementary Table S1), 4 µL (∼20 ng) of diluted bisulfite converted DNA as template and the volume made up to 20 µL with molecular grade water.

For Ion Torrent
Touchdown PCR amplifications were carried out as follows; an initial denaturation step at 95 • C for 3 min, followed by 10 touchdown cycles comprising of 95 • C for 15 s, 70 • C for 15 s, and 72 • C for 15 s, with the annealing temperature decreasing by 1 • C per cycle, followed by 29 cycles comprising of 95 • C for 15 s, 55 • C for 15 s, and 72 • C for 15 s, followed by a final extension for 1 min at 72 • C. Reactions contained 1x KAPA2G Robust HotStart ReadyMix (KAPA), 0.5 µM each of forward and reverse Ion Torrent fusion primers with adapters (Supplementary Table S1), 3 mM MgCl 2 , 4.5 µL (∼22.5 ng) of diluted bisulfite converted DNA as template and the volume made up to 20 µL with molecular grade water.

Extension Temperature Optimization Using Illumina MiSeq
Touchdown gradient PCR amplifications were carried out as follows; an initial denaturation step at 95 • C for 3 min, followed by 10 touchdown cycles comprising of 95 • C for 20 s, 67 • C for 20 s, and 66-75 • C for 45 s, with the annealing temperature decreasing by 1 • C per cycle, followed by 29 cycles comprising of 95 • C for 20 s, 54 • C for 20 s, and 66-75 • C for 45 s, followed by a final extension for 2 min at 66-75 • C. To reduce variability, reactions cycled at the same time. Column 4 within the Mastercycler nexus GSX1 (Eppendorf) was selected for 67.5 • C (actual 67.6 • C) and column 8 was selected for 72 • C (actual 72.

Base-Pair Resolution Experiment Using Illumina MiSeq
Touchdown PCR amplifications and cycling using EpiMark Hot Start Taq DNA polymerase (NEB) was carried out as described for the extension temperature optimization experiment with an extension temperature of 67.5 • C and scaled down to a volume of 15 µL.

NGS Library Preparation
Polymerase chain reaction amplicons were submitted to the Central Analytical Research Facility (CARF) at QUT. For Ion Torrent, the PCR amplicons were purified, pooled in equimolar amounts to prepare the library and then diluted to ∼26 pM. Sequencing was carried out on the Ion 318 Chip v2 with the Ion PGM Hi-Q View Sequencing Kit (400 base reads). For Illumina MiSeq tagmentation, the amplicons were treated with the Nextera transposome and adapters added using the Nextera XT index kit v2. The purified amplicons were then pooled in equimolar amounts and diluted to a concentration of 2 nM. The library was denatured, diluted to ∼6 pM with 10% PhiX spikein, loaded on a MiSeq flow cell and sequencing carried out on the Illumina MiSeq with MiSeq v3 reagent chemistry. For the Illumina MiSeq fusion approach, the Nextera XT indices v2 were added in a secondary PCR. The library was then treated and sequenced as described for the tagmentation procedure with the exception of 25% PhiX spike-in. All procedures were followed as per manufacturer's instructions. These data are accessible at www.benthgenome.com.

Methylation Analysis
For Ion Torrent PGM Ion Torrent reads were processed using Trimmomatic v0.32.2 (Bolger et al., 2014). The parameters included sliding window trimming of four bases and an average quality Phred score of 12 (2 and 3 dpi) and Phred score of 22 for all other time points. Reads shorter than 395 bp were excluded to ensure only full length reads were analyzed. Trimmed reads were aligned and columns containing more than 90% gaps were stripped using Geneious 11.0.2 (Biomatters, Ltd.) Methylation analysis was performed on these processed reads using the web-based tool Kismeth (Gruntman et al., 2008) and the output used to tabulate methylation percentages in all cytosine contexts.

For Illumina MiSeq -Sequencing Platform Comparison Experiment
Pair-end MiSeq reads were merged using PEAR v0.9.6.0 (Zhang et al., 2014). Residual Illumina adapter sequences were removed using Cutadapt v1.6 (Martin, 2011). These reads were then processed using Trimmomatic as described for Ion Torrent but with an average Phred score of 30. Due to the tagmentation approach full length reads of 418 bp were not possible so all reads below 250 bp were excluded. For Illumina fusion, reads below 395 bp were excluded. Processed reads were analyzed as per the Ion Torrent procedure.

For Illumina MiSeq -Base-Pair Resolution Experiment
MiSeq reads were merged as described earlier. Reads were then processed within Geneious 11.0.2 (Biomatters, Ltd.) using the BBDuk plugin with the following settings: Phred score of 25 and minimum read length of 395 bp for 35S-eGFP time courses and Phred score of 20 and minimum read length of 525 bp for AtEF1α-A4 ± 5 UTR time courses. The processed reads were then aligned, columns stripped and analyzed as per the Ion Torrent procedure. The number of total raw reads generated is presented in Supplementary Table S2. RNA Extraction, cDNA Synthesis, and qRT-PCR RNA from each replicate was extracted, reverse transcribed by PrimeScript RT reagent Kit (Perfect Real Time, TaKaRa Bio, Inc.) and processed as described in Tashiro et al. (2016). Assays were designed and executed according to MIQE guidelines (Bustin et al., 2009). PP2A and L23 as described in Liu et al. (2012) were used as normalization factors. Relative expression assays were run on the CFX384 Touch Real-Time PCR detection system (BioRad) and the data analyzed with the supplied software. Final reaction volumes were 10 µL that consisted of: 4 µL of diluted cDNA, 0.2 µM each of forward and reverse primer (Supplementary Table S1), and 1x SYBR Premix Ex Taq II (Tli RNase H Plus, TaKaRa Bio, Inc.). All liquid handling steps were performed using an Eppendorf epMotion 5075. PCR efficiency was determined via a standard curve of linearized plasmid as described by Tashiro et al. (2016). The samples, standard curve and no-template controls were repeated in technical triplicates.

Statistical Analysis
General linear model (two-way ANOVA) using combined CG, CHG, and CHH methylation was compared at each time point between the time courses. ANOVA on the level of methylation intensity was performed on the data presented in Figure 6. All tests were performed using Minitab 17 (Minitab, Inc.).

RESULTS eGFP Fluorescence and Transcript Decrease Over Time
The levels of transient eGFP fluorescence and transcript was measured over a 21-day time course after infiltrating N. benthamiana leaves with a 35S-eGFP construct. As the loss of fluorescence and transcript in transiently infiltrated plants has been implicated with PTGS (Voinnet et al., 2003), a second experiment where a hpRNA construct targeting eGFP itself was co-infiltrated to enhance PTGS. The enhanced PTGS is due to the production of dsRNA which are processed by the DICER-LIKE (DCL) proteins to generate the siRNAs needed for gene silencing (Helliwell and Waterhouse, 2003). In both time courses, the accumulation and subsequent reduction of eGFP fluorescence and relative transcript abundance is observed (Figure 1). Following infiltration, eGFP fluorescence and relative transcript abundance peaked at 3 dpi and fluorescence was FIGURE 2 | Sequencing platform comparison between Sanger sequencing, Ion Torrent PGM and Illumina MiSeq using the tagamntation approach. The percentage of methylated cytosines (% mCs) is presented for the 14 dpi 35S-eGFP+hp sample, n = number of reads used from one infiltrated leaf to build the profile after quality and length filtering.
detectable over the first week in the 35S-eGFP time course. In comparison, a reduced eGFP fluorescence and transcript abundance was observed in the 35S-eGFP+hp time course and fluorescence was undetectable at an earlier time point of 5 dpi (Figure 1).

Sequencing Platform Comparisons
We then assessed DNA methylation of the T-DNA to determine if this correlated with the loss of eGFP fluorescence and transcript observed in Figure 1. The 14 dpi sample from the 35S-eGFP+hp time course was selected and tested on different sequencing platforms to determine which would most appropriately obtain DNA methylation from a heterogeneous population of T-DNA.
Cytosine methylation was then assessed using the Ion Torrent PGM sequencer with the same bisulfite-treated DNA amplified with degenerate Ion Torrent fusion primers (Supplementary Table S1). Sequencing the amplicon produced a total of 457,552 individual sequences of various lengths. These sequences were filtered for a minimum length of 395 bp which represents the length comprising of all the cytosines in the amplicon. To assess whether the methylation profile was affected by altering filtering quality threshold, two different Phred quality scores of 12 and 22 were selected, representing a base call accuracy of 93.67 and 99.37%, respectively. The lower Phred score allowed a higher number of T-DNA amplicon sequences to be analyzed (20,010 as opposed to 6734) but did not affect the deduced cytosine methylation profile of 32% (CG), 64% (CHG), and 70% (CHH) for Phred 12 and marginally lowered the CG context to 31% for Phred 22 with the CHG and CHH contexts remaining unchanged (Figure 2 and Table 1). To improve the number of reads after filtering the Illumina MiSeq sequencing platform was then used with the same bisulfite-treated DNA amplified with degenerate primers (dBS F4.1 and dBS R7.1, Supplementary Table S1). A tagmentation approach was then used for adapter and index addition to differentiate multiple samples. A reduced 31% (CG), 58% (CHG), and 64% (CHH) cytosine methylation profile was observed using this approach compared to the profiles attained by both Ion Torrent and Sanger sequencing (Figure 2).

Comparisons Between Ion Torrent and Illumina MiSeq
As more T-DNA amplicon sequence reads are able to be costeffectively assessed by NGS rather than Sanger sequencing, a time course of the methylation profile compared only the Ion Torrent and Illumina MiSeq sequencing platforms. A similar trend in T-DNA methylation over time was observed using both NGS platforms. There was a greater increase in CG, CHG, and CHH methylation from 3 dpi to 5 dpi followed by a steady increase in accumulation to 21 dpi (Figure 3). However, for Ion Torrent data, the samples at the earlier time points (prior to 5 dpi inclusive) showed a comparatively higher level of methylation. Furthermore, the Ion Torrent time course experiment showed a correlation between the number of reads and the degree of methylation. For example, there were fewer reads with low methylated profiles in early time points compared to later time points which had a higher number of reads and increased methylation (Figure 3). There was also a high level of truncated reads on the Ion Torrent platform in these earlier time points, for example, the 2 dpi sample had 3.8% of reads over 395 bp, whereas the 14 dpi sample had 18.8% of reads over 395 bp prior to any Phred score filtering (Supplementary Table S3). A limitation of the tagmentation approach was the reduction in full length amplicon reads due to the fragmentation required for index addition to the PCR amplicon. Following quality score filtering of Q30 and selecting amplicons larger than 250 bp, only 21.6% of the generated reads were available for cytosine methylation profile analysis ( Table 1). Thus to increase the number of full length reads available for analysis, the same bisulfite-treated DNA was amplified with the degenerate bisulfite primers with adapters added as overhangs (fusion approach, Supplementary Table S1). Following quality score filtering of Q30 and selecting full length amplicon reads, 75.4% of the generated reads were now available for cytosine methylation profile analysis (Table 1). A similar 31% (CG), but higher 62% (CHG) and 70% (CHH) cytosine methylation profile was observed for the fusion approach compared to the tagmentation approach (Figure 2).
In summary, NGS platforms produce several 1000 reads that are able to be analyzed for their methylation status. The protocol allowed for barcoding and multiplexing which is otherwise laborious when sequencing individual colonies via Sanger sequencing. Having a greater sequencing depth is particularly important when a heterogeneous pool of differentially methylated T-DNAs may exist, such as that in transiently infiltrated leaf tissue. Further, we have found the Illumina fusion based approach to be most suitable and this was used in subsequent experiments.

PCR Extension Temperature Comparison
To assess whether the PCR extension temperature affected the inferred cytosine methylation profile, the extension temperature of the bisulfite-treated DNA in the KAPA2G Robust reaction was reduced from 72 • C (72.2 • C actual) to 67.5 • C (67.6 • C actual) and the amplicons visualized on an agarose gel ( Figure 4B) followed by analysis using the Illumina MiSeq fusion approach. Interestingly, irrespective of the polymerase (KAPA2G Robust or NEB EpiMark), using an extension temperature of 67.6 • C resulted in a twofold reduction in the methylation profile of all cytosine contexts when compared to the methylation profile of . Sequencing was carried out on the Illumina MiSeq with the fusion protocol, n = three biological replicates. General linear model (two-way ANOVA) followed by the Tukey test was used to compare all cytosine contexts between the two treatments at each time point. Statistically significant differences, ns not significant, * P < 0.05, * * P < 0.01, * * * P < 0.001.
Frontiers in Plant Science | www.frontiersin.org the KAPA2G Robust reaction with PCR extension taking place at 72.2 • C ( Figure 4A).
Taken together, the results from the sequencing platform comparison experiment provide confidence that the developed assay was reproducible and demonstrated methylation accumulation over time. The time course experiment was then repeated at base-pair resolution with biological replication with sequencing taking place on the Illumina MiSeq using the fusion approach. For this experiment, amplification from bisulfite converted DNA was carried out using the EpiMark Hot Start Taq polymerase (NEB) with PCR extension taking place at 67.5 • C.

Rapid T-DNA Methylation Following Agro-Infiltration
Methylation analysis in base-pair resolution of the 3 region of the 35S promoter and part of the 5 region of eGFP was carried out over a 3 week time course with collections at 2, 3, 5, 7, 14, and 21 dpi with biological replication (Figure 5). In these experiments, the methylation increased over the time course as previously observed in the sequence platform comparison experiments (Figure 3) and was detected in as little as 2 dpi. In the first week (2, 3, 5, and 7 dpi) there is a higher combined cytosine methylation profile when the hpRNA was co-infiltrated with 35S-eGFP (P < 0.01). At 14 dpi although higher, the methylation profile of the hpRNA containing time course was not significantly different to the time course without the hpRNA (P > 0.05). However, a significant increase in the combined cytosine methylation profile was evident at the 21 dpi time point in the hpRNA containing time course (P < 0.05). Overall, T-DNAs are rapidly methylated once transiently infiltrated into N. benthamiana leaf tissue. The cytosine context comparison at each time point and the methylation of individual bases is presented in Supplementary Figures S3, S4.

T-DNA Methylation From a Heterogeneous Population
A range of T-DNAs are likely to exist following agro-infiltration into N. benthamiana leaves. These include: (1) un-nicked T-DNA residing on the Ti plasmid still within the Agrobacterium, (2) nicked single-stranded T-DNA (ssT-DNA) either within the Agrobacterium or transferred to the plant cell, (3) linear double-stranded dsT-DNA (dsT-DNA) within the plant cell, (4) circularized and unintegrated dsT-DNA within the plant cell, as described by  and (5) integrated T-DNA within the plant genome. As we have isolated total genomic DNA from the agro-infiltrated patch, the deduced methylation profiles will contain a contribution from some DNA still within the Agrobacterium. To determine the extent of Agrobacterium T-DNA contribution, individual T-DNAs amplicons at each time points in the 35S-eGFP+hp experiment were analyzed for methylation intensity. The amplicon generated using the dBS F4.1 and dBS R7.1 Illumina fusion primers (Supplementary Table S1) contains 92 cytosine sites. After bisulfite conversion and PCR, individual T-DNA reads were grouped based on their number of cytosines into low (0, 1, and 2), medium  Table S1) contains 92 cytosine sites.
(3-30 and 31-60), and high (61-92), which is indicative of the intensity of cytosine methylation. Individual T-DNA reads with zero to low levels of cytosines most likely originate from the T-DNAs either within the Agrobacterium or those being transferred into the plant. Analysis from the hpRNAcontaining time course was used as the hairpin will expedite gene silencing such that once in the plant, methylation of the T-DNAs is more likely to occur. Therefore, this time course rather than 35S-eGFP by itself will better serve to indicate the maximum level of Agrobacterium-retained T-DNA in the heterogeneous population.
During the early stages following agro-infiltration, 36 and 32% of all T-DNA reads had zero methylation at 2-and 3dpi, respectively. Interestingly, at the later time points (5, 7, 14, and 21 dpi) a consistent level of 22, 19, 20, and 21% of zero methylation is observed, respectively (P > 0.05, Figure 6) and this probably reflects the contribution to the analysis from T-DNA within Agrobacterium. Importantly, at these same time points, the percentage of highly methylated reads (containing between 61 and 92 methylated cytosines) steadily increases from 4 to 27% (Figure 6). This demonstrates a continuing methyltransferase activity within the plant rather than a loss of T-DNAs still retained by the Agrobacterium.

Inclusion of AtEF1α-A4 5 UTR Intron Results in IME and Decreases Promoter Methylation
Intron-mediated enhancement was seen with increased eGFP fluorescence and relative transcript abundance when under the control of the AtEF1α-A4 promoter containing the 5 UTR intron compared to the intronless construct (Figure 7). The mean relative gray value, indicative of eGFP fluorescence, was threefold higher at the first time point (2 dpi) with the intron-containing construct ( Figure 7B) and increased to nearly fourfold by 4 dpi (Figure 7B). At 7 dpi, there was a decrease in eGFP fluorescence as seen with the 35S-eGFP time course experiment (Figure 1). An intron-spanning RT-PCR confirmed that the 5 UTR was correctly processed (Supplementary Figure S5). qRT-PCR was performed to test whether an increase in eGFP fluorescence was due to an increase in mRNA transcript abundance. Comparatively, there was a higher eGFP transcript abundance in the 5 UTR intron containing construct at all time points tested ( Figure 7C).
To test whether IME correlated with a decrease in methylation, analysis of the promoter involving biological replication was carried out over a 1 week time course with collections at 2, 3, 4, 5, and 7 dpi (Figure 8). As observed in our 35S-eGFP time course experiments (Figure 5), T-DNA methylation of the promoter was detected after agro-infiltration into the plant. There was a significant difference in the combined cytosine contexts (P < 0.05) between the 5 UTR intron-containing and intronless promoters at the 2, 3, and 4 dpi time points. This indicates that the inclusion of the 5 UTR intron, in addition to IME, causes a reduction in the level of promoter methylation. At 5 and 7 dpi, there was no significant difference (P > 0.05) in the methylation profile of the promoters between these two treatments. The cytosine context comparison at each time point and the methylation of individual bases is presented in Supplementary Figures S6, S7.

Sequencing Platform Comparisons
Three different sequencing platforms were compared in this study with the same bisulfite-treated DNA sample used from the infiltrated N. benthamiana leaves. Sequencing was carried out at different times over the course of the comparison experiment with the sample stored at −80 • C to ensure viability. The methylation profile, regardless of the platform used, was comparable when PCR extension took place at 72 • C, indicating the reproducibility of the bisulfite-treated DNA sample on different platforms (Figure 2). The total number of reads in each experiment varied as this was dependent on the total number of unique samples pooled in the sequencing run.
Degenerate bisulfite primers targeting the T-DNA, without any overhangs necessary for NGS indexing, were first utilized and the methylation profile assessed via Sanger sequencing. The results confirmed that the T-DNA is methylated in the early stages following Agrobacterium mediated infection. As the analysis was from transiently infiltrated leaves, a heterogeneous population of differentially methylated T-DNAs are likely to exist. Thus to ensure sufficient read depth as well as reduce the costs of analyzing multiple samples, NGS was employed and first explored using the Ion Torrent PGM.
Sequencing bisulfite amplicons from the 35S-eGFP+hp time course on the Ion Torrent PGM showed an increase in methylation over time ( Figure 3A). However, at earlier time points of 2 and 3 dpi a large proportion of reads were truncated (Supplementary Table S3), most likely due to the read terminating when it approached a homopolymer track (Quail et al., 2012). This was confirmed when the fragmented reads were assembled against the reference. In the assembly, many of these reads terminated at homopolymer regions now created by bisulfite conversion of unmethylated DNA. This was especially noticeable at the region of the TATA box (data not presented). Similarly, termination of reads has been observed FIGURE 8 | Methylation analysis comparison in the time course of the AtEF1α-A4 promoter with (left panel) or without the 5 UTR intron (right panel) regulating the expression of eGFP. Percentage cytosine methylation is presented on the y-axis and individual cytosine positions in the 546 bp region indicated by the black bar on the x-axis. Sequencing was carried out on the Illumina MiSeq with the fusion protocol, n = three biological replicates. General linear model (two-way ANOVA) followed by the Tukey test was used to compare all cytosine contexts between the two treatments at each time point. Statistically significant differences, ns not significant, * P < 0.05, * * P < 0.01, * * * P < 0.001. Note the difference in the scale used for the y-axis at 2 and 3 dpi compared to 4, 5, and 7 dpi. by Quail et al. (2012) when sequencing a highly AT-rich genome using the Ion Torrent platform.
To compare across the entire length of the amplicon, full length sequences generated by the Ion Torrent were selected by excluding all reads below 395 bp and with Phred scores equal to or below Q22. This inevitably biased reads, favoring a methylated state (lower homopolymers) especially during the earlier time points as the majority of reads were truncated. This bias is noticeable with the number of reads correlating to the level of methylation. For example, at 2 dpi, after filtering and employing a lowered Phred score of Q12 only 1143 out of 290,883 reads were available for analysis (0.39%) compared to 6734 out of 457,552 reads after filtering at 14 dpi with a higher Phred score of Q22 (1.47%). The methylation profile of this later time point at 14 dpi was then compared using a lower Phred score of Q12. Lowering this parameter resulted in 20,010 sequences (4.37% of available reads). This increase in the total number of reads did not result in a different observed methylation profile with 31% (CG), 64% (CHG), and 70% (CHH) cytosine methylation with Q22 compared to 32% (CG), and unchanged CHG and CHH cytosine methylation with Q12 (Figure 2). However, increasing the number of available reads by lowering the Phred score at earlier time points resulted in a lower methylation profile with sequences having a higher proportion of unmethylated cytosines (data not presented). As such, the Ion Torrent PGM is not suitable for highly unmethylated samples due to its bias of read termination at homopolymers. In addition, we found that less than 5% of all reads were available for methylation analysis after length and quality filtering ( Table 1).
To utilize a higher proportion of the raw reads the bisulfite amplicons were sequenced on the Illumina MiSeq. Here two approaches were undertaken differing in the procedure used for adapter and index addition. Sequence reads originating from the MiSeq had a mean Phred score of approximately Q30 and this was used as a selection criterion. The first approach, tagmentation, randomly fragmented the bisulfite amplicons and then incorporated the Illumina Nextera XT v2 indices. As a result, the full length PCR product was only sequenced if tagmentation occurred at the product ends. Using this approach, raw reads totalled 374,865 sequences from the 14 dpi+hp sample, which were reduced to 80,940 (22% of available reads) after filtering with a minimum length of 250 bp out of the 418 bp (60%) and a Phred score of Q30 (Table 1).
This tagmentation approach was compared to the Ion Torrent sequencing platform for all time points in the 35S-eGFP+hp time course (Figure 3). The observed methylation profile trend is similar for both platforms showing an increase in methylation over the time course. Based on the MiSeq tagmentation results, there was no correlation between the number of reads and level of methylation as seen with the Ion Torrent. This indicates that sequencing on the MiSeq, unlike the Ion Torrent, did not cause the sequence to prematurely terminate at earlier time points due to homopolymers.
There is however a reduced methylation profile in the earlier time points (2-5 dpi) when analyzed using the Illumina platform. This may be because the highly unmethylated (AT-rich) sequences, previously excluded due to sequence fragmentation and quality score filtering, are now available for analysis. This difference is not seen at the later time points (7-21 dpi) most likely due to a higher proportion of highly methylated T-DNAs present in the amplicon. The benefit of using Tagmentation is that degenerate primers, already validated via Sanger sequencing, may be utilized when full length sequences are not needed. Stably integrated transgenes where the methylation marks are fixed will be suitable for such a sequencing approach.
To attain full length reads, a fusion approach involving adapters included on the degenerate primers as overhangs was used. The unique indexes were then added to the adapter sequences in a second round of PCR. Using this approach, the raw reads from the 14 dpi+hp sample totalled 19,942 sequences. After filtering for full length reads (395 bp) and a Phred score of Q30 there were 15,027 sequences (75%) available for methylation analysis ( Table 1). This result showed the greatest improvement over both the Ion Torrent and tagmentation approach. At the 14 dpi+hp time point a reduced methylation profile of 31% (CG), 58% (CHG), and 64% (CHH) was observed in the tagmentation protocol when compared to the fusion protocol [31% (CG) 62% (CHG) and 70% (CHH) cytosine methylation]. This reduction was only evident in the CHG and CHH contexts. These differences could be due to the selection criteria of shorter 250 bp length reads compared to 395 bp.

Extension Temperature
We determined that the intensity of T-DNA methylation in transiently infiltrated N. benthamiana leaves is dependent on the PCR extension temperature. Lowering the extension temperature between 65 and 68 • C rather than the traditional 72 • C for Taq polymerase results in efficient amplification of AT-rich samples (Yang et al., 2015). As bisulfite converted DNA is similar to AT-rich samples, when we lowered the extension temperature we found PCR bands of a higher intensity similar to that observed by Yang et al. (2015) (Figure 4B). In addition, the presence of primer-dimers was also reduced and this was more prominent from samples collected at the earlier time point of 2 dpi (Supplementary Figure S8). This indicates that more, possibly unmethylated T-DNA templates (AT-rich) were being efficiently amplified in the bisulfite converted DNA. A similar methylation profile was seen by both the suppliers (KAPA and NEB) at 67.6 • C indicating that the extension temperature and not the PCR components affect the deduced methylation profile. In addition, the EpiMark Hot Start Taq polymerase (NEB), specifically formulated to amplify bisulfite converted templates has the recommended extension temperature of 68 • C and as such was not tested at 72 • C. These results indicate that there is a PCR bias toward methylated DNA sequences when extension takes place at 72 • C when a range of differentially methylated T-DNA sequences exist within the DNA isolated from the infiltrated leaf patch. It is therefore likely that there may be fewer methylated T-DNAs in the transiently infiltrated leaf patch than the amplification profile generated from a 72 • C PCR extension would suggest. To get a true sense of plant DNA methylation with reduced bias, methylation analysis of the sample should be performed without bisulfite conversion or PCR. With the improvement in long read NGS sequencing algorithms such as those employed by the Oxford Nanopore and Pacific Biosciences PacBio, such methylation analysis of plant DNA can one day accurately take place (Simpson et al., 2017). Although these algorithms are currently in their infancy, and are therefore inaccurate in differentiating methylated and unmethylated targets from a heterogeneous population (Barros-Silva et al., 2018). Until such a time, we suggest that lowering the PCR extension temperature following bisulfite conversion be taken into consideration when whole genome bisulfite sequencing (WGBS) is employed. This would be particularly relevant when treatments targeting a change in global methylation are induced such that a range of differentially methylated targets exist.

T-DNA Methylation Evident Over a Time Course Regardless of the Experiment Undertaken
Depending on the length of time with the vir gene inducer, acetosyringone, prior to co-cultivation, transcription from T-DNAs can occur within 24 h (Narasimhulu et al., 1996). During this time the T-DNA must be excised from the Agrobacterium Ti plasmid, be transported into the plant cell, enter the nucleus and become double-stranded T-DNA (dsT-DNA) prior to transcription at a detectable level (Narasimhulu et al., 1996). Typically, transcription from infiltrated T-DNAs is known to peak around 2-3 dpi (Goodin et al., 2008;Bally et al., 2018) and this was also evident in our 35S-eGFP time course experiment ( Figure 1C). Thus, if dsT-DNA can act as the substrate for transcription it could also act as a substrate for T-DNA methylation. Methylation of the promoter is known to disrupt the binding sites of transcription factors and RNA polymerase II (Chan et al., 2005;Brenet et al., 2011) and can lead to TGS. To determine the extent and time taken for T-DNA methylation in transiently infiltrated N. benthamiana leaves a time course assessing T-DNA methylation was carried out. T-DNA methylation was detected at 2-3 dpi in all time course experiments [35S-eGFP with and without hpRNA ( Figure 5) and AtEF1α-A4 with and without the 5 UTR intron (Figure 7)]. After 2-3 dpi, regardless of the experiment, a steady rate of T-DNA methylation accumulated over the first week post-infiltration. In the 35S-eGFP time courses this steady rate of methylation accumulation appears to plateau after 14 dpi (Figure 5). This plateau at the later time points may be due to the age of the leaf where regular biological functions, such as continued de novo and maintenance methylation, are not as active. It is already known that older N. benthamiana leaves have reduced leaf weight and protein content (Robert et al., 2013), and that in N. tabacum there is decreased potential for protein synthesis in older leaves compared to younger leaves (Buyel and Fischer, 2012).

Methylation Distribution
The methylation profiles in the transient time course experiments show that methylation in the CHH context is the most prominent, followed by CHG and then CG (Figures 2, 3). This profile is in agreement with induced gene silencing and detection taking place in the same generation and is a hallmark of RdDM (Fultz and Slotkin, 2017). However, in literature where methylation has been assessed in stably integrated T-DNAs or endogenous genes, the inverse profile is seen with CG methylation being the most prominent (He et al., 2009;Haque et al., 2010;Mishiba et al., 2010;Dalakouras et al., 2012;Dadami et al., 2013Dadami et al., , 2014Du et al., 2015). The prominence of CG methylation in stable integrations is most likely due to its symmetrical context allowing for methylation maintenance. Following DNA replication, the daughter strands in the CG and CHG context are hemimethylated allowing for it to act as a template for CG and CHG methylation to be maintained by MET1 and CMT3, respectively (Wendte and Pikaard, 2017). However, in the asymmetrical context (CHH) there is an absence of the hemimethylated template. Thus, maintenance is not carried out by methyltransferase genes (MET1 and CMT3) in these contexts and possibly another signal or trigger of asymmetrical methylation maintenance is required (Chan et al., 2005). In transiently infiltrated leaves, it appears that de novo T-DNA methylation is favored in the asymmetrical contexts possibly due to the conformational changes to the DNA (Klimasauskas et al., 1994) resulting in steric hindrance when MET1 is attempting to induce de novo methylation on both DNA strands.

Addition of hpRNA Targeting eGFP
A time course containing a hairpin RNA (hpRNA) construct targeting the eGFP transgene was included in the co-infiltration solution and compared against the eGFP transgene alone (Figure 5). The hpRNA construct was included as it was believed that it would lead to dsRNA production and accelerate the T-DNA methylation such that it could be detected in the early stages post-infiltration. These dsRNA are the substrate required by the DICER-LIKE (DCL) proteins to generate the siRNAs involved in RNA directed DNA methylation (RdDM). The addition of the hpRNA did increase de novo T-DNA methylation compared to the treatment without the hpRNA and demonstrated the sensitivity of the developed assay.
Over the first week post-infiltration, there was a reduction in eGFP fluorescence and relative transcript abundance (Figure 1) coupled with a higher methylation profile in the hpRNA containing time course (P < 0.01 Figure 5). This indicated that the dsRNA production led to siRNA generation and expedited both PTGS and RdDM. However, these effects were short lived and no significant differences were observed at the later time point of 14 dpi (P > 0.05, Figure 5). During these later time points siRNA production may have been saturated such that the dsRNA produced from the hpRNA construct was no longer having an effect. Alternatively, it may be that the hpRNA was no longer being transcribed due to itself being silenced.

T-DNA Methylation From a Heterogeneous Population
In our study, we isolated total DNA from agro-infiltrated N. benthamiana leaf patches across a time course. Each sample will be a heterogeneous population of T-DNAs from within the infiltrated Agrobacterium as well as differentially methylated T-DNAs from the plant cells. The T-DNA molecules with zero to low levels of cytosine methylation are probably largely from within the Agrobacterium as a study on A. tumefaciens strain C58 in Arabidopsis galls (predominantly bacterial rather than plant cells) estimated the cytosine methylation of T-DNA (for oncogenes IaaH, IaaM, and Ipt) to be less than 1% (Gohlke et al., 2013). During the early stages following agro-infiltration (2 and 3 dpi) a higher proportion of individual T-DNAs possess zero methylation than those at later time points (5, 7, 14, and 21 dpi) (Figure 6). However, a consistent level of low methylation (0, 1, and 2 methylated cytosines) from 5 to 21 dpi (P > 0.05, Figure 6) is coupled with a transition from medium to highly methylated T-DNAs (Figure 6). Taken together with the results presented in Figure 5 and Supplementary Figures S3, S4, this indicates that individual T-DNAs that are methylated at the earlier time points continue to accumulate further methylation as time progresses rather than the apparent increase in methylation being due to loss of unmethylated T-DNA from the declining levels of Agrobacterium in the samples.

Inclusion of a 5 UTR Intron
In studies by Meng et al. (2003Meng et al. ( , 2006) the inclusion of the first untranslated exon and intron of the maize ubiquitin1 promoter improved transgene stability in a large proportion of transgenic barley. Two siblings from a homozygous T 3 population were identical in copy number and integration site. However, one sibling was silenced while the other remained transcriptionally active. Upon investigation, the silenced line had an increased and different methylation pattern at the 3 end of the promoter. Thus, in this case, intron inclusion may not provide complete methylation protection to the promoter region and screening for stable lines may still need to take place albeit at a much lower frequency and the transcriptionally active line shows that the positive effects imparted by the intron is maintained across generations. Transgenes with introns within the coding sequence (CDS) tended to more stable, delayed onset of local and systemic silencing and produced less siRNAs (Vermeersch et al., 2010;Dadami et al., 2013).
Introns that provide enhancement to the transgenes increase transcription and promote translation (Rose, 2004). In our study, we included a 5 UTR intron in the transgene and hypothesized that it would limit the spread of methylation to the promoter (transitivity) originating from the CDS by acting as a physical spacer. The elongation factor 1-alpha (EF1α) family in Arabidopsis consists of four genes (AtEF1α-A1 through to AtEF1α-A4) and all possess a 5 UTR intron. The A1 and A3 promoters with the 5 UTR introns are required for high levels of gene expression (Curie et al., 1991(Curie et al., , 1993Chung et al., 2006). However, the A4 promoter is largely uncharacterized to date and is the highest expressing member based on EST and cDNA information (TAIR). The inclusion of the A4 promoter and its 5 UTR intron resulted in a significant reduction in methylation of the promoter at 2, 3, and 4 dpi (P < 0.05, Figure 8). A reduction was also observed at 7 dpi but was not significantly different (P > 0.05). The reduction in promoter methylation indicates that the 5 UTR intron may be acting as a physical barrier to impede transitivity. Alternatively, intron splicing may compete with the DNA methyltransferases thereby reducing T-DNA methylation. When the pre-mRNA is bound in the spliceosome complex to undergo intron splicing it is not available as a template for RNA-DEPENDENT RNA POLYMERASE 6 (RDR6) involved in the PTGS pathway to produce siRNAs (Christie et al., 2011). Thus, it is also not available to produce the 24 nt siRNAs needed for RdDM. In addition, the reduction in siRNA production against the transgene due to intron splicing could also account for the increase in fluorescence and eGFP transcript (Figure 7).

CONCLUSION
One of the limitations in producing GM crops is the variability in transformation efficiency and transgene expression. Several methods have been explored to overcome this limitation as briefly discussed in the introduction. However, gene silencing still remains an issue. Because of this, large numbers of plants need to be screened to find the elite lines with stable and consistent transgene expression. This is possible for multinational companies but may be limiting for smaller research groups working on recalcitrant crops with limited plant growth space, and modest programs and budgets (Butaye et al., 2005;James, 2013).
Here we focused on DNA methylation as one of the factors contributing to the variability in producing GM crops. We report that Agrobacterium delivered T-DNAs undergo de novo methylation in the early stages following transformations into the leaves of N. benthamiana and as such may be responsible for the difficulties in obtaining consistent expression after transformation. For detecting methylation at base-pair resolution, we recommend the use of NGS to attain suitable read depth as a heterogeneous population of differentially methylated T-DNAs are likely to exist in these leaves. This transient assay, using the Illumina MiSeq platform, shows that the rate of de novo methylation can vary depending on the infiltration treatments and on the PCR extension temperature following bisulfite treatment. Mitigating de novo methylation of the T-DNA prior to integration is hypothesized to improve transformation efficiency by avoiding TGS. Thus, using the developed assay, several treatments and transgenic constructs can be rapidly and cost-effectively tested to ascertain the ideal conditions resulting in reduced de novo methylation as a way to improve transformation efficiency and potentially CRISPR-Cas9 editing in plants.