Genome Editing in Trees: From Multiple Repair Pathways to Long-Term Stability

The CRISPR technology continues to diversify with a broadening array of applications that touch all kingdoms of life. The simplicity, versatility and species-independent nature of the CRISPR system offers researchers a previously unattainable level of precision and control over genomic modifications. Successful applications in forest, fruit and nut trees have demonstrated the efficacy of CRISPR technology at generating null mutations in the first generation. This eliminates the lengthy process of multigenerational crosses to obtain homozygous knockouts (KO). The high degree of genome heterozygosity in outcrossing trees is both a challenge and an opportunity for genome editing: a challenge because sequence polymorphisms at the target site can render CRISPR editing ineffective; yet an opportunity because the power and specificity of CRISPR can be harnessed for allele-specific editing. Examination of CRISPR/Cas9-induced mutational profiles from published tree studies reveals the potential involvement of multiple DNA repair pathways, suggesting that the influence of sequence context at or near the target sites can define mutagenesis outcomes. For commercial production of elite trees that rely on vegetative propagation, available data suggest an excellent outlook for stable CRISPR-induced mutations and associated phenotypes over multiple clonal generations.

INTRODUCTION
CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats)-based genome editing is rapidly becoming the system of choice for targeted mutagenesis in a growing variety of woody species, including forest trees. Forest trees are an invaluable commodity, providing fiber, energy, materials and climate buffering to the global community, and CRISPR has the potential to further enhance these important traits. Previous-generation methods for gene silencing in plants rely on expression of antisense RNAs, small interfering RNAs or microRNAs to base-pair with target mRNAs for degradation, often with unpredictable and unstable outcomes (Alessandra and Shihshieh, 2010). The specificity and efficiency of CRISPR for targeted DNA mutations, and the ease of adoption in virtually any species, are behind the current revolution in genomic editing (Jiang and Doudna, 2017). Meanwhile, CRISPR's popularity is driving the discovery and characterization of new CRISPR-associated (Cas) endonucleases with novel properties that make the system even more versatile (Burstein et al., 2016; Murovec et al., 2017). This review will focus on recent applications of CRISPR in woody species, with a special focus on forest trees, the mutation patterns observed at target sites, and the long-term stability of CRISPR/Cas9-edited outcomes.

CRISPR APPLICATIONS IN WOODY SPECIES
Phytoene desaturase (PDS) has been a popular marker for evaluating CRISPR in new study systems (Table 1). Its mutation impairs carotenoid biosynthesis and causes chlorophyll photobleaching, allowing for visual assessment of knockout (KO) efficiency. CRISPR/Cas9-induced albino mutants have been reported in poplar (Fan et al., 2015), citrus (Jia and Wang, 2014; Zhang et al., 2017), apple (Nishitani et al., 2016), grape (Nakajima et al., 2017), cassava (Odipio et al., 2017), coffee (Breitler et al., 2018), and kiwifruit. Successful implementation of CRISPR has also been demonstrated by targeting potential developmental and biosynthesis pathway genes in grape (Ren et al., 2016) and the tropical tree Parasponia andersonii (van Zeijl et al., 2018; Table 1). New CRISPR reagents have been developed to expand genome editing capabilities. One such reagent, SaCas9 from Staphylococcus aureus, was shown to effectively generate mutations in Duncan grapefruit (Jia et al., 2017a). Compared to the most commonly used SpCas9 from Streptococcus pyogenes, SaCas9 is considerably smaller and recognizes a distinct 5′-NNGRRT protospacer adjacent motif (PAM) sequence (versus 5′-NGG of SpCas9). Using alternative CRISPR/Cas systems such as SaCas9 can increase the number of potential guide-RNA (gRNA) target sites, especially in AT-rich regions, which may facilitate promoter editing.
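The difference between the two PAMs can be made concrete with a small scan for candidate target sites. The following is a minimal sketch, not a gRNA design tool: the function names are hypothetical, a 20-nt spacer is assumed for both nucleases (SaCas9 is often used with slightly longer spacers), and no scoring of specificity or efficiency is attempted.

```python
import re

# IUPAC codes needed for the two PAMs discussed in the text.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]", "R": "[AG]"}
PAMS = {"SpCas9": "NGG", "SaCas9": "NNGRRT"}


def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_targets(seq, pam, spacer_len=20):
    """Return (strand, spacer, pam) for every protospacer directly 5' of a PAM.

    spacer_len=20 is an assumption made for illustration only.
    """
    pam_re = "".join(IUPAC[base] for base in pam)
    # A lookahead is used so that overlapping sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})(%s))" % (spacer_len, pam_re))
    hits = []
    for strand, s in (("+", seq.upper()), ("-", revcomp(seq.upper()))):
        for m in pattern.finditer(s):
            hits.append((strand, m.group(1), m.group(2)))
    return hits


if __name__ == "__main__":
    at_rich = "A" * 20 + "TTGAAT"  # toy AT-rich sequence, not a real promoter
    for name, pam in PAMS.items():
        print(name, len(find_targets(at_rich, pam)))
```

In this toy AT-rich sequence the SaCas9 PAM yields a candidate site where the SpCas9 PAM yields none, which is the situation the paragraph above describes for promoter regions.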
Besides proof-of-concept studies, the CRISPR/Cas9 system has been used to develop disease-resistant fruit trees with promising results (Table 1). The devastating citrus canker disease is caused by Xanthomonas citri subsp. citri (Xcc) through effector activation of a canker susceptibility gene LOB1 of the Lateral Organ Boundaries transcription factor family (Hu et al., 2014). When the LOB1 promoter was targeted by CRISPR/Cas9 to disrupt the effector-binding element, canker symptoms after Xcc infection were reduced in Duncan grapefruit (Jia et al., 2016) and Wanjincheng orange (Peng et al., 2017). CRISPR-KO of LOB1 also increases Xcc resistance in Duncan grapefruit (Jia et al., 2017b). KO mutations in other susceptibility genes for powdery mildew and fire blight disease have also been achieved in grape and apple protoplasts, respectively (Malnoy et al., 2016), potentially allowing for the regeneration of disease-resistant plants. Several WRKY transcription factors involved in defense regulation have also been targeted for mutagenesis. CRISPR-KO of two positive regulators PtrWRKY18 and PtrWRKY35 compromised resistance to Melampsora rust in Populus, whereas KO of grape VvWRKY52 increased resistance to necrotrophic Botrytis cinerea (Wang X. et al., 2018).
To date, the greatest progress in woody species has been made with poplar, the first stably transformed tree to be genome-edited by CRISPR with high efficiency (Zhou et al., 2015). Allele-sensitive bioinformatics resources to facilitate genome editing in heterozygous species quickly followed, again based on the poplar system. The majority of CRISPR studies in poplar have targeted phenylpropanoid metabolism or cell wall traits (Table 1). Mutations of individual 4-coumarate:CoA ligase (4CL) genes decreased the levels of structural (lignin) or non-structural (proanthocyanidin) phenylpropanoid polymers. CRISPR-KO of MYB transcription factors either increased (PtoMYB156 and PtrMYB57) or decreased (PtoMYB115 and PtoMYB170) phenylpropanoid flux, affecting in turn lignin deposition (PtoMYB156 and PtoMYB170) or flavonoid accrual (PtrMYB57 and PtoMYB115), respectively (Wan et al., 2017; Wang et al., 2017; Xu et al., 2017; Yang et al., 2017). Secondary cell wall synthesis was also compromised by CRISPR-KO of a brassinosteroid biosynthetic gene, supporting a role for brassinosteroids in wood formation (Shen et al., 2018). CRISPR-KO of BRANCHED1-1 (BRC1-1) and BRC2-1, both belonging to the TCP family of transcription factors, resulted in altered shoot architecture, and revealed an additional role of BRC2-1 in leaf development not previously reported for its Arabidopsis ortholog (Muhr et al., 2018). A recent study reported successful mutation of essential flowering genes in both male and female poplar genotypes (Elorriaga et al., 2018). The study also collated a large mutation dataset from over 500 transgenic events (Elorriaga et al., 2018), which should prove of value to understanding CRISPR/Cas editing patterns (see below). Although phenotypic evaluation of the flowering traits will require follow-on and multiyear studies in the field, the work underscores a powerful social application of CRISPR in containment of transgenic trees.

DIVERSE INDEL PROFILES INDICATIVE OF cNHEJ, MMEJ, AND TMEJ ACTIVITIES
Small frameshift indels are the most common repair outcomes of single gRNA-directed Cas9 cleavage in trees, with 1 bp insertions (+1), especially +T and +A, predominant in many cases, similar to findings from other plants and animals (Bortesi et al., 2016). However, considerable variations and case-dependent repair outcomes are also noted, suggesting potential influences of target site sequences and/or their genomic contexts (Xu et al., 2015). Meta-analysis of mutation patterns across published tree studies is necessary to gain further insight, but that is made difficult by different reporting formats (not all studies report multi-allele data), and by the use of detection methods that differ in their sensitivity, accuracy, and allele discrimination (Sentmanat et al., 2018). We combined amplicon sequencing data from CRISPR-edited P. tremula x alba INRA 717-1B4 (717) generated in our lab (Zhou et al., 2015) with the large 717 dataset from Elorriaga et al. (2018), along with manual inspection of other published tree studies for mutation profile analysis (Figure 1). In aggregate, +1 insertions constituted the greatest fraction of mutation types, followed by −1, and then −2, although stereotyped repair patterns are evident (Figure 1A). Interestingly, insertions were limited to +1 and +2 across all sites, whereas deletions spanned a much broader size range, though with decreasing frequencies for larger deletions. Small mutagenic indels have often been ascribed to the classical non-homologous end-joining (cNHEJ) DNA repair pathway, but recent studies have demonstrated involvement of the alternative end-joining (alt-EJ) pathway as well (Rodgers and McVey, 2016). It now appears that cNHEJ contributes to the most common +1 insertions and other small indels, whereas larger indels are due to alt-EJ. This is based on studies where impaired cNHEJ drastically changed the repair outcomes of CRISPR/Cas9 in yeast, human cells and Arabidopsis, such that the typically predominant +1 insertions as well as other small (<3 bp) indels were greatly reduced, while rates of large indels increased, apparently independently of cNHEJ (van Overbeek et al., 2016; Shen et al., 2017; Lemos et al., 2018). In yeast, the vast majority of the +1 insertions from cNHEJ were templated from 1 bp 5′ overhangs at the Cas9 cleavage site (fourth base from the PAM), and dependent on POL4, a low-fidelity X-family DNA polymerase with terminal transferase activity (Lemos et al., 2018). POL4-deficient yeast also lost +2 and +3 insertions, many of which are homonucleotides and apparently templated from the Cas9 cleavage site as well (Lemos et al., 2018). Templated insertions could also explain the majority of +1 events in a large Cas9-induced indel dataset from human cells (van Overbeek et al., 2016), suggesting a conserved mechanism underlying +1 insertions in CRISPR/Cas9-edited organisms (Lemos et al., 2018). In the combined 717 dataset, the majority of +1 insertions were +T, as reported in many CRISPR studies. However, evidence in support of templated +1 insertions was weak, and appeared to be target site-dependent (Figure 1A). Clearly, much more data with greater target site diversity and coverage are necessary before a conclusion can be drawn, but such data from trees will require significant and perhaps community-wide efforts. Regardless, the small target site collection used in our analysis supports involvement of more than one mechanism for the commonly observed +T insertions, at least in Populus.
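The templated-insertion idea lends itself to a simple check on amplicon data. The sketch below assumes a deliberately simplified reading of the model described above: a +1 insertion is called "templated" when the inserted nucleotide matches the fourth base upstream of the PAM in the protospacer. The function names and the example sequence are hypothetical, and real datasets would also need alignment and strand handling that are omitted here.

```python
def classify_plus_one(protospacer, inserted_base):
    """Classify a +1 insertion under the simplified templating assumption.

    protospacer: target sequence written 5'->3', ending immediately before the PAM.
    inserted_base: the single nucleotide gained at the cleavage site.
    """
    protospacer = protospacer.upper()
    inserted_base = inserted_base.upper()
    if len(inserted_base) != 1 or inserted_base not in "ACGT":
        raise ValueError("expected a single inserted nucleotide")
    if len(protospacer) < 4:
        raise ValueError("protospacer too short to locate the cleavage site")
    template = protospacer[-4]  # fourth base from the PAM
    return "templated" if inserted_base == template else "non-templated"


def summarize(events):
    """Tally templated vs non-templated +1 events.

    events: iterable of (protospacer, inserted_base) pairs.
    """
    counts = {"templated": 0, "non-templated": 0}
    for protospacer, base in events:
        counts[classify_plus_one(protospacer, base)] += 1
    return counts


if __name__ == "__main__":
    site = "GATTACAGATTACAGATTCA"  # hypothetical 20-nt protospacer
    print(summarize([(site, "T"), (site, "T"), (site, "A")]))
```

A tally of this kind, computed per target site, is one way to ask whether the prevalence of +T at a site is explained by templating or requires an additional mechanism.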
cNHEJ-independent repair likely involves different alt-EJ pathways, including microhomology-mediated end-joining (MMEJ), single-strand annealing (SSA), or polymerase theta (POLQ)-mediated end-joining (TMEJ) (Rodgers and McVey, 2016). Both MMEJ and SSA require end resection or unwinding to expose short homologous sequences for annealing (up to ∼20-30 bp for MMEJ and longer for SSA) and subsequent repair, and always result in deletions (Sfeir and Symington, 2015). The presence of microhomologous sequences at the deletion junctions can therefore serve as evidence of MMEJ/SSA repair. Indeed, microhomologies of 1-5 bp are readily identifiable in most of the deletion (≥5 bp) alleles we examined, but are rarely found for small deletions (3-4 bp) that might have arisen from cNHEJ (Figure 1B). MMEJ has also been associated with abnormal chromosomal translocations and inversions (Sfeir and Symington, 2015). Such modifications have been reported in several studies, including two from Populus, where two or more gRNAs were designed to target the same gene to produce large deletions (Fan et al., 2015; Elorriaga et al., 2018). Moreover, large deletions are sometimes accompanied by small insertions (Fan et al., 2015; Nakajima et al., 2017), a pattern that is characteristic of the recently discovered TMEJ pathway (Koole et al., 2014). TMEJ depends on POLQ, an error-prone A-family DNA polymerase that can extend microhomologies in a template-dependent (either in cis or in trans) or independent manner (Kent et al., 2016). TMEJ is the essential repair pathway in animal germ cells, as embryos of zebrafish polq mutants are hypersensitive to DSB-inducing treatments, with low levels of repair producing only +1 insertions (Thyme and Schier, 2016). In Arabidopsis, TMEJ is required for T-DNA integration following Agrobacterium transformation of either flowers or roots (van Kregten et al., 2016). We found evidence of in cis or in trans templated insertions in the complex indels reported for poplar and grape (Figure 1B; Fan et al., 2015; Nakajima et al., 2017), supporting an active TMEJ pathway in somatic cells of plants.

FIGURE 1 (caption, partially recovered). (A) [...] bp insertion) bar indicate the fraction that were T insertions. The fraction of templated +1 insertions that deviate from T is shown in parentheses. (B) Representative examples of different mutation types and the potential DNA repair pathway involved in each case. PAM sequences are bold underlined, triangles denote predicted Cas9-cleavage sites, indels are shown in red, yellow-shaded regions denote microhomologies, and gray sequences in (6) and (7) were appended from P. tomentosa cDNA (GenBank accession KC954700) and P. tremula x alba 717 genomic sequences, respectively. Note, the region in (7) [...]

Examination of published mutation profiles of Populus and other tree species suggests differential involvement of multiple repair pathways, probably with cNHEJ contributing to +1, +2 and small (1-4 bp) deletions, MMEJ (and SSA) to larger deletions, and TMEJ to complex indels (Figure 1B). The varying dependency of these pathways on sequence contexts (microhomologies) likely underpins the non-random nature of CRISPR/Cas9 repair outcomes reported in many studies, including trees (van Overbeek et al., 2016; Vu et al., 2017; Elorriaga et al., 2018). Incorporation of microhomology modeling into the gRNA design workflow (Bae et al., 2014; Segar et al., 2015) should enable prediction of potential DNA repair outcomes for informed selection of target sites.
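To illustrate how microhomology at a deletion junction can be detected, the following minimal sketch measures how many bases at the deletion boundaries are shared between the two flanks, then applies the rough size thresholds discussed above. It is an illustration only: the function names, the 5 bp cutoff as a parameter, and the pathway labels are assumptions for this example, not a published classifier or the method of the cited design tools.

```python
def microhomology_length(ref, start, length):
    """Microhomology (bp) at the junction of a deletion of ref[start:start+length].

    Counts the bases by which the deletion could be shifted right or left
    along the reference while yielding the same edited sequence.
    """
    ref = ref.upper()
    end = start + length
    if length <= 0 or start < 0 or end > len(ref):
        raise ValueError("deletion must lie within the reference")
    right = 0
    while (right < length and end + right < len(ref)
           and ref[start + right] == ref[end + right]):
        right += 1
    left = 0
    while (left < length and start - 1 - left >= 0
           and ref[start - 1 - left] == ref[end - 1 - left]):
        left += 1
    return min(left + right, length)


def suggest_pathway(ref, start, length, min_mmej_del=5):
    """Heuristic label only; thresholds follow the size classes in the text."""
    mh = microhomology_length(ref, start, length)
    if length < min_mmej_del:
        return "cNHEJ-like", mh
    if mh >= 1:
        return "MMEJ/SSA-like", mh
    return "unassigned", mh


if __name__ == "__main__":
    ref = "TTACGTGGGGACGTCC"  # toy reference with an ACGT repeat
    print(suggest_pathway(ref, 2, 8))
```

In the toy reference, deleting eight bases between the two ACGT repeats leaves a junction with 4 bp of microhomology, the kind of signature shaded in yellow in Figure 1B.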

LONG-TERM STABILITY OF CRISPR-EDITED TREES THROUGH VEGETATIVE PROPAGATION
For many herbaceous species where CRISPR editing efficiencies are low, or where monoallelic/mosaic mutations predominate in the first-generation (T0) transformants, multi-generation progeny screening is necessary to obtain homozygous mutants (Xu et al., 2015). Although initial transmission rates vary depending on the study system and the nature of CRISPR-induced (somatic or germinal) mutations carried by the founder plant, stable mutation inheritance can be expected once homozygous lines are obtained, as reported for Arabidopsis, rice, tomato and potato (Brooks et al., 2014; Feng et al., 2014; Zhou et al., 2014; Butler et al., 2015). For woody perennials, however, the issues are rather different. Cross-generational screening is difficult to implement for transgenic trees owing to their long generation times and strict regulation of flowering transgenic trees (Strauss et al., 2015). The predominantly outcrossing nature of trees, many of which are also dioecious, adds further challenge to rapid-cycle breeding and introgression of CRISPR-derived mutations into elite germplasms. While advances of early-flowering induction in contained environments (Hoenicka et al., 2014; Klocko et al., 2016) promise to accelerate progress, commercial production of many forest, fruit and nut trees relies on clonal propagation of elite genotypes. For woody perennials, therefore, it is pertinent to address long-term stability of CRISPR editing, both on-target and off-target, in vegetatively propagated T0 transformants.
In theory, CRISPR-induced DNA modifications should lead to permanent mutations in edited cells that can be inherited mitotically during clonal propagation, yet experimental data are rare. One study used tissue culture to clone CRISPR-derived mutations from T0 diploid and tetraploid potato, and reported stable maintenance of targeted mutations across clonal generations, and in three selected cases, through the germline as well (Butler et al., 2015). In that study, however, somatic mutations were prevalent in T0 plants, as fewer than half of the originally detected mutation types were captured as single mutations in clonally propagated plants (Butler et al., 2015). The high levels of somatic mutations likely reflect a high proportion of chimeras, a common problem in tissue culture when plants are regenerated from multiple cells, in this case, with heterogenomic modifications. Fortunately for Populus, the proven efficiency of CRISPR (Zhou et al., 2015; Elorriaga et al., 2018) means null mutations with biallelic KO can be readily obtained in T0 transformants and stably inherited through clonal propagation. Wang et al. (2017) reported faithful maintenance of PtoMYB115 mutations in tissue culture-propagated Populus tomentosa somaclones, though in one case low frequencies of new mutations not seen in the parent line were detected, indicative of chimeras. Similarly, the CRISPR editing outcomes of BRC1-1 and BRC2-1 were also stable over multiple cycles of vegetative propagation in tissue culture (Muhr et al., 2018). We have maintained a subset of the previously characterized 717 mutants (Zhou et al., 2015) in the greenhouse for over 4 years by repeatedly cutting back the original transformants and/or propagation using rooted cuttings. The reddish-brown wood discoloration of lignin-reduced 4cl1 mutants has been stable in all re-sprouted shoots or clonally propagated plants. Repeated amplicon sequencing of randomly selected lines re-confirmed the targeted 4CL1 mutations 4 years later, with no off-target changes to the paralogous 4CL5 (Supplementary Table S1). Another group of transgenic plants harbors a non-functional gRNA for 4CL5 due to SNPs (one per allele) between the genome-sequenced P. trichocarpa and the transformation host 717 that prevented Cas9 cleavage, as confirmed by amplicon sequencing (Zhou et al., 2015). It should be noted that one of the 717 SNPs alters the PAM site from NGG to NGA, the latter being a non-canonical PAM of SpCas9 thought to cause off-target cleavage in human cells (Zhang Y. et al., 2014b). Reanalyzing this group of plants therefore informs whether the imperfectly matched 4CL5-gRNA exhibited any off-target activity over the long term. We found no evidence of CRISPR/Cas9 cleavage after 4 years of coppicing and regrowth (Supplementary Table S1). These findings echo other tree studies that showed no or very rare off-targeting (Jia et al., 2017a; Nakajima et al., 2017; Elorriaga et al., 2018; see also Table 1), as well as reports from Arabidopsis, rice and tomato based on whole-genome re-sequencing (Feng et al., 2014; Zhang et al., 2014a; Peterson et al., 2016; Rodríguez-Leal et al., 2017). The data provide support for long-term stability and specificity of CRISPR/Cas9-mediated mutagenesis, with extremely low off-target potential during vegetative propagation in poplar.
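The 4CL5 case, in which one SNP per allele (including an NGG-to-NGA change in the PAM) abolished cleavage, suggests a simple pre-screen of a gRNA against each allele of a heterozygous host. The sketch below is a minimal illustration under stated assumptions: the sequences are invented placeholders rather than the actual 4CL5 target, the function names are hypothetical, and only exact protospacer matching plus a canonical SpCas9 PAM are checked.

```python
def pam_is_canonical(pam, canonical="NGG"):
    """True if pam matches the canonical motif (N = any base)."""
    pam = pam.upper()
    return len(pam) == len(canonical) and all(
        c == "N" or c == p for c, p in zip(canonical, pam)
    )


def check_allele(spacer, allele_site):
    """Compare a gRNA spacer with one allele's target site.

    allele_site: protospacer followed by the PAM, 5'->3', as sequenced in
    the transformation host (not the reference genome).
    """
    spacer, allele_site = spacer.upper(), allele_site.upper()
    proto, pam = allele_site[:len(spacer)], allele_site[len(spacer):]
    # 1-based positions counted from the 5' end of the spacer.
    mismatches = [i + 1 for i, (a, b) in enumerate(zip(spacer, proto)) if a != b]
    canonical = pam_is_canonical(pam)
    return {
        "mismatch_positions": mismatches,
        "pam": pam,
        "canonical_pam": canonical,
        "perfect_target": not mismatches and canonical,
    }


if __name__ == "__main__":
    spacer = "GCTAGCTAGGATCCATGCAA"  # placeholder spacer, illustration only
    alleles = {
        "allele_1": spacer[:9] + "A" + spacer[10:] + "TGG",  # protospacer SNP
        "allele_2": spacer + "TGA",                          # PAM SNP (NGG -> NGA)
    }
    for name, site in alleles.items():
        print(name, check_allele(spacer, site))
```

Running such a check on host-specific allele sequences before transformation flags gRNAs designed from a reference genome that no longer match the clone being edited.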

BROAD-SPECTRUM MUTAGENESIS BEYOND KO
Nullizygous mutations harboring either identical (homozygous) or distinct (heterozygous) mutations in all alleles of the genome are the ideal repair outcomes for gene KO investigation. However, monoallelic, in-frame and/or mosaic mutations can expand the phenotypic spectrum to enhance the power of functional analysis. For instance, transgenic grapevine with monoallelic mutations of a defense-related WRKY gene exhibited intermediate levels of disease resistance between WT and biallelic mutants (Wang X. et al., 2018). Similarly, monoallelic or in-frame mutations of PDS led to partial albino phenotypes in both poplar and apple (Fan et al., 2015;Nishitani et al., 2016). Given the abundance of duplicate genes in plant genomes, and the proven successes of CRISPR in multi-allele as well as allele-specific editing (Jia et al., 2016;Elorriaga et al., 2018), there is exciting potential to exploit CRISPR for development of allelic series mutations to address functional redundancy of duplicate genes or tandem repeats, and to investigate the allele-dose response of agronomic traits. Thus, the ability to generate novel germplasms is invaluable not only for tree improvement but also for basic functional genomics research.
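The genotype classes named above can be expressed as a small decision rule over per-allele indel sizes. This is a rough sketch under explicit assumptions: the target lies in coding sequence, any frameshift is treated as a null allele, in-frame indels are treated as potentially non-null, and mosaicism is not modeled. The function names and labels are hypothetical.

```python
def classify_allele(indel):
    """indel: net size change in bp (negative = deletion, 0 = unedited)."""
    if indel == 0:
        return "WT"
    return "in-frame" if indel % 3 == 0 else "frameshift"


def classify_genotype(indels):
    """indels: one net indel size per allele at the target site, e.g. (1, -2)."""
    states = [classify_allele(i) for i in indels]
    if all(s == "frameshift" for s in states):
        kind = "homozygous" if len(set(indels)) == 1 else "heterozygous"
        return f"null ({kind} frameshifts)"
    if all(s == "WT" for s in states):
        return "unedited"
    if "WT" in states:
        return "monoallelic/partial"
    return "edited in all alleles, at least one in-frame"


if __name__ == "__main__":
    for genotype in [(1, 1), (1, -2), (1, 0), (-3, 1)]:
        print(genotype, "->", classify_genotype(genotype))
```

Grouping transformants this way is one means of assembling an allelic series for the dose-response analyses suggested above.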
In contrast to CRISPR-mediated KO, site-specific gene targeting or replacement remains a major challenge in plants, due to the inefficient homology-directed repair pathway. Geminivirus replicons have been shown to increase site-specific gene knockin (KI) efficiencies by orders of magnitude in tobacco, tomato and hexaploid wheat (Baltes et al., 2014; Čermák et al., 2015; Gil-Humanes et al., 2017). In animal systems, the MMEJ and SSA pathways, along with a new CRISPR/Cpf1 system, have been harnessed for targeted KI with success (Sakuma et al., 2016; Tóth et al., 2016). These and other emerging approaches represent promising options for developing efficient KI systems in trees. Finally, many economically important tree species or genotypes remain recalcitrant to transformation and/or tissue culture regeneration, hindering applications of CRISPR. Recent breakthroughs in morphogenic regulator-mediated regeneration (Lowe et al., 2016, 2018) have already stimulated similar research in trees. Direct delivery of pre-assembled Cas9-gRNA ribonucleoproteins into protoplasts for genome editing, as already deployed in apple and grape, offers a transgene-free alternative to Agrobacterium transformation (Malnoy et al., 2016). At the present time, however, protoplast regeneration for other tree species remains a challenge. There is strong incentive to overcome this challenge since avoiding the footprint of foreign DNA and the associated negative perceptions will improve the outlook for integration of CRISPR technology with commercial deployment of designer trees.

AUTHOR CONTRIBUTIONS
C-JT conceived the idea. WPB, DC, and C-JT collected background information and analyzed data. WPB and C-JT wrote the manuscript with contributions from DC. All authors approved the manuscript.

FUNDING
The CRISPR research and associated genomic resource development in the Tsai Lab were supported by the National