Progresses, Challenges, and Prospects of Genome Editing in Soybean (Glycine max)

Soybean is grown worldwide as a source of oil and protein for food, feed and industrial raw material such as biofuel. The steady increase in soybean production over the past century is mainly attributable to genetic improvement through hybridization, mutagenesis and transgenesis. However, limited genetic resources and intricate social issues surrounding the use of transgenic technology impede soybean improvement to meet the rapid increase in global demand for soybean products. New approaches in genomics and the development of genome editing technologies based on site-specific nucleases (SSNs) have expanded the genetic variation in soybean germplasm and have the potential to precisely modify genes controlling important agronomic traits in an elite background. ZFNs, TALENs and CRISPR/Cas9 have been adapted in soybean improvement for targeted deletions, additions, replacements and corrections in the genome. The availability of a reference genome assembly and genomic resources increases the feasibility of using current genome editing technologies and their new developments. This review summarizes the status of genome editing in soybean improvement and future directions in this field.


INTRODUCTION
Soybean [Glycine max (L.) Merr.] is an increasingly important agricultural commodity grown worldwide for feed and food products. It is one of the major protein sources for human nutrition, as well as feed for livestock and fish, since soybean seed contains about 40% protein and about 20% oil (Singh, 2017). Recently, soybean has also been used as a source of biofuel. Soybean roots and a rhizobacterium, Bradyrhizobium japonicum, can normally establish a rhizobia-legume symbiosis that fixes nitrogen and improves soil quality. The United States, Brazil, and Argentina produce more than 80% of the global soybean crop annually. China and India are the two major soybean-growing countries in Asia. Soybean has been one of the fastest growing major crops for several decades, and its production has recently been boosted by increasing demand from China. About 30% of the world's production is consumed in China, which is becoming the largest soybean importer in the world (Hart, 2017). Therefore, the global soybean market is driven by two major producers (United States and Brazil) and one major consumer (China) (Gale et al., 2019).
The taxonomy of the genus Glycine has been well characterized using morphological evaluation, cytogenetic analysis and molecular phylogenetics (Chung and Singh, 2008). The genus includes two subgenera, one of which contains cultivated soybean (G. max) and its wild relative (G. soja), both of which are annual Asian species descended from ancient genome duplication events (Shoemaker et al., 2006). Therefore, soybean is classified as a paleopolyploid and has 40 chromosomes (2n = 40) (Karpechenko, 1925), which are small (1.42-2.84 µm) and morphologically similar to one another (Sen and Vidyabhusan, 1960). Twenty molecular linkage groups (MLGs) have been developed using primarily restriction fragment length polymorphism (RFLP), amplified fragment length polymorphism (AFLP) and simple sequence repeat (SSR) loci (Xia et al., 2007) and single nucleotide polymorphisms (SNPs) (Choi et al., 2007; Akond et al., 2013). Soybean has very limited genetic diversity since most cultivars were selected from the same original group of progenitors (Singh, 2017). This limitation of genetic resources is the major challenge for soybean improvement in overcoming the significant constraints on farming and production caused by climate change, reduced availability of agricultural land and increased biotic and abiotic stresses. Therefore, improved molecular breeding and genetic engineering technologies are necessary to break through the bottleneck in improving soybean agronomic traits and to secure the yield increases needed to satisfy future global demand for soybean and help feed nearly 10 billion people by 2050. Besides introducing genetic resources from wild relatives, scientists have continuously worked to modify the soybean genome using molecular genetics and genomics approaches. Transgenesis-based biotechnologies have been used extensively in soybean to improve its agronomic traits.
For the past four decades, transgenesis has been used to understand basic plant biology. It can break the bottleneck of reproductive isolation by transferring exogenous genes into an elite variety background to generate novel traits. Transgenic approaches have been used for soybean improvement and have made soybean one of the major transgenic crops grown commercially in the world. However, as with other transgenic crops, random integration of transgenes into the host genome and multiple transgene copies can cause instability and unintended effects, which raise public concern about human consumption, and the commercialization of soybean as a genetically modified crop is restricted by tedious and costly regulatory evaluation processes.
Mutagenesis is another way to expand soybean germplasm. Conventionally, soybean genes can be mutated using random mutagens, including radiation such as X-rays, fast neutrons, and gamma rays; chemicals such as EMS (ethyl methanesulfonate) and NMU (N-nitroso-N-methylurea); and biological mutagens such as T-DNA insertions and transposons (O'Rourke et al., 2017). Mutations induced by random mutagenesis are heritable and stable, but intensive screening and specific techniques such as targeting induced local lesions in genomes (TILLING) are required to identify mutants. Such techniques are time consuming and can be expensive. In most cases, it is impossible to obtain specific alleles known to confer certain phenotypes because the mutations are imprecise. In the last two decades, new biotechnologies based on site-directed nucleases (SDNs), also called site-specific nucleases (SSNs), such as Zinc Finger Nucleases (ZFNs), Transcription Activator-Like Effector Nucleases (TALENs) and the more recent Clustered Regularly Interspaced Short Palindromic Repeat (CRISPR) system, have been developed for mutagenesis. As very useful tools, multiple SDN platforms have been integrated into plant breeding programs, including those for soybean. SDNs have been developed for genome editing (GE) and induce mutations with unprecedented precision, covering all the types of mutations that arose during crop evolution, including domestication and breeding. Hence, novel genome editing technologies are expected to accelerate breeding programs as the main option for revealing gene function and producing new varieties. In this review, we summarize the status of soybean genome editing, address current bottlenecks and discuss future perspectives in this field.

GENOME EDITING TOOL DEVELOPMENT AND AVAILABILITY FOR PLANT GENOME EDITING
The basic concept of SDN-based genome editing is that a nuclease can be designed to recognize a desired target site in DNA and cleave it to make a double-stranded break (DSB). The DSB is then repaired by the cell's own DNA repair mechanisms, either through non-homologous end-joining (NHEJ) or through the homology-directed repair (HDR) pathway (Ran et al., 2017; Figure 1). As illustrated in Figure 1, NHEJ is the error-prone pathway and can induce random insertions and deletions that disrupt the reading frame and lead to targeted gene knockouts; the HDR pathway, a precise exchange of homologous sequence using an externally added homologous DNA repair template, results in gene replacement or targeted insertion (Voytas and Gao, 2014; Bortesi and Fischer, 2015). GE technology is becoming increasingly diversified and sophisticated (Chen F. et al., 2019). Based on the different processes that occur during the repair of DNA breaks, the basic outcomes of genome editing can be divided into three categories (Sprink et al., 2016): SDN1 (the DNA break is repaired by the host cell's DNA repair mechanisms without an added repair template), SDN2 (the break is repaired via homologous recombination (HR) using an added homologous repair template), and SDN3 (the break is repaired via either the HDR or the NHEJ pathway using an added DNA template containing non-homologous sequences but with homologous ends). Emerging technologies add base editing and transcriptional regulation of target genes as further outcomes (Figure 1).
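The repair outcomes described above can be illustrated with a minimal Python sketch. It is not taken from any cited work: the function names `is_frameshift` and `sdn_category` and their parameters are hypothetical, and the frameshift rule (a net indel length that is not a multiple of three shifts the reading frame) is a simplification that ignores splice sites and in-frame stop codons.

```python
def is_frameshift(deleted_len: int, inserted_len: int) -> bool:
    """Return True when an NHEJ indel changes the reading frame.

    A net length change that is not a multiple of three shifts every
    downstream codon, which is the usual route to a gene knockout.
    """
    net_change = inserted_len - deleted_len
    return net_change % 3 != 0


def sdn_category(template_added: bool, template_has_nonhomologous_insert: bool) -> str:
    """Map an editing design onto the SDN1/SDN2/SDN3 categories.

    Hypothetical helper: the two flags are an assumption about how one
    might encode the presence and content of a repair template.
    """
    if not template_added:
        return "SDN1"  # break repaired by the host cell alone
    if template_has_nonhomologous_insert:
        return "SDN3"  # template carries new sequence flanked by homologous ends
    return "SDN2"      # homologous template introducing small changes


if __name__ == "__main__":
    print(is_frameshift(deleted_len=1, inserted_len=0))   # 1-bp deletion
    print(is_frameshift(deleted_len=3, inserted_len=0))   # in-frame 3-bp deletion
    print(sdn_category(template_added=True, template_has_nonhomologous_insert=True))
```

In this encoding, a knockout design corresponds to SDN1 combined with a frameshifting indel, whereas gene replacement or targeted insertion needs a template (SDN2 or SDN3).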
Zinc Finger Nucleases and TALENs are earlier generations of GE platforms. Each customized ZFN or TALEN protein needs to be engineered to generate DSBs at the target location, and GE using these platforms has been demonstrated in many plants (Joung and Sander, 2013). However, some drawbacks have limited their application, including the difficulty of engineering ZFNs and TALENs due to their highly repetitive sequences and the complex nature of the interaction between ZFN and DNA (Bortesi and Fischer, 2015), and the complication that at least one pair of ZFNs or TALENs is required, one for the upstream and one for the downstream region of the target site (Beumer et al., 2013). Many ZFNs or TALENs would be required to achieve multiplexing, in which several targets are edited simultaneously. Since zinc finger nucleases were first used in tobacco (Wright et al., 2005), various GE technologies have gradually been adapted to plants as they were developed, such as TALENs, whose editing activity was confirmed in plants by Cermak et al. (2011).
FIGURE 1 | Genome editing platforms and editing outcomes. Each editing platform (arrow) and its outcomes (rectangle) are coded with the same color. ZFN, zinc-finger nuclease; TALEN, transcription activator-like effector nuclease; CRISPR, clustered regularly interspaced short palindromic repeat; DSB, double strand break; SSB, single strand break. Outcomes of GE created by site-directed nucleases (SDN) include: SDN1, in which the DNA break is repaired by the host cell's DNA repair mechanisms without an added repair template; SDN2, in which the break is repaired via HR using an added homologous repair template; and SDN3, in which the break is repaired via either the HDR or the NHEJ pathway using an added DNA template containing non-homologous sequences but with homologous ends.
However, the application of GE in plant mutagenesis and trait improvement using ZFNs and TALENs has been restricted by the technical limitations of these platforms (Petolino, 2015; Weeks et al., 2016). The CRISPR/Cas system, a more recently developed GE platform (Doudna and Charpentier, 2014), comprises a Cas protein and a single guide RNA (sgRNA) with a hairpin structure that targets a 20-base pair (bp) DNA sequence (Figure 1). Based on the phylogenetic, structural and functional characteristics of their Cas genes and the nature of the interference complex, CRISPR/Cas systems have been classified into class 1 and class 2 systems. Class 1 systems use multi-Cas protein complexes for interference and are further divided into types I, III and IV, whereas class 2 systems carry out interference with single effector proteins, which are also involved in pre-CRISPR RNA (pre-crRNA) processing, and comprise types II, V and VI, which include Cas9 (type II), Cas12a-e (type V) and Cas13a-d (type VI). The type II CRISPR/Cas9 system, which is based on RNA-guided interference with DNA, has been adapted for genome editing (Koonin et al., 2017) and was the first system confirmed to cleave DNA in vitro and in eukaryotic cells (Jinek et al., 2012; Gilbert et al., 2013; O'Connell et al., 2014). It has revolutionized mutagenesis because of its easy design, flexible and simple operation, robust activity and low cost (Doudna and Charpentier, 2014; Murugan et al., 2017). Cas9 assembles with the single guide RNA; the complex then recognizes and binds the target DNA sequence next to a protospacer adjacent motif (PAM); finally, the Cas9 nuclease induces a DSB in the 20 bp target DNA sequence adjacent to the PAM. CRISPR/Cas9 based on SpCas9 (Streptococcus pyogenes Cas9), which recognizes an NGG PAM (N means any nucleotide), is the most commonly used GE system. The CRISPR system has evolved considerably during the last decade.
Recent advances in CRISPR technology include the following.

The Expanding GE Toolbox Comprises Precise Cas9 Variants and Orthologues, Wider Genome Accessibility by Recognizing a Simpler PAM
Cas9 enzymes originating from other bacteria have been discovered, and nearly 10 Cas9 orthologues have been evaluated and developed as tools for genome editing, including Staphylococcus aureus Cas9 (SaCas9), Campylobacter jejuni Cas9 (CjCas9), and others (Table 1). Natural Cas9 proteins can be modified to recognize different PAMs, and engineered Cas9 variants with various PAMs are available for GE, such as EQR-Cas9 (NGAG PAM), SaKKH-Cas9 (NNNRRT), SpCas9-NG (NG), VQR-Cas9 (NGA), and others (Table 1). Many of these SpCas9 variants have been used in GE for plants.
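To make the role of the PAM concrete, the following minimal Python sketch scans both strands of a DNA sequence for candidate Cas9 target sites, each a protospacer immediately followed by a PAM on its 3' side. The PAM strings are the ones listed above; the function name `find_cas9_sites`, the default 20-nt protospacer length applied to every variant, and the tuple layout of the results are assumptions made for illustration, not part of any published tool.

```python
import re

# IUPAC codes needed for the PAMs listed in the text (N = any base, R = A or G).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]", "R": "[AG]"}

# PAMs as given in the text; the dictionary itself is illustrative.
PAMS = {
    "SpCas9": "NGG",
    "EQR-Cas9": "NGAG",
    "SaKKH-Cas9": "NNNRRT",
    "SpCas9-NG": "NG",
    "VQR-Cas9": "NGA",
}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_cas9_sites(seq: str, pam: str = "NGG", protospacer_len: int = 20):
    """List (strand, start, protospacer, pam) for every protospacer + 3' PAM.

    The start index is relative to the strand being scanned, so minus-strand
    positions refer to the reverse-complemented sequence.
    """
    pam_pattern = "".join(IUPAC[base] for base in pam)
    # A lookahead lets overlapping candidate sites be reported.
    pattern = re.compile(r"(?=([ACGT]{%d})(%s))" % (protospacer_len, pam_pattern))
    seq = seq.upper()
    hits = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for match in pattern.finditer(strand_seq):
            hits.append((strand, match.start(), match.group(1), match.group(2)))
    return hits


if __name__ == "__main__":
    toy_sequence = "A" * 20 + "TGG"  # toy example, not a soybean sequence
    for name, pam in PAMS.items():
        print(name, pam, len(find_cas9_sites(toy_sequence, pam)))
```

Running the scan with a shorter PAM such as NG returns at least as many sites as NGG on the same sequence, which is the sense in which a simpler PAM widens genome accessibility.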

The Discovery of Various Cas Enzymes With Unique PAMs and Engineering of CRISPR/Cas Components for Improved GE
Recently, Cpf1 (now formally known as Cas12a), a new class 2 type V-A Cas enzyme from Prevotella and Francisella 1, was identified and functionally characterized (Zetsche et al., 2015). Unlike Cas9, which prefers a G-rich PAM, Cpf1 recognizes a T-rich PAM (TTTC) in the target DNA sequence and induces a sticky-ended DSB with a 5-nucleotide 5' overhang downstream of the PAM site; it is able to cleave the DNA repeatedly, which may result in insertion mutations through the NHEJ pathway. Cpf1 has both DNase and RNase activity, which allows it to process a CRISPR array for multiplexing. Cpf1 does not need a tracrRNA for crRNA biogenesis, and its guide crRNA is shorter (about 43 nt) than the Cas9 sgRNA (∼80 nt), which makes the crRNA easier to synthesize and engineer (Zetsche et al., 2015). Cpf1 variants such as FnCpf1 from F. novicida, AsCpf1 from Acidaminococcus sp., and LbCpf1 from Lachnospiraceae bacterium have been used in genome editing for many plant species. To broaden the target range of Cpf1 beyond the TTTV PAM identified initially, modified variants with different PAM recognition have been generated and used for GE, such as AsCpf1-RR (TYCV PAM) and AsCpf1-RVR (TATV); LbCpf1-RR (CCCC and TYCV) and LbCpf1-RVR (TATV) (Li et al., 2018); and FnCpf1-RR (CCCC and TYCV) and FnCpf1-RVR (TATV) (Li et al., 2018; Zhong et al., 2018). Orthologues from diverse bacterial species have been discovered, such as Mb3Cpf1, BsCpf1, and TsCpf1 (Table 1). The utility of Cas12a in genome editing is expanding (Zetsche et al., 2015, 2017; Teng et al., 2019). A specific group of class 2 type V CRISPR enzymes named Cms1 was identified from Microgenomates and Smithella. They are smaller than Cpf1, recognize AT-rich target sequences with PAMs such as TTN (SmCms1) and cleave without requiring a trans-activating crRNA.
Successful GE with Cms1 was confirmed in rice (Begemann et al., 2017). A distinct type V-B system from Alicyclobacillus acidiphilus (AaCas12b, formerly known as C2c1) has also been functionally defined and adapted for editing mammalian genomes; the nuclease is active at temperatures between 31 and 59 °C (Teng et al., 2018). Similar to Cpf1, Cas12b recognizes a distal 5' T-rich PAM, but it requires both a crRNA and a tracrRNA for target cleavage. As in the Cas9 system, a single guide RNA can be engineered for Cas12b (Cong et al., 2013). More and more Cas enzyme orthologues with specific PAM recognition and cleavage outcomes have been discovered, such as CasX (also known as Cas12e), with a 5'-TTCN PAM and a sticky-end overhang of approximately 10 nt; a deactivated CasX with mutations introduced into the RuvC domain; and CasY (also known as Cas12d), with 5'-TA PAM recognition and dsDNA cleavage (Burstein et al., 2017). Cas13a (formerly C2c2), a class 2 type VI CRISPR system, has been characterized and modified to target RNA precisely (Abudayyeh et al., 2016). Unlike Cas9, Cas13a has two enzymatically distinct ribonuclease activities required for RNA degradation. One catalyzes crRNA maturation, whereas the other performs RNA-guided cleavage of single-stranded RNA (ssRNA) using the catalytic sites in its two separate higher eukaryotes and prokaryotes nucleotide-binding (HEPN) domains. Cas13 variants have been identified, such as LshCas13a, which requires a protospacer flanking sequence (PFS) of H (H denotes A, U or C) to recognize a 22-28 nt target sequence, whereas LwaCas13a and PspCas13b do not require a specific PFS (Cox et al., 2017). The ability of Cas13a to process crRNA arrays through its RNase activity allows the system to target multiple RNAs simultaneously (Abudayyeh et al., 2016). Most of these Cas enzymes have been used in plant GE. For instance, Cas13a has been used to engineer plant virus resistance (Aman et al., 2018; Zhang T. et al., 2019).
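The PAM preferences of Cpf1 (Cas12a) and its engineered variants can be compared with a short Python sketch. Unlike Cas9, the PAM sits on the 5' side of the protospacer. The PAM strings (TTTV, TYCV, TATV, CCCC) are those quoted above; the function name `find_cpf1_sites`, the 23-nt protospacer length, and the restriction to the forward strand are assumptions made to keep the example short.

```python
import re

# IUPAC codes needed for the Cpf1 PAMs in the text (V = A/C/G, Y = C/T).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "N": "[ACGT]", "V": "[ACG]", "Y": "[CT]"}


def find_cpf1_sites(seq: str, pam: str = "TTTV", protospacer_len: int = 23):
    """List (pam_start, pam, protospacer) for every 5' PAM + protospacer.

    Only the given (forward) strand is scanned; a fuller search would also
    scan the reverse complement. The 23-nt length is an assumed default.
    """
    pam_pattern = "".join(IUPAC[base] for base in pam)
    # The PAM comes first because Cpf1 reads it on the 5' side of the target.
    pattern = re.compile(r"(?=(%s)([ACGT]{%d}))" % (pam_pattern, protospacer_len))
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq.upper())]


if __name__ == "__main__":
    toy_sequence = "GG" + "TTTC" + "A" * 23  # toy example, not a soybean sequence
    for pam in ("TTTV", "TYCV", "TATV", "CCCC"):
        print(pam, find_cpf1_sites(toy_sequence, pam))
```

Scanning the same locus with each variant's PAM gives a quick view of which engineered enzyme could reach a target that the wild-type TTTV requirement excludes.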

An Invention of Cytidine or Adenine Base Editors and Development of New Base Editors
Base editing systems, consisting of cytidine base editors (CBEs) and adenine base editors (ABEs), depend on the CRISPR system and can make specific base changes without introducing DNA DSBs or going through the HDR pathway with a donor. The CBE system, composed of a guide RNA and a catalytically impaired Cas9 domain, either a Cas9 nickase (nCas9) or a dead Cas9 endonuclease (dCas9), fused with a cytidine deaminase, converts a targeted cytosine into a uracil at the target site in genomic DNA, which is subsequently replaced by a thymine during DNA synthesis (Komor et al., 2016; Figure 1). CBE1 (BE1) is the original cytosine base editor, with which the desired cytosine at the target site is first deaminated and converted to uracil, leading to a U-G mismatch, which can then be resolved to a T-A base pair in the newly synthesized strand through the DNA repair pathway. Building on CBE1, CBE2 (BE2) has a uracil glycosylase inhibitor (UGI) fused to the dCas9 or nCas9, which inhibits uracil DNA glycosylase and prevents the transformation of the uridine into an apurinic/apyrimidinic site. CBE3 (BE3) is constructed with two domains fused to a nickase Cas9 (D10A): a rat cytosine deaminase, rAPOBEC1 (apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like), fused to its N terminus through a 16-amino acid XTEN linker, and a UGI fused to its C terminus through a 4-amino acid linker (Komor et al., 2016). The significant increase in base conversion efficiency with BE3 is mainly due to replacing the dCas9 in BE2 with a Cas9 nickase (nCas9), which nicks the non-edited strand. Based on BE3, BE4 (S. pyogenes Cas9-derived SpBE4 and S. aureus Cas9-derived SaBE4) is made by replacing the 16-aa linker with a 32-aa linker between rAPOBEC1 and Cas9 D10A and by using 9-aa linkers to fuse the UGI domains to the Cas9 nickase, which enables the cell to repair the non-edited strand using the edited strand as a template and reduces undesired by-products by inhibiting base excision repair with UGI (Komor et al., 2016). Efficient targeted C-to-T base editing with expanded PAM recognition in the CRISPR/Cas9 system has also been achieved by fusing nCas9 to other members of the cytidine deaminase family, including APOBEC1, activation-induced cytidine deaminase (AID), Petromyzon marinus cytosine deaminase 1 (PmCDA1) and APOBEC3A [one of the antiviral cytidine deaminases of the human APOBEC3 (hA3) family] (summarized in Chen K. et al., 2019). The ABE system, composed of Escherichia coli TadA (transfer RNA adenosine deaminase) and dCas9 or nCas9 (D10A), changes a targeted adenine (A) to guanine (G) in genomic DNA. The first-generation ABE, ABE1.2, was developed by fusing a TadA evolved from E. coli TadA, which catalyzes adenine deamination, to an nCas9. Later-generation ABEs were made using various TadA mutants such as TadA*, and fusion of the heterodimeric TadA (TadA-TadA*) with nCas9 (D10A) produced modified ABEs, including ABE7.10, that enable targeted A-to-G base editing with increased efficiency and specificity across a wide range of targets (Mishra et al., 2020). RNA base editors (RBEs) were developed by combining a catalytically inactive Cas13 (dCas13) with a naturally occurring adenosine deaminase acting on RNA (ADAR) for programmable adenosine-to-inosine substitution in mammalian cells. Later RBE versions such as REPAIRv2 show higher specificity than the earlier ones. RBEs have been used for editing in mammalian cells.
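The net effect of the two DNA base editor classes on a target sequence can be sketched in Python. This is a deliberately simplified model, not a description of any specific editor: the function name `base_edit` is hypothetical, the editing window of protospacer positions 4 to 8 (counted from the PAM-distal end) is an assumption that does not come from the text above, and every editable base inside the window is converted, whereas real editors convert only a fraction of them.

```python
def base_edit(protospacer: str, editor: str = "CBE", window: tuple = (4, 8)) -> str:
    """Return the protospacer after idealized base editing.

    CBE converts C to T and ABE converts A to G, but only at positions
    inside the assumed editing window (1-based, inclusive).
    """
    conversions = {"CBE": ("C", "T"), "ABE": ("A", "G")}
    source_base, product_base = conversions[editor]
    first, last = window
    edited = []
    for position, base in enumerate(protospacer.upper(), start=1):
        if base == source_base and first <= position <= last:
            edited.append(product_base)  # deamination outcome after repair
        else:
            edited.append(base)          # outside the window or not a substrate
    return "".join(edited)


if __name__ == "__main__":
    target = "ACCACGTACGTTAGCTAGCA"  # toy 20-nt protospacer, not a real gene
    print("CBE:", base_edit(target, "CBE"))
    print("ABE:", base_edit(target, "ABE"))
```

Comparing the input with each output shows that no bases are inserted or deleted, which is why base editing avoids the indels associated with DSB repair.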

Newly Developed Prime Editing Method
Recently, David Liu's group at the Broad Institute of MIT and Harvard developed a new editing method based on the CRISPR system. The prime editing (PE) system is composed of a Cas9 nickase fused with a reverse transcriptase (RTase) and a prime editing guide RNA (pegRNA). The pegRNA contains a classic sgRNA with a Cas-targeting spacer region, a primer binding site (PBS) for initiating reverse transcription (RT), and an RT template carrying the edits for the intended DNA changes. The pegRNA leads the prime editor to the target site in genomic DNA, the Cas9 nickase generates a nick adjacent to the PAM, and RTase-mediated primer extension then proceeds from the 3' end of the nick using the RT template. The reverse transcriptase reads the RNA extension following the sequence designed for mutation in the template, and the newly synthesized strand incorporates the corresponding DNA nucleotides with the edits into the target sequence (Anzalone et al., 2019; Figure 1). Prime editing can achieve all 12 possible base changes, small indels, or combinations of these. There are few restrictions on the edited sequence with this method. It can introduce precise single-base substitutions of all types in target sequences, which is hard for current base editors to accomplish. PE induces fewer off-target changes than other GE platforms. Its versatile and precise editing outcomes have been confirmed in rice and wheat (Lin et al., 2020; Tang et al., 2020; Xu et al., 2020). Since the CRISPR system was established in plants (Li et al., 2013; Nekrasov et al., 2013; Shan et al., 2013), much progress has been made in basic plant science and crop improvement, and various editing outcomes can be achieved in plants by adopting new CRISPR approaches, including CRISPR/Cpf1 and other orthologues, nucleotide substitution tools for base editing (Mishra et al., 2020) and prime editing (Lin et al., 2020; Tang et al., 2020; Xu et al., 2020). The CRISPR system has rapidly superseded the earlier editing systems because CRISPR-based technologies are robust, low cost and simple to operate, and they have been widely adopted in plants. These new approaches will accelerate crop breeding with designed and accurate gene modifications made directly in an elite cultivar background. The applications of GE in genetic research and variety improvement of crops have been intensively reviewed (Petolino, 2015; Weeks et al., 2016; Chen K. et al., 2019).
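The count of 12 possible base changes follows from simple enumeration: each of the four bases can be replaced by any of the other three. The Python sketch below lists them and separates the changes that CBEs and ABEs can install from those left to prime editing. The split assumes that a C-to-T or A-to-G edit on one strand appears as G-to-A or T-to-C on the other, so base editors cover the four transitions; this is a simplification and the variable names are illustrative only.

```python
from itertools import permutations

BASES = "ACGT"

# Every ordered pair of distinct bases is one possible substitution: 4 x 3 = 12.
all_changes = [f"{old}>{new}" for old, new in permutations(BASES, 2)]

# Assumption: CBE gives C>T (read as G>A on the opposite strand) and
# ABE gives A>G (read as T>C on the opposite strand).
base_editor_changes = {"C>T", "G>A", "A>G", "T>C"}

# The remaining substitutions are the ones described as hard for base editors.
prime_editing_only = [change for change in all_changes
                      if change not in base_editor_changes]

if __name__ == "__main__":
    print(len(all_changes), "substitutions in total:", all_changes)
    print(len(prime_editing_only), "not covered by CBE/ABE:", prime_editing_only)
```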

GENERAL PROCEDURE OF GENOME EDITING IN SOYBEAN AND FACTORS FOR SUCCESS
In soybean, the first successful genome editing was done in hairy roots, in which the GmDcl4a and GmDcl4b genes were targeted using ZFNs. The first fertile GE soybean plants with a mutation in a GmDcl4 gene (either GmDcl4a or GmDcl4b) were also created using ZFNs (Curtin et al., 2011). Haun et al. (2014) reported the first TALEN-mediated GE events with two target sites edited simultaneously. The first successful CRISPR GE in soybean was reported by Jacobs et al. (2015). Initially, most work with CRISPR/Cas9 focused on establishing the GE system and evaluating its targeting efficiency in hairy roots (Cai et al., 2015; Jacobs et al., 2015; Michno et al., 2015; Sun et al., 2015), and the capacity of CRISPR for multiplexing by targeting pairs of genes simultaneously was also confirmed (Table 2). Meanwhile, target gene knockout (Jacobs et al., 2015) as well as homology-directed recombination (HDR) in whole plants was achieved. Since then, CRISPR has been the major method for soybean genome editing (Table 2). Cpf1 was used in soybean by Kim et al. (2017), who successfully created mutations in FAD2 paralogues using a CRISPR/Cpf1 RNP system. Their work showed that edited soybean events can be recovered without integration of plasmid-derived reagent DNA, suggesting a future direction for GE application in soybean. As with the trend of GE platforms in other crops, ZFNs and TALENs have had very limited use in soybean, whereas the CRISPR system is the most popular tool and has been used extensively in soybean for functional genomics and trait improvement (Table 2). A general procedure for recovering GE events in soybean is illustrated in Figure 2. The key steps and factors affecting its success are as follows.

Selection of a Target Trait (Figures 2A,B)
The function and properties of the genes controlling the target trait should be fully understood, including sequence data, transcription data, copy number in the target materials and variations compared with the reference genome.
Soybean genome sequencing and gene discovery pave the way for GE. More than 46,000 genes have been predicted in the soybean genome based on a reference genome assembly built from DNA sequences of Williams82 (Schmutz et al., 2010). Recently, hundreds of accessions of G. max and allied species have been sequenced to provide more reference genomes, including the recent high-quality reference assemblies of the wild soybean W05 and the popular Chinese cultivated soybean Zhonghuang 13 (ZH13) (The Genome Warehouse; Lam et al., 2010; Chung et al., 2014; Li B. et al., 2014; Li Y. H. et al., 2014; Zhou et al., 2015; Asaf et al., 2018; Shen et al., 2018; Xie et al., 2019). Moreover, hundreds of regulatory non-coding RNA loci, such as loci for microRNAs (miRNAs) and phased small interfering RNAs (phasiRNAs), have also been characterized using the soybean reference genome assemblies (Arikit et al., 2014). All of this sequence information can be evaluated using comparative genomics to identify potentially useful genes. Since soybean is a paleopolyploid whose two duplication events occurred 59 and 13 million years ago, respectively, more than 70% of its genes have been duplicated and exist as multiple copies. This makes it difficult to identify genes associated with important agronomic traits such as yield, protein, oil, and biotic and abiotic stress tolerance, which often complicates soybean breeding programs (Shoemaker et al., 2006; Yin et al., 2013; Zhu et al., 2014; Lakhssassi et al., 2017; Anguraj Vadivel et al., 2018; Chen K. et al., 2019). Therefore, trait selection for soybean genome editing depends on the discovery of the genes that control important agronomic traits. The main challenge facing researchers in soybean improvement has been the limited understanding of gene functions and their contributions to target phenotypes of agronomic importance.
Based on current knowledge, GE in soybean has focused on traits with a clear genetic background, such as GmFAD2 for oleic acid content.
FIGURE 2 | [...] (2) The pathway in blue shows a GE procedure based on the CRISPR system with DNA editing reagents and the Agrobacterium-mediated delivery method.
[...] each of them in a separate plasmid as shown in previous reports (Jiang et al., 2013; Li et al., 2013; Brooks et al., 2014). In most cases in CRISPR system for soybean, the Streptococcus pyogenes [...] GmU6 promoters in soybean and Arabidopsis and found that GmU6-4, 7, 8, 10 and 11 had high performance. Compared with the AtU6-26 promoter, the soybean GmU6-16-1 promoter was more efficient in simultaneous editing of multiple homoeoalleles (Du et al., 2016). sgRNA module vectors for soybean are usually based on common transformation vectors for either Agrobacterium-mediated or biolistic delivery. For multiplexing, more than two sgRNA expression cassettes can be assembled in each of these vectors (Du et al., 2016; Do et al., 2019). Often, a number of engineered editing reagents for ZFN, TALEN, or CRISPR/Cas systems show no editing activity and create no mutation events in vivo for unknown reasons (Li R. et al., 2019). Therefore, validating sgRNA expression and nuclease activity of specific editing reagents in a transient assay, by identifying mutations in vivo before transformation, can save time and resources and increase the success rate of creating genome-edited plants (Do et al., 2019; Bai et al., 2020). This step is especially important for complicated GE platforms such as ZFNs and TALENs. Transient expression systems such as hairy root induction via Agrobacterium rhizogenes K599, callus tissue expression via biolistic delivery, agro-infiltration of leaves and protoplast transfection have been employed for this purpose (Figure 2). The assay for editing activity in hairy roots is a popular transient system for soybean (Curtin et al., 2011; Haun et al., 2014; Jacobs et al., 2015; Du et al., 2016; Cheng et al., 2019; Do et al., 2019; Bai et al., 2020). A. rhizogenes K599 containing the editing reagents is delivered into seedlings of the target genotypes, and hairy roots can be recovered within 15 days.
The editing activity of the reagents can be evaluated in those hairy roots. For protoplast transfection, polyethylene glycol (PEG) is used to deliver editing reagents into soybean protoplasts, and targeted mutation events can then be detected in genomic DNA extracted from the transfected protoplasts after 48 h of incubation in darkness at room temperature (Demorest et al., 2016). Callus is one of the best target tissues for biolistic delivery, and the transformed callus cells can be harvested within 5 weeks for evaluating GE, including HDR events (Bonawitz et al., 2019).

System for Recovering Target Gene Edited Whole Plants (Figures 2H-K)
The success of GE in soybean is highly dependent on the availability of an efficient regeneration and transformation system. As in other crops, biolistic and Agrobacterium-mediated transformation methods have mainly been used to recover GE events in soybean (Table 2). Agrobacterium-mediated soybean transformation combined with organogenesis-based regeneration was developed for soybean transgenesis, and the transformation efficiency (TE) of several protocols has been improved (Yamada et al., 2012; Hada et al., 2018; Yang et al., 2018), but it is still low (∼10%) compared with the high efficiency in rice (∼40%) (Mohammed et al., 2019). Cotyledons of mature soybean seeds are usually used as explants for regeneration of transformed cells. However, genotype dependency is still the major bottleneck among these protocols. For SDN1 mutations in soybean, Agrobacterium-mediated transformation is currently an efficient and popular method (Table 2) because mature seeds are convenient to use as explants.
Multiplex GE, which targets multiple genes or genomic sites simultaneously in a single transformation event, has been achieved in soybean either by making one construct that expresses multiple sgRNAs with a single Cas9 (Do et al., 2019) or by infecting the target tissue with pooled Agrobacterium strains, each containing one sgRNA (Kanazashi et al., 2018; Cheng et al., 2019; Do et al., 2019; Wang J. et al., 2020; Bai et al., 2020). Biolistic delivery combined with embryogenesis-based regeneration is an alternative method for recovering GE events with various outcomes. In soybean, all SDN2 and SDN3 events reported so far were created using this method (Table 2). The ability to deliver multiple editing reagents, such as pooled sgRNA constructs and RNPs (RNA and protein complexes), together with large quantities of donor DNA fragments may have facilitated the achievement of SDN2 and SDN3. Biolistic transformation in soybean is highly genotype dependent (Homrich et al., 2012) since regeneration relies on embryogenic suspensions initiated from immature embryos of the target genotype, such as Jack (Campbell et al., 2019) and 3B86. Although embryonic axes can be used for biolistic transformation, the efficiency is very low (Rech et al., 2008). A. rhizogenes-mediated transformation has occasionally been used to recover GE whole plants (Curtin et al., 2011; Haun et al., 2014), and a specific regeneration system needs to be established for this method.

Screening of Mutation Events Created by GE (Figures 2L-M)
Generally, genomic DNA is extracted from transgenic soybean plants or from plants regenerated from explants into which GE reagents were delivered. PCR primers are designed to amplify an amplicon containing the target sequence. The PCR/RE assay, the T7EI (T7 endonuclease I) assay, and sequencing are commonly used to identify GE events (Shan et al., 2014). In the PCR/RE assay, the PCR product contains a restriction enzyme (RE) cut site within the target sequence region and can no longer be digested by the RE if target editing has succeeded. The PCR/RE assay has been used frequently in soybean (Michno et al., 2015; Sun et al., 2015; Kanazashi et al., 2018; Di et al., 2019). The T7EI assay has been used occasionally (Cai et al., 2015; Du et al., 2016), and PCR followed by sequencing is now the most popular method for detecting mutations in soybean (Haun et al., 2014; Campbell et al., 2019; Cheng et al., 2019; Do et al., 2019; Han et al., 2019; Li R. et al., 2019; Wang J. et al., 2020). PCR/RE and T7EI are cost-efficient methods when a large number of putative mutations need to be screened, but the limited availability of restriction enzyme cut sites can restrict sgRNA design. The sequencing-based method has therefore become popular because it can detect edits at any target site in the genome.
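The constraint that the PCR/RE assay places on sgRNA design can be illustrated with a short check. The following is only a sketch under stated assumptions: it assumes SpCas9 with an NGG PAM immediately 3' of the protospacer on the given strand and a blunt cut 3 bp upstream of the PAM, and the function name, sequences, and the use of the BamHI site GGATCC are hypothetical illustrations rather than anything taken from the cited studies.

```python
def re_site_spans_cut(amplicon: str, protospacer: str, re_site: str) -> bool:
    """Return True if an occurrence of re_site overlaps the predicted Cas9 cut.

    Assumptions (illustrative only): SpCas9, NGG PAM directly 3' of the
    protospacer on the strand given, cut 3 bp upstream of the PAM.
    """
    amplicon = amplicon.upper()
    protospacer = protospacer.upper()
    re_site = re_site.upper()

    start = amplicon.find(protospacer)
    if start == -1:
        raise ValueError("protospacer not found in amplicon")

    pam_start = start + len(protospacer)
    pam = amplicon[pam_start:pam_start + 3]
    if len(pam) < 3 or pam[1:] != "GG":
        raise ValueError("no NGG PAM directly after the protospacer")

    # The cut falls between amplicon[cut - 1] and amplicon[cut].
    cut = pam_start - 3

    pos = amplicon.find(re_site)
    while pos != -1:
        # The site occupies [pos, pos + len(re_site)); it spans the cut
        # only if bases on both sides of the cut belong to the site.
        if pos < cut < pos + len(re_site):
            return True
        pos = amplicon.find(re_site, pos + 1)
    return False


# Hypothetical target whose protospacer ends in a BamHI site (GGATCC).
guide = "ACTGACTGACTGACGGATCC"
amplicon = "TTTT" + guide + "TGG" + "AAAA"
print(re_site_spans_cut(amplicon, guide, "GGATCC"))
```

When the function returns False for every available enzyme, the PCR/RE assay is not informative for that sgRNA and a T7EI or sequencing-based screen would be needed instead.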
The CRISPR system induces mutations efficiently in both soybean hairy roots and regenerated plants. The editing frequency can reach 61.41% in hairy roots and 72.20% in the T0 generation (Cai et al., 2018). However, Bao et al. (2019) reported low efficiency in T0 plants, which may be related to the sequence composition at the target site, sgRNA design, Cas9 codon optimization, and construction of the GE reagents. Various types of mutations can be obtained in soybean, including biallelic, homozygous, heterozygous, and chimeric mutations. Individuals that are homozygous or biallelic, with all copies of the allele mutated, will display a mutant phenotype, whereas heterozygous individuals with at least one copy of the wild-type allele will not. GE events in most soybean T0 plants can be transmitted to the next generation, but some may be lost in T2 plants (Do et al., 2019). To date, a couple of CRISPR-based protocols for soybean genome editing have been published, including one for recovery of GE whole plants and one for GE hairy roots (Alok et al., 2018).
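The mutation types listed above can be made concrete with a small classifier. This is a minimal sketch, assuming a single diploid target locus and that the distinct allele sequences present in one plant have already been recovered by sequencing; the function name, the category labels, and the example sequences are hypothetical.

```python
from typing import Iterable


def classify_edit(wild_type: str, alleles: Iterable[str]) -> str:
    """Classify one plant from the distinct alleles seen at one diploid locus.

    Assumption (illustrative only): more than two distinct alleles in a
    single plant is taken as evidence of chimerism.
    """
    distinct = set(alleles)
    if not distinct:
        raise ValueError("at least one allele sequence is required")
    if len(distinct) > 2:
        return "chimeric"

    mutant = distinct - {wild_type}
    if not mutant:
        return "wild type"
    if wild_type in distinct:
        return "heterozygous"   # one wild-type copy remains
    if len(mutant) == 1:
        return "homozygous"     # both copies carry the same mutation
    return "biallelic"          # both copies mutated, but differently


wt = "ACGTTGCA"
print(classify_edit(wt, ["ACGTGCA", "ACGTGCA"]))    # same 1-bp deletion twice
print(classify_edit(wt, ["ACGTGCA", "ACGTTTGCA"]))  # deletion plus insertion
print(classify_edit(wt, [wt, "ACGTGCA"]))           # one wild-type copy
```

Only the homozygous and biallelic classes leave no wild-type copy, which is why they are the ones expected to show a mutant phenotype.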

Achievement of Various Editing Outcomes
Most editing outcomes, including large fragment deletion, multiplexing, base editing, HDR editing, HDR insertion, and knockout of any given endogenous gene or genomic site, have been achieved in soybean (Table 2). Cai et al. (2018) designed dual CRISPR sgRNAs and successfully deleted targeted DNA fragments in both the GmFT2a and GmFT5a genes. Fragments of 599 to 1,618 bp in GmFT2a were deleted at a frequency of 15.6%, and fragments of 1,069 to 1,161 bp in GmFT5a were deleted at a frequency of 15.8%. Furthermore, a target fragment larger than 4.5 kb in GmFT2a was deleted at a frequency of 12.1%. Multiplex editing across various numbers of target sites has been achieved (Cai et al., 2015; Jacobs et al., 2015; Kanazashi et al., 2018; Bai et al., 2020). For example, Bao et al. (2019) assembled four sgRNAs driven by the AtU3 or AtU6 promoter in one binary CRISPR/Cas9 plasmid and simultaneously targeted multiple sites in four genes of the SPL9 (Squamosa Promoter Binding Protein-Like, SPL) transcription factor family in soybean; plants carrying various combinations of mutations, including homozygous quadruple mutants in the T4 generation, were recovered using Agrobacterium-mediated transformation. Base editing at target sequences in the first exon of the soybean flowering control gene GmFT2a and the fourth exon of GmFT4 was achieved using a BE base editor combining the Cas9n (D10A) nickase, rat cytosine deaminase (APOBEC1), and uracil glycosylase inhibitor (UGI). Both C-T and C-G base substitutions were obtained, but only the side-effect C-G substitution in GmFT2a changed a proline to an alanine in the mutant protein, resulting in an altered flowering phenotype. Homology-directed repair (HDR) has been achieved for both precise gene editing and site-specific knock-in using biolistic delivery.
A directed P178S mutation of the acetolactate synthase1 (ALS1) gene in soybean was made through HDR by co-transforming soybean with a donor DNA template, consisting of a 1,084-bp ALS1 sequence fragment containing five nucleotide changes (AG-T-C-T), and a construct containing the ALS1-CR1 gRNA and Cas9, followed by chlorsulfuron selection. Precise homology-directed gene insertion by Cas9-gRNA was also achieved by co-transforming soybean with a donor DNA construct carrying a hygromycin phosphotransferase (hpt) gene, driven by a soybean S-adenosyl methionine synthetase (SAMS) gene promoter to confer hygromycin resistance, and a Cas9-gRNA targeting the soybean genomic site DD43 on chromosome 4. The HDR events were transmitted to the next generation. Bonawitz et al. (2019) reported integration of biolistically delivered DNA at a target site in the GmFAD2-1a (Fatty Acid Desaturase 2-1a) gene and demonstrated targeted integration of multiple transgenes into a single locus in soybean via either HDR or NHEJ using a ZFN. The hygromycin resistance gene hpt and its regulatory elements were inserted into the target site through HDR, and an accurate NHEJ-mediated insertion was achieved with a 16.2-kb donor containing four transgenes: hpt, dgt-28 (glyphosate tolerance marker), aad-1 (2,4-D tolerance marker), and dsm-2 (glufosinate tolerance marker). These integrations in T0 plants were successfully transmitted to the T1 generation. DNA-free GE with Cpf1 RNP has also been demonstrated in soybean protoplasts (Kim et al., 2017).
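The base editing result described in this section, where a C-G substitution rather than the intended C-T substitution converted a proline to an alanine, can be checked against the standard genetic code. The sketch below enumerates single-base substitutions within one codon on the coding strand. The codon CCT is only an illustrative proline codon; the actual codon edited in GmFT2a is not given here, and the function name is hypothetical.

```python
# Standard genetic code built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def substitution_outcomes(codon: str, ref: str, alt: str):
    """List (1-based position, new codon, amino acid) for every single
    ref -> alt substitution within the codon (coding strand)."""
    codon = codon.upper()
    outcomes = []
    for idx, base in enumerate(codon):
        if base == ref:
            edited = codon[:idx] + alt + codon[idx + 1:]
            outcomes.append((idx + 1, edited, CODON_TABLE[edited]))
    return outcomes


# CCT is an assumed, illustrative proline codon.
print(CODON_TABLE["CCT"])
print(substitution_outcomes("CCT", "C", "T"))  # intended C-to-T edits
print(substitution_outcomes("CCT", "C", "G"))  # side-effect C-to-G edits
```

For any proline codon (CCN), a C-to-G change at the first position gives an alanine codon (GCN), while a C-to-T change at the same position gives a serine codon (TCN), which is also the kind of change underlying a proline-to-serine substitution such as P178S.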

Editing Efficiency Improvement by Using Appropriate Promoters
The promoter driving the sgRNA is one of the factors affecting GE efficiency in the CRISPR system. One major advance is the discovery of soybean U6 promoters for driving sgRNA expression. Although U6 promoters drive highly efficient transcription, it is difficult to use the same U6 promoter across distantly related species because endogenous sequences are less susceptible than transgene sequences to silencing-associated DNA methylation in plants (Wang et al., 2008). Varying transcription activities were observed when the same U6 promoter was used in divergent species (Shan et al., 2013). This effect was also confirmed in soybean, where two types of vectors using either the GmU6-10 or the AtU6-26 promoter were constructed to target several soybean genes. Significantly different mutation efficiencies were observed: 3.2-9.7% with the AtU6 vector and 14.7-20.2% with the GmU6-10 vector. Even different U6 promoters from the same species show varying activities (Domitrovich and Kunkel, 2003). Soybean U6 promoters with high editing efficiency (GmU6-8 and GmU6-10) were selected from 11 candidate promoters tested in hairy roots (Di et al., 2019). Du et al. (2016) compared the targeting efficiency of TALENs and CRISPR for knocking out both GmPDS11 and GmPDS18 in hairy roots. With CRISPR/Cas9, the single targeting efficiency obtained with the AtU6-26 promoter was similar to that achieved by TALENs, and the efficiency was doubled by using GmU6-16g-1. Meanwhile, using the AtU6-26 and GmU6-16g-1 promoters in CRISPR/Cas9 achieved targeting efficiencies 2 times and 8 times higher, respectively, than that of TALENs, indicating that high GE efficiency can be achieved with the CRISPR system if an appropriate promoter is used to drive the sgRNA.
Using the DD45 (egg cell and early embryo), Yao (shoot apical and root meristem-active), tomato Lat52 (pollen), and EC (egg cell and embryo) promoters to drive Cas9 can reduce the frequency of somatic mutations and increase the rate of heritable edits in the T2 generation (Yan et al., 2015; Mao et al., 2016). Zheng et al. (2020) used both Arabidopsis and soybean egg cell-specific promoters to create knockout mutations of the GmAGO7a gene in soybean. Mutations were successfully obtained in the T2 generation using the AtEC1.2e1.1p promoter, but no mutants were recovered using soybean egg cell promoters.

Functional Genomics Study
The functions of many soybean genes have been evaluated using GE (Table 2). For example, mutations in genes involved in small RNA processing were created using both CRISPR and TALENs to evaluate the role of small RNA processing in stress tolerance in soybean. CRISPR/Cas9 was employed to generate a biallelic double mutant of the two paralogous Double-stranded RNA-binding2 genes (GmDRB2a and GmDRB2b), a heterozygous mutant of the Dicer-like3 gene (GmDCL3a), and homoeologous mutations at the soybean Hen1 loci (GmHen1a and GmHen1b), while TALENs were used to induce a mutant of the Dicer-like gene GmDCL2b. Some of the mutations in T0 plants were transmitted to the T1 generation (Curtin et al., 2018). Li C. et al. (2019) reported successful targeting of three genes encoding members of the two major storage protein families, conglycinins (7S) and glycinins (11S), which account for about 70% of total soybean seed protein, and detected DNA mutations at frequencies ranging from 3.8 to 43.7% in the three storage protein genes in soybean hairy roots. Li R. et al. (2019) used a pooled CRISPR/Cas9 technique to create single and double mutants of two plastidial phosphoglycerate kinase genes, PGKp1 and PGKp2. Normal performance of the single mutants and the lethal phenotype of the double mutant confirmed that the PGKs play redundant roles in carbon fixation and metabolism. The paralogous sugar transporter genes GmSWEET15a and GmSWEET15b from the SWEET (Sugars Will Eventually be Exported Transporter) family in soybean were targeted using CRISPR/Cas9, and the knockout mutants showed abnormal embryo growth and persistent endosperm, leading to seed abortion. Multiplex mutagenesis populations for gene function evaluation were also created in soybean using a pooled CRISPR-Cas9 platform. Soybean was transformed with pooled Agrobacterium strains, each containing one of 70 vectors with gRNAs targeting 102 candidate genes (4 to 5 strains per batch), and all targeted mutations were achieved.

Modification of Agronomic Traits
Genome editing technology has been applied to edit various genes controlling soybean agronomic traits (Table 2). The nutritional quality of soybean oil was improved by knocking out the fatty acid desaturase genes GmFAD2-1A and GmFAD2-1B using TALENs. The fatty acid composition changed significantly in Bert seeds from plants homozygous for the fad2-1a1b double mutation: oleic acid increased 4-fold to 78%, and linoleic acid decreased from the original 50% to less than 4% (Haun et al., 2014). A third gene, fatty acid desaturase 3A (FAD3A), was mutated in the fad2-1a1b double mutant created by Haun et al. (2014) using TALENs. In the seed oil of plants with the triple fad2-1a1b3a homologous mutations, linolenic acid and linoleic acid were further reduced by nearly half (to 2.5 and 2.7%, respectively) and oleic acid increased significantly (to 82.2%) compared with the fad2-1a1b double mutants (Demorest et al., 2016). Similar work in the variety Maverick using CRISPR resulted in a dramatic increase in oleic acid content to over 80% and a decrease in linoleic acid to 1.3-1.7% (Do et al., 2019). Double knockout of GmFAD2-1A (Glyma.10G278000) and GmFAD2-2B (Glyma.19G147300) in the variety JN38 increased the oleic acid content of seeds from 19.15% to 72.02% and decreased linoleic acid from 56.58% to 17.27% in the T3 generation (Chen et al., 2011). Moreover, the percentage of protein in the seeds increased from 37.52% to 40.58%. Because beany flavor restricts human consumption of soybean, seed lipoxygenase-free soybean was created by mutating three lipoxygenase genes (LOX1, LOX2, and LOX3) using CRISPR/Cas9. Soybean is a short-day (SD) plant that tends to flower when the day length falls below a certain threshold. Photoperiod therefore regulates when soybean initiates flowering and how it adapts to different environmental conditions. FLOWERING LOCUS T (FT) encodes florigen, which induces floral initiation at the shoot apex (Kardailsky et al., 1999).
FT also integrates signals from flowering pathways to control flowering time (Corbesier and Coupland, 2006; Turck et al., 2008). Soybean FT homologs, including GmFT2a and GmFT5a, have been identified, and their basic functions, especially their photoperiod-responsive effects, have been evaluated (Kong et al., 2010). These roles of the FTs in soybean need to be further confirmed using reverse genetics. Cai et al. (2018) evaluated the function of GmFT2a by knocking out this gene using CRISPR/Cas9 and found that homozygous GmFT2a mutants showed delayed flowering under any photoperiod condition. Knockout of GmFT2b also delays flowering under long-day (LD) conditions. Moreover, GmFT2a and GmFT5a were found to control flowering time collectively when the single mutants ft2a and ft5a and the double mutant ft2aft5a were assessed together with transgenic plants overexpressing GmFT2a or GmFT5a under photoperiod conditions including SD and LD. GmFT2a plays a more important role in flowering than GmFT5a under SD conditions, whereas GmFT5a plays a more important role than GmFT2a under LD conditions. Unlike GmFT2a, GmFT5a contributes to soybean adaptation to high latitudes. When grown under SD conditions, the ft2aft5a double mutants showed a flowering delay of 31.3 days, leading to significant increases in the numbers of pods and seeds per plant (Cai et al., 2020b). The circadian clock-related genes LATE ELONGATED HYPOCOTYL (LHY) and CIRCADIAN CLOCK ASSOCIATED 1 (CCA1) regulate flowering under different day-length conditions (Wang and Tobin, 1998). Four LHY-CCA1-LIKE orthologs in soybean, GmLCLa1, GmLCLa2, GmLCLb1, and GmLCLb2, were identified, and CRISPR/Cas9 was used to knock out all four simultaneously to investigate their circadian rhythm-related functions. The quadruple mutant GmLCLa1a2b1b2 showed delayed flowering and very short-period circadian rhythms.
Early-flowering mutants under natural long-day (NLD) conditions were also created using the CRISPR system by knocking out the E1 gene (Han et al., 2019) and GmPRR37, which encodes a pseudo-response regulator protein related to photoperiod sensitivity (Wang L. et al., 2020). These flowering time-related mutations are consistently inherited in the next generation (Cai et al., 2018; Han et al., 2019; Cai et al., 2020a; Wang J. et al., 2020) and will be a useful resource for developing elite soybean varieties. Plant architecture can also be modified to improve yield. Cheng et al. (2019) mutated four Late Elongated Hypocotyl (LHY) genes in soybean and obtained a homozygous quadruple GmLHY mutant, similar to the GmLCLa1a2b1b2 mutant described above, which showed reduced plant height and shortened internodes. As a class A gene in the ABCE model in plants, APETALA1 (AP1) is involved in floral organ development. All four soybean AP1 homologs have been targeted using CRISPR/Cas9, and the homologous quadruple knockout events showed delayed flowering under SD conditions and increased plant height, with more nodes and longer internodes, indicating a potential yield increase for these events. Bao et al. (2019)  and GmRIC2 that encode two nodule-enhanced CLAVATA3/Embryo Surrounding Region-Related (CLE) peptides using CRISPR. Two different types of double homozygous gmric1/gmric2 mutant plants showed a significant increase in nodule number. Mutants of soybean Root Determined Nodulation1 (GmRDN1) were created as well. Down-regulation of all three target genes in the triple mutant gmrdn1-1/1-2/1-3 plants confirmed that GmRDN1 negatively regulates nodule number in the roots. CRISPR has also been used to create herbicide-resistant soybean. An HDR-directed P178S mutation of the acetolactate synthase1 gene, which confers resistance to chlorsulfuron, was created using the CRISPR system.
Calyxt has performed field trials with GE soybean in Argentina since 2015 and launched its first commercial GE soybean variety in 2018. This is the first GE soybean product in the world 4 .

CHALLENGES AND PROSPECTS FOR GE AND RELATED PRODUCT DEVELOPMENT IN SOYBEAN
Transgenic technology, widely used over the last 4 decades, has introduced foreign genes into crops including soybean for desired traits and has indeed provided an alternative way to expand genetic resources. However, the random integration of transgenes in the genome has raised public concerns and prompted strict government regulation, which have dramatically increased the cost and time required to develop a new variety. GE technology gives crop breeders a very efficient tool to introduce a desired trait into an elite background in a precise and predictable manner, rather than going through the multiple rounds of backcrossing needed to transfer a natural mutation in a typical conventional breeding process. The mutations created by GE are indistinguishable from those introduced by traditional mutagenesis breeding, which also avoids the issues related to transgenic technology. Although various GE technology platforms have been used extensively in soybean and many editing outcomes can be created, as summarized above, some difficulties remain. As in other crops (Scheben and Edwards, 2018), the biggest bottleneck for GE application in soybean is the shortage of candidate target genes, which results from insufficient fundamental research in soybean, as stated above. Other bottlenecks are technical: there is no guarantee of precisely mutating any given target site, the ways to deliver genome-editing reagents into soybean cells are limited, the efficiency of selecting desired events and regenerating intact plants with the targeted mutation is low, and off-site targeting can occur. Many attempts have been made to minimize these limitations and improve the efficiency of recovering GE events by using newly developed GE technologies and soybean regeneration systems. There are also additional concerns for GE product development, such as transgenes in GE events, intellectual property restrictions, and government regulation of GE. These issues need to be resolved before GE can play an important role in soybean improvement.
This is a common GE issue for all plant crops. Most genome editing research using currently available GE technologies still focuses on gene knockout, that is, generating a null mutation (SDN1). In most cases, multiplex editing is desired to overcome gene duplication in soybean, which results from its paleopolyploid nature, whereas single-gene editing is used to resolve functional redundancy and the unique role of each gene paralog. Loss of gene function can be readily identified from phenotypes or by molecular tools such as PCR. Because the HDR mechanism is poorly understood and mature methods are lacking, only a couple of studies have reported achieving SDN2 and SDN3 through CRISPR and ZFNs (Bonawitz et al., 2019). New technologies such as base editing have the potential to achieve the same outcomes as SDN2. However, base editing has not been fully adapted to soybean, despite one recently reported success with a BE base editor. The limited success of base editing in soybean highlights the need to develop this technology further. PAM site dependence and the editing window may be the key factors. Possible solutions are to use new types of Cas proteins, to engineer Cas variants with altered PAM requirements, and to modify the linker between the deaminase and nCas9 (Mishra et al., 2020). Recently, another powerful GE technology, prime editing (PE), has been developed (Anzalone et al., 2019). In theory, it can make edits at any target site in a target gene sequence. To date, PE has been used successfully in rice and wheat (Lin et al., 2020; Tang et al., 2020; Xu et al., 2020). If the system can be made fully operational in soybean, all types of editing outcomes could be readily achieved.

System to Recover GE Whole Plants From Any Type of Explant of Any Genotype
Among editing reagent delivery methods, the Agrobacterium-mediated method and the biolistic method are commonly used for GE in soybean with all available GE platforms (Table 2). These systems depend on tissue culture procedures based on either organogenesis, in which multiple shoots are regenerated from embryonic cotyledons of mature seeds, or embryogenesis, in which shoots are regenerated from embryogenic callus derived from immature cotyledons (Yamada et al., 2012). Genotype dependence has been a well-known issue for Agrobacterium-mediated transformation since the method was first implemented in soybean. It is also a major issue for the biolistic method, and the specific explant requirement for embryogenic tissue prevents GE from being applied to any soybean genotype. Dependence on the target genotype, explant specificity, and dependence on the intended GE outcome are the main limitations of current soybean transformation systems for GE application. There are several ways to overcome genotype dependence. Embryogenic booster genes such as BBM (BABY BOOM) and WUS (WUSCHEL) have been used in maize and other monocots to promote plant regeneration from various tissues (Lowe et al., 2016; Mookkan et al., 2017), and the regeneration-boosting GRF (GROWTH-REGULATING FACTOR)-GIF1 (GRF-INTERACTING FACTOR 1) complex has been used in both monocots and dicots (Debernardi et al., 2020). If similar booster genes can be found to improve regeneration in soybean, they could broaden the range of usable explant types and break through the genotype limitation. Another way is to develop an in planta transformation method that does not depend on genotype, such as floral dip transformation in A. thaliana. In planta transformation methods have been developed in soybean (Mangena, 2019), but they are not ready for GE because of their currently low efficiency.
Improving the Agrobacterium-mediated delivery method and its transformation efficiency for recovery of various editing outcomes, and developing other delivery and regeneration systems, such as protoplast transfection with regeneration or in planta transformation, will expand GE application in soybean.

Off-Site Targeting
Off-site targeting refers to the introduction of unintended mutations at off-target sites during the genome editing process (Hahn and Nekrasov, 2019). In plants, this issue is not considered as important as in mammals because abnormal off-target mutations can normally be identified and discarded through segregation among the offspring of backcrosses. Nevertheless, removing off-target mutations can be time-consuming in plant breeding. For the CRISPR system, the sgRNA structure (Mali et al., 2013) and the specificity of Cas9, as in high-fidelity SpCas9 variants, can affect cleavage at on-target and off-target sites. In soybean, off-site targeting was not detected in mutants created using ZFNs and TALENs, but it has been evaluated and screened in edited events created using the CRISPR system. For example, two possible off-target sites were detected in the genome of the soybean cultivar Williams82 using the web tool CRISPR-P 5 when targeted mutation of the FAD2 genes was designed using CRISPR (Do et al., 2019). Off-site targeting in plants can be avoided by predicting and evaluating candidate sites with various web-based tools such as Cas-OFFinder (Bae et al., 2014), CROP-IT (Singh et al., 2015), CRISPOR (Haeussler et al., 2016), and other software tools (Hahn and Nekrasov, 2019). The effect can be reduced by improving the specificity of the CRISPR system using high-fidelity SpCas9 variants (Zhong et al., 2019), nCas9 (nickase) with two sgRNAs (Shen et al., 2014; Zhang et al., 2015), delivery of purified Cas9 ribonucleoproteins (RNPs) into cells (Kim et al., 2017; Andersson et al., 2018), and modified sgRNAs (Young et al., 2019). For soybean, a system has been established that allows the scientific community to share genome-wide databases and identify off-target sites (Zou et al., 2020).
In this system, the specificity score and the number of off-target sites can be calculated and evaluated for each CRISPR/Cas9 target site, which will help minimize off-site targeting in future GE applications in soybean.
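The basic idea behind off-target prediction can be sketched as a mismatch scan. The code below is a deliberately naive illustration and not the algorithm used by Cas-OFFinder, CRISPR-P, CRISPOR, or the soybean system cited above. It assumes SpCas9 with an NGG PAM, scans both strands of a sequence, and reports every PAM-adjacent window that differs from the guide by at most a chosen number of mismatches; the function names, the threshold, and the toy sequences are hypothetical.

```python
def revcomp(seq: str) -> str:
    """Reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_candidate_sites(sequence: str, guide: str, max_mismatches: int = 3):
    """Return (strand, start, mismatches) for each window followed by NGG.

    'start' is the 0-based offset on the scanned strand, so minus-strand
    offsets refer to the reverse-complemented sequence.
    """
    guide = guide.upper()
    n = len(guide)
    hits = []
    for strand, seq in (("+", sequence.upper()), ("-", revcomp(sequence.upper()))):
        for i in range(len(seq) - n - 2):
            pam = seq[i + n:i + n + 3]
            if pam[1:] != "GG":
                continue  # SpCas9 requires an NGG PAM (assumed here)
            window = seq[i:i + n]
            mismatches = sum(a != b for a, b in zip(window, guide))
            if mismatches <= max_mismatches:
                hits.append((strand, i, mismatches))
    return hits


# Toy sequence with the on-target site and one site carrying two mismatches.
guide = "GATTACAGATTACAGATTAC"
near_match = "GATTACAGATTACAGATTGG"
toy = "AAAA" + guide + "TGG" + "CCCC" + near_match + "AGG" + "TT"
for hit in find_candidate_sites(toy, guide):
    print(hit)
```

Real tools add genome indexing, position-dependent mismatch weighting, and scoring, so a scan like this is only useful for understanding what "number of off-target sites" means for a given guide.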

Transgene in GE Events
If GE products are not covered by genetically modified organism (GMO) regulation, the cost of field testing and data collection would be greatly reduced. It could also dramatically shorten the time needed to release GE products and reduce public concerns about consuming GMO crops. Therefore, transgene-free or DNA-free GE plants are a prerequisite for product development. Generally, genome-edited plants are transgenic because GE events are normally recovered through a transformation system and the editing reagents are delivered as DNA. Using next generation sequencing (NGS) analysis, Michno et al. (2020) found that three different CRISPR/Cas9 transgenes and their respective induced mutations in segregating soybean families showed both expected and unexpected patterns of inheritance in different progeny lines at the T0 and T1 generations. However, it is possible to obtain GE events without transgene integration in subsequent generations through segregation. Recovering transgene-free events in the T2 or T3 generation through transgene segregation is the main way to obtain transgene-free GE plants in soybean (Haun et al., 2014). Some homozygous mutant soybean plants without the transgene can be readily identified in the T1 mutant population, as for GmFAD2 GE soybean (Do et al., 2019). As with GE technologies used in other crops, RNA or RNA and protein complex (RNP) editing reagents can be used to obtain DNA-free GE soybean. This can avoid the transgene and off-site targeting issues, and the biolistic delivery method established in soybean can be used for this purpose (Woo et al., 2015; Svitashev et al., 2016; Liang et al., 2017; Andersson et al., 2018). Development of new transformation methods, such as protoplast transfection or in planta transformation, including various GE methods that bypass tissue culture (Ji et al., 2020), should facilitate RNP-mediated DNA-free GE soybean products.

5 http://cbi.hzau.edu.cn/crispr/
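The segregation argument above can be turned into a rough expectation for how many selfed progeny need to be screened. This is an idealized sketch only: it assumes hemizygous, unlinked, single-copy transgene loci in the T0 plant, a non-chimeric T0 genotype, Mendelian inheritance after selfing, and no further editing in the progeny. The unexpected inheritance patterns reported by Michno et al. (2020) show that real families can deviate from these assumptions. All function names are hypothetical.

```python
from fractions import Fraction


def transgene_free_fraction(n_transgene_loci: int = 1) -> Fraction:
    """Expected fraction of selfed progeny that inherit no transgene,
    assuming each locus is hemizygous and segregates independently."""
    return Fraction(1, 4) ** n_transgene_loci


def no_wild_type_fraction(t0_genotype: str) -> Fraction:
    """Expected fraction of selfed progeny with no wild-type allele at the
    target locus, given the (non-chimeric) T0 genotype."""
    table = {
        "heterozygous": Fraction(1, 4),  # wt/m selfed -> 1/4 m/m
        "homozygous": Fraction(1),       # m/m selfed -> all m/m
        "biallelic": Fraction(1),        # m1/m2 selfed -> no wt allele
    }
    return table[t0_genotype]


def expected_desired_fraction(t0_genotype: str, n_transgene_loci: int = 1) -> Fraction:
    """Transgene-free AND fully mutated, assuming the target locus and the
    transgene loci are unlinked."""
    return no_wild_type_fraction(t0_genotype) * transgene_free_fraction(n_transgene_loci)


for genotype in ("heterozygous", "homozygous", "biallelic"):
    print(genotype, expected_desired_fraction(genotype))
```

Under these assumptions a heterozygous T0 plant with one transgene locus is expected to yield the desired class in 1 of 16 T1 plants, against 1 of 4 for a homozygous or biallelic T0 plant, which is one way to see why later generations are often needed.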

Government Regulation for GE Product and Intellectual Property Preparation
A major concern for soybean breeders is whether GE falls under the same regulatory framework as GMOs. Such regulation is normally based on socio-economic considerations rather than scientific evidence, which delays the adoption of GM crops and has a negative impact on global agricultural innovation (Biden et al., 2018). In many countries, a variety developed through precise targeted mutation, such as SDN1 created using technologies like ZFNs, TALENs, and CRISPR, does not need to go through the regulatory process used for GMOs (Friedrichs et al., 2019; Metje-Sprink et al., 2020; Schmidt et al., 2020). GE plants and their products can be cultivated and sold free from regulatory monitoring in the United States, Brazil, Argentina, Chile, Australia, and Japan. In most countries, the regulation of GE plants is based on assessment of the product, except in the EU, Brazil, India, and New Zealand, where it depends on the biotechnological process used to produce the organism. Several countries have decided to follow a product-based approach, and some countries, such as Australia and China, tend to follow in the near future (Friedrichs et al., 2019; Metje-Sprink et al., 2020; Schmidt et al., 2020). Soybean breeders should collect such information and develop new varieties aimed mainly at countries that already have little or no regulation of GE plants or that will remove GE from GMO regulations. Another concern is GE intellectual property for commercial soybean products. Selecting a highly efficient, easy-to-operate, and low-cost technology will accelerate GE product development for soybean. In contrast to other genome editing techniques such as ZFNs and TALENs, which have clear intellectual property ownership, CRISPR does not yet have clear ownership. Several institutes and companies have claimed rights to this system, and the issue currently remains unresolved (Brinegar et al., 2017).
Since its development, the number of patents related to CRISPR products has increased at an unprecedented rate compared to other editing technologies, such as Cpf1, CMS, and the recently developed prime editing technology, which have been successfully used in plants (Lin et al., 2020; Tang et al., 2020; Xu et al., 2020).

CONCLUDING REMARKS
Precise and predictable modification of desired target gene sequences in an elite background by genome editing, without changing other traits, can accelerate plant breeding. In crops with duplicated genes or genomes, such as soybean, it can avoid the tedious and complicated crossing and screening procedures of conventional breeding. GE technologies, especially CRISPR-based systems, have evolved rapidly, and most have been adopted as efficient tools for soybean improvement. The recent field trial of high-oleic soybean developed using TALENs demonstrates the bright future of soybean improvement if this technology is well implemented in plant breeding programs. At present, discovering more GE target genes related to agronomically important traits, adopting newly developed GE technologies, simplifying and renovating editing reagent delivery, and improving methods for recovering target mutants in soybean will expand editing outcomes, save time, and reduce the cost of product development. Cost-efficient preparation of intellectual property for GE technologies that work in soybean, together with an understanding of GE-related government regulation by breeders and farmers, will promote GE product development. Transgene-free or DNA-free edited plants are considered non-genetically modified in several countries, which will facilitate GE soybean production. Soybean is a commercial import and export crop with huge seed production. In the past, new technologies such as transgenesis were applied more widely and intensively to this crop than to other crops. The recent advances in genome editing in soybean can potentially make it a leader once more in the new era of crop biotechnology.