DNA Engineering and Hepatitis B Virus Replication

Recombinant DNA technology is a vital tool in human hepatitis B virus (HBV) research, used to produce reporter viruses and vectors for gene transfer. Researchers have engineered several genes into the HBV genome for different purposes; however, a systematic analysis of recombination strategies is lacking. Here, using a 500-bp deletion strategy, we scanned the HBV genome and identified two regions, region I (from nt 2,118 to 2,814) and region II (from nt 99 to 1,198), suitable for engineering. Ten exogenous genes, including the puromycin N-acetyltransferase gene (Pac), blasticidin S deaminase gene (BSD), neomycin-resistance gene (Neo), Gaussia luciferase (Gluc), NanoLuc (Nluc), copGFP, mCherry, UnaG, eGFP, and tTA1, were inserted into these two regions and fused into the open reading frames of hepatitis B core protein (HBC) and hepatitis B surface protein (HBS) via a T2A peptide. Recombination of 9 of the 10 genes at region 99–1198 and 5 of the 10 genes at region 2118–2814 supported the formation of relaxed circular (RC) DNA. HBV DNA and HBV RNA assays implied that exogenous genes potentially abrogate RC DNA formation by inducing adverse secondary structures. This hypothesis was supported by the finding that sequence optimization of the UnaG gene based on the HBC sequence rescued RC DNA formation. The findings of this study provide an informative basis and a valuable method for constructing and optimizing recombinant HBV and imply that the DNA sequence itself might be a source of selective pressure in the evolution of HBV.


INTRODUCTION
Infection with human hepatitis B virus (HBV) remains a public health problem around the world. More than 250 million people across the globe are estimated to be chronically infected with HBV (WHO, 2017; Polaris Observatory Collaborators, 2018; Schmit et al., 2020). Evidence shows that chronic HBV infection is a high-risk factor for the development of hepatocirrhosis and hepatocellular carcinoma.
HBV is a prototypical member of the Hepadnaviridae family, characterized by a relatively compact relaxed circular (RC) DNA genome and a special genomic replication mechanism via reverse transcription of a redundant RNA intermediate (Summers and Mason, 1982). The RC DNA genome of HBV is repaired after infection, converted into covalently closed circular DNA (cccDNA) in the nucleus of the hepatocyte, and transcribed to pregenomic mRNA (pgRNA), precore mRNA, preS1 mRNA, S mRNA, and X mRNA. PgRNA functions as a bicistronic mRNA directing the synthesis of hepatitis B core protein (HBC) and polymerase (Pol) and is the template for reverse transcription. HBV replication begins with the encapsidation of pgRNA, whereby Pol binds to the stem-loop structure, epsilon (ε), at the 5′ end of pgRNA, triggering the assembly of HBC and packaging of the ribonucleoprotein complex into an icosahedral nucleocapsid (Bartenschlager et al., 1990; Bartenschlager and Schaller, 1992; Pollack and Ganem, 1993; Shin et al., 2002). The host chaperone proteins mediate encapsidation, and the nucleocapsid provides a microenvironment for reverse transcription. Briefly, the minus-strand primer, estimated at 3 or 4 nucleotides (nt) long, is synthesized in a protein-primed process with the bulge of epsilon as a template. The minus-strand primer links to the tyrosine residue at the 63rd amino acid of Pol (Pol 63Y) via a covalent bond (Figure 1A; Nassal and Rieger, 1996; Lanford et al., 1997) and translocates to the complementary region, DR1, at the 3′ end of pgRNA (Figure 1B; Rieger and Nassal, 1996), where the synthesis of minus-strand DNA resumes. Subsequently, pgRNA is reverse transcribed into minus-strand DNA and simultaneously degraded by the RNase H domain of Pol, leaving the 11-16 nt at the 5′ terminus of pgRNA undigested (Figure 1C; Summers and Mason, 1982). These oligonucleotides function as primers for plus-strand synthesis.
A fraction of plus-strand primers retained in situ extends to the 5′ end of the minus-strand DNA and forms duplex linear DNAs (DL DNAs) (Figure 1D). Most plus-strand primers translocate to DR2 and extend to the 5′ end of the minus-strand, copying a 10-nt redundant sequence named 5′r (Figure 1E; Haines and Loeb, 2007). Subsequently, the nascent 3′ end of the plus-strand pairs with the 3′-proximal redundant region (3′r) of the minus-strand, continuing elongation of the plus-strand to form RC DNA (Figures 1F,G).
Against this background, we systematically analyzed strategies for HBV engineering. First, scanning the genome with 500-bp deletions from position 1919 to position 1198 revealed two regions, nt 2,118-2,814 and nt 99-1,198, that tolerate deletion well. Second, 10 exogenous genes, including the puromycin N-acetyltransferase gene (Pac) (Lacalle et al., 1989), blasticidin S deaminase gene (BSD) (Kimura et al., 1994), neomycin-resistance gene (Neo) (Southern and Berg, 1982), Gaussia luciferase (Gluc) (Verhaegent and Christopoulos, 2002), NanoLuc luciferase (Nluc) (England et al., 2016), copGFP (Shagin et al., 2004), mCherry (Shaner et al., 2004), UnaG (Kumagai et al., 2013), eGFP (Cormack et al., 1996), and tTA1 (Baron et al., 1997), were inserted at position 2120 and position 155, respectively, via the Thosea asigna virus 2A peptide (T2A) (Wang et al., 2015). Insertion of a majority of these genes supported RC DNA formation. We also systematically analyzed HBV RNA transcribed from the different constructs to provide insight into how engineering affects the formation of RC DNA. The results showed that pgRNA splicing is common among the deletion variants and recombinants. However, splicing can hardly explain the failure of some recombinants to form RC DNA. We therefore hypothesized that secondary structures induced by some of these exogenous genes might be associated with the loss of RC DNA. This hypothesis was supported by the finding that sequence optimization of the UnaG gene based on the HBC sequence, which is predicted to relieve the effect of nonoptimized UnaG on the structure of minus-strand DNA, rescued RC DNA formation.

Constructs
HBV DNA was derived from HBV subtype ayw (GenBank accession number V01460) and numbered according to the only EcoRI site of the genome. The "C" of the EcoRI site (GAATTC) was designated as position 1. Pch9/3091, constructed by Nassal et al., transcribes pgRNA under the control of the cytomegalovirus immediate-early promoter (pCMV) (Nassal, 1992). All HBV variants were constructed based on Pch9/3091 using Golden Gate Assembly with a few modifications (Engler et al., 2009). Briefly, the Golden Gate Assembly system comprised the following components: 1 × NEB buffer 3.1 (NEB), 5 mM DTT, 1 mM ATP, 0.5 units/µL of BsmBI or BsmBI-v2 (NEB), 150 units/µL of T7 DNA Ligase (NEB), and 3-5 ng/µL of each DNA fragment. The reaction was performed with 60 cycles of 37 °C for 5 min and 16 °C for 5 min, followed by a 5-min incubation at 60 °C. Plasmid Pch9-G2016T contains a G2016T mutation, which terminates HBC translation at the 40th amino acid (HBC 40E). Pch9-ε has a deletion between nt 1,858 and 1,863 and two substitutions (G1877T and T1878A). PgRNA transcribed from this construct would not be encapsidated but would be translated to both HBC and Pol.
FIGURE 1 | Synthesis of HBV DNA. (A) Pol (P) binds to 5′-proximal epsilon, forming a ribonucleoprotein complex with pgRNA, which is encapsidated into the nucleocapsid formed by HBC. In the nucleocapsid, using the bulge of epsilon as a template, Pol synthesizes the minus-strand primer. (B) The nascent minus-strand primer translocates to the 3′-proximal DR1 of pgRNA via base pairing. (C) pgRNA is degraded by the RNase H domain of Pol during reverse transcription, leaving an 11-16-nt-long oligonucleotide, which serves as the primer for plus-strand synthesis. (D) A few plus-strand primers are retained in situ and extend to the 5′ end of minus-strand DNA, forming duplex linear DNAs (DL DNAs). (E) Most plus-strand primers translocate to DR2. Here, the synthesis of the plus-strand resumes and extends to the 5′ end of the minus-strand. (F) The nascent plus-strand pairs with the 3′r of the minus-strand to complete circularization. (G) The plus-strand is synthesized further to form RC DNA. (H) cis-Elements play a crucial role in HBV replication. h5E (1511-1568), hM (2820-2868), and h3E (1833-1844) are crucial for plus-strand primer translocation. (1767-1793) is important for minus-strand primer translocation.
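The Golden Gate reaction composition and cycling programme given above are stated per microlitre and per cycle, so the totals for a given reaction follow by simple scaling. The sketch below is only an illustration of that arithmetic: the function names, the 20-µL example volume, and the 4 ng/µL default (a midpoint of the stated 3-5 ng/µL range) are assumptions, not values from the protocol.

```python
def golden_gate_mix(volume_ul, n_fragments, fragment_ng_per_ul=4.0):
    """Total amounts for one reaction, scaled from the per-µL values in the text.

    fragment_ng_per_ul defaults to 4.0, an assumed midpoint of the 3-5 ng/µL range.
    """
    return {
        "BsmBI_units": 0.5 * volume_ul,        # 0.5 units/µL
        "T7_ligase_units": 150 * volume_ul,    # 150 units/µL
        "DTT_nmol": 5 * volume_ul,             # 5 mM equals 5 nmol/µL
        "ATP_nmol": 1 * volume_ul,             # 1 mM equals 1 nmol/µL
        "ng_per_fragment": fragment_ng_per_ul * volume_ul,
        "total_dna_ng": fragment_ng_per_ul * volume_ul * n_fragments,
    }


def cycling_minutes(cycles=60, digest_min=5, ligate_min=5, final_min=5):
    """Run time of the programme: cycles of 37 °C / 16 °C plus the final 60 °C step."""
    return cycles * (digest_min + ligate_min) + final_min


if __name__ == "__main__":
    # Hypothetical 20-µL reaction assembling three fragments.
    for name, amount in golden_gate_mix(20, 3).items():
        print(f"{name}: {amount}")
    print("programme length (min):", cycling_minutes())
```

With the stated settings the programme runs for roughly ten hours, which is worth keeping in mind when planning overnight assemblies.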

Core DNA Extraction and Southern Blotting
Each plasmid was cotransfected with Pch9-ε, which expresses HBC and Pol in trans. Cells were harvested on day 5 posttransfection and washed once with phosphate-buffered saline (PBS). Core DNA was extracted as described previously (Abraham and Loeb, 2006). Briefly, cells in each well were lysed with 200 µL lysis buffer [50 mM Tris-HCl (pH 8.0), 1 mM EDTA, 0.2% NP40] and incubated at 37 °C for 10 min. Lysed cells were centrifuged at 12,000 g for 5 min to remove the nuclei. The supernatant was supplemented with 5 mM CaCl2 and 5,000 units/mL micrococcal nuclease (NEB) and incubated at 37 °C for 1.5 h. Subsequently, the micrococcal nuclease was inactivated with 10 mM EDTA. The reaction systems were supplemented with 0.5% sodium dodecyl sulfate and 0.5 mg/mL pronase and incubated at 37 °C for 1.5 h. Core DNA was extracted with phenol-chloroform, precipitated with ethanol, and dissolved in 20 µL 1 × EcoRI buffer. Core DNA (10 µL) was digested with EcoRI for 1 h and resolved alongside 10 µL of the undigested sample on a 1.5% agarose gel at 4 V/cm for 1.5 h. Following electrophoresis, the gel was denatured by soaking in two volumes of 0.4 M NaOH for 15 min with gentle shaking, and the DNA was transferred onto a positively charged nylon membrane with 60 mL 0.2 M NaOH overnight. The membranes were neutralized by soaking in 100 mL neutralization buffer [2 × saline-sodium phosphate-EDTA (SSPE), 200 mM Tris-HCl, pH 7.5] for 15 min with shaking and then UV-crosslinked at 1,500 × 100 µJ/cm2. Hybridization was performed with DIG Easy Hyb (Roche, Germany) according to the manufacturer's protocol using a digoxigenin-labeled probe (nt 1,199-1,814) at 30 ng/mL and 46 °C. Detection of digoxigenin was performed following the manufacturer's protocol (Roche, Germany), with some modifications according to a previously described method (Engler-Blum et al., 1993; McCabe et al., 1997). The NaCl concentration and the pH of the washing buffer were increased to 3 M and 8.0-9.0, respectively.
In addition, 10 × blocking reagent was centrifuged at 13,000 g for 5 min, and the supernatant was added into washing buffer to reduce background.

Total RNA Extraction and Northern Blotting
Total RNA was extracted using the TIANGEN RNA extraction reagent (TIANGEN, China) with some modifications. Briefly, cells were harvested on day 2 posttransfection and washed once with PBS. Next, cells were lysed in 500 µL/well Buffer RZ, and the lysate was supplemented with 1/5 volume of chloroform. The mixture was centrifuged at 13,000 g at 4 °C for 10 min, and the supernatant was transferred into a new tube. Following phenol-chloroform extraction, total RNA was precipitated using ethanol and dissolved in 20 µL 1 × RNA loading buffer (60-73% deionized formamide, 1 × DNA loading buffer, and 10 µg/mL EtBr). RNA samples can be stored at -20 °C for at least 1 month without degradation.
RNA samples were electrophoresed in TAE agarose gels according to a previously described method (Masek et al., 2005). The samples were first denatured at 65 °C for 5 min and chilled on ice for 5 min before loading. Then, the samples were resolved on 1.2% 1 × TAE agarose gels at 4 V/cm and visualized under UV to examine the rRNAs. The gels were soaked in 120 mL 20 × saline sodium citrate (SSC) for at least 20 min. After that, RNA on the gels was transferred onto positively charged nylon membranes using 20 × SSC overnight. The membranes were UV-crosslinked directly on the next day. Hybridization and detection were performed as in the Southern blotting procedure, except that the hybridization temperature was 60 °C and the low-stringency washing temperature was 70 °C.

Reverse Transcription and Polymerase Chain Reaction
Reverse transcription was performed with TIANGEN FastKing RT Kit (TIANGEN, China) according to the manufacturer's instructions with modifications. The incubation time was extended to 10 min in the DNA removal step and 30 min in the reverse transcription step. The reverse transcription primers included R HBV 1370, R HBV 1547, R HBV 1680, and R HBV 1800. Polymerase chain reaction (PCR) was performed with PrimeStar Max Premix (Takara, Japan) according to the manufacturer's protocol.

Sequence Optimization and DNA Structure Prediction
Sequence optimization of the exogenous genes was performed with DNAMAN version 8, using the HBC sequence as the reference. Briefly, the DNA sequence of HBC was translated into protein, and the two most frequently used codons for each amino acid were used as the reference to optimize the sequence of the selected genes. The secondary structure of DNA was predicted with DNAMAN version 8 as follows: load the sequences into the channel; select "sequence" > "secondary structure" > "current sequence."
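The optimization itself was done in DNAMAN, whose internal procedure is not described here. As a rough illustration of the idea only, the sketch below tabulates codon usage in a reference coding sequence, keeps the most frequent codons per amino acid, and recodes a target gene so that any codon outside that preferred set is replaced by the top reference codon. The function names, the replacement rule, and the toy sequences are assumptions made for this example, not the authors' implementation.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def codons(cds):
    """Split a coding sequence into complete codons (a trailing partial codon is dropped)."""
    cds = cds.upper()
    return [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]


def translate(cds):
    return "".join(CODON_TABLE[c] for c in codons(cds))


def codon_usage(reference_cds):
    """Count how often each codon encodes each amino acid in the reference."""
    usage = defaultdict(Counter)
    for codon in codons(reference_cds):
        usage[CODON_TABLE[codon]][codon] += 1
    return usage


def top_codons(usage, n=2):
    """Keep the n most frequent reference codons for each amino acid."""
    return {aa: [c for c, _ in counts.most_common(n)] for aa, counts in usage.items()}


def recode(target_cds, preferred):
    """Replace codons outside the preferred set with the top reference codon.

    Codons for amino acids absent from the reference are left unchanged.
    """
    out = []
    for codon in codons(target_cds):
        choices = preferred.get(CODON_TABLE[codon])
        out.append(codon if not choices or codon in choices else choices[0])
    return "".join(out)


if __name__ == "__main__":
    reference = "ATGGACGACGATAAA"   # toy reference, not the real HBC sequence
    target = "ATGGATAAGGAC"         # toy target gene
    optimized = recode(target, top_codons(codon_usage(reference), n=2))
    print(optimized, translate(optimized) == translate(target))
```

Because only synonymous codons are swapped, the encoded protein is unchanged; only the nucleotide sequence, and therefore any nucleic acid secondary structure, is altered.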

Blot Quantification and Statistical Analysis
Southern and Northern blots were quantified with ImageJ 1.53e according to the instructions on the website. The relative intensity of RC DNA and that of pgRNA were analyzed via a two-tailed Student t-test. p < 0.05 was considered statistically significant. Q-Q plots were used to examine the frequency distribution of the data, which were analyzed using GraphPad Prism 8 (GraphPad Software, United States).

Scanning of Hepatitis B Virus Genome for Desirable Recombination Sites
A series of deletion variants of the HBV genome was constructed based on Pch9/3091 to identify desirable regions for exogenous gene recombination (Figure 2A). Each variant had an approximately 500-bp deletion; together, the deletions spanned nt 1919 to 1198 and skipped the cis-elements (Table 1). Pch9-ε, in which the epsilon sequence was mutated to abort pgRNA encapsidation (Figure 2C), transcomplemented HBC and/or Pol for variants in which the open reading frame (ORF) of HBC and/or Pol was damaged. The translation of HBC from Pch9-G2016T was terminated at the 40th amino acid by introducing a stop codon. Cotransfection of Pch9-G2016T and Pch9-ε served as the positive control. All the variants were cotransfected with Pch9-ε into HepG2 cells. Core DNA was extracted on day 5 posttransfection and subjected to Southern blotting. Pch9-G2016T formed three major bands, including RC, DL, and single-stranded (SS) DNA (Figure 2C, lane 1). The smear below SS DNA was either incomplete intermediates of SS DNA or products reverse transcribed from spliced pgRNA (Abraham et al., 2008). As expected, the RC DNA was linearized via EcoRI digestion, migrating to the same position as the undigested DL DNA. DL DNA was cut into two smaller linear fragments by EcoRI, with expected lengths of 1,365 bp (1814-3178) and 1,825 bp (1-1825), respectively, as 1814 is the transcription start site of pgRNA from Pch9/3091 (Liu et al., 2004). Notably, the 1,365-bp fragment was not detected because the probe hybridizes with region 1199-1814. The 1,825-bp fragment migrated faster than SS DNA (Figure 2C, lane 2).
RC and DL DNA were detectable for all the variants except D2899-197 and D3099-396; these DNAs electrophoresed faster than their counterparts from Pch9-G2016T owing to the 500-bp deletion (Figure 2C). SS DNAs of these variants were not detected. Following EcoRI digestion, the RC DNAs of these variants migrated to the same position as the undigested DL DNA, providing evidence of their circular configuration. DL DNAs of D1919-2416, D2018-2515, D2118-2617, D2217-2723, and D2315-2814 were cut into two fragments, one with an expected length of 1,825 bp and the other of 865 bp (nt 1,814-3,178 with a 500-bp deletion, undetectable). DL DNAs of D99-596, D300-800, D499-996, and D700-1198 were cut into two fragments, one with a length of 1,325 bp (nt 1-1,825 with a 500-bp deletion) and the other of 1,365 bp (1814-3178, undetectable) (Figure 2C).
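The expected EcoRI fragment sizes quoted above follow from subtracting the nominal 500-bp deletion from whichever DL DNA fragment contains it, and from checking whether the probe region (nt 1199-1814) lies within that fragment. A minimal sketch of this bookkeeping is given below; the function name and the tuple layout are assumptions for illustration, and the deletion is treated as exactly 500 bp, as in the text, rather than computed from the variant coordinates.

```python
# DL DNA fragments produced by EcoRI, as given in the text (inclusive coordinates).
DL_FRAGMENTS = [(1814, 3178), (1, 1825)]
PROBE = (1199, 1814)  # region recognized by the hybridization probe


def expected_ecori_fragments(del_start, del_end, nominal_deletion=500):
    """Return (fragment_coords, expected_length_bp, detected_by_probe) per fragment.

    Only deletions lying entirely inside one fragment are handled; deletions that
    cross position 1 (for example D2899-197) are outside the scope of this sketch.
    """
    results = []
    for lo, hi in DL_FRAGMENTS:
        length = hi - lo + 1
        if lo <= del_start and del_end <= hi:
            length -= nominal_deletion
        detected = lo <= PROBE[0] and PROBE[1] <= hi
        results.append(((lo, hi), length, detected))
    return results


if __name__ == "__main__":
    for variant, (start, end) in {"D2118-2617": (2118, 2617), "D99-596": (99, 596)}.items():
        print(variant, expected_ecori_fragments(start, end))
```

For D2118-2617 this gives an undetected 865-bp fragment and a detected 1,825-bp fragment; for D99-596 it gives an undetected 1,365-bp fragment and a detected 1,325-bp fragment, matching the sizes stated in the paragraph.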
D3099-396 produced five bands (Figure 2C, lanes 15 and 16, bands a-e), all of which were resistant to EcoRI digestion because the EcoRI site was deleted (Figure 2A). Plasmid D3099-397EcoRI, in which an EcoRI site (GAATTC) was introduced between 3098 and 398, was constructed to verify whether D3099-396 formed RC DNA. Like D3099-396, five bands, indicated as a, b, c, d, and e, were detected in D3099-397EcoRI (Figure 3A, lane 3). Band "a" moved to the same position as band "c" after EcoRI digestion, while bands "b," "d," and "e" remained unchanged (Figure 3A, lanes 3 and 4). These results demonstrated that band "a" is RC DNA, band "c" is DL DNA, band "e" is SS DNA, and bands "b" and "d" are likely to be spliced products. Thus, D2899-197 is the only variant that does not support the formation of RC DNA.
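Variants such as D2899-197 and D3099-396 span position 1 of the circular map, so their deletion lengths have to be computed with wrap-around. The sketch below shows one way to do this from the variant names. It assumes a genome length of 3,182 bp for HBV ayw, which is not stated in the text, and it assumes that both coordinates in a variant name are inclusive; the helper names are hypothetical.

```python
GENOME_LENGTH = 3182  # assumed length of the HBV ayw genome; not given in the text


def circular_span(start, end, genome_length=GENOME_LENGTH):
    """Number of positions from start to end inclusive on a circular map."""
    if end >= start:
        return end - start + 1
    return genome_length - start + 1 + end


def parse_variant(name):
    """Turn a name such as 'D2899-197' into (2899, 197)."""
    start, end = name.lstrip("D").split("-")
    return int(start), int(end)


if __name__ == "__main__":
    for name in ["D1919-2416", "D2899-197", "D3099-396", "D700-1198"]:
        print(name, circular_span(*parse_variant(name)))
```

Under these assumptions every listed variant removes a little under 500 bp, in line with the "approximately 500-bp" description of the scan.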

Deletions Associated With Hepatitis B Virus RNA Splicing
It is intriguing that D2899-197 formed only weak SS DNA (Figure 2C, lanes 13 and 14) even though no known cis-element was impaired. One hypothesis is that the pgRNA expressed from this construct was spliced (Abraham et al., 2008). To address this question, we performed a systematic analysis of the influence of the deletions on HBV RNA splicing. Total RNA samples were assayed by Northern blotting and reverse transcribed into cDNA using primers R HBV 1370, R HBV 1547, R HBV 1680, and R HBV 1800, respectively, to identify whether the pgRNAs were spliced (Figure 4A). The cDNA was further amplified via PCR using primers F HBV 1821 or F HBV 1851 plus R HBV 1370, R HBV 1547, R HBV 1680, or R HBV 1800, respectively, and PCR products were gel-purified and sequenced.
Moreover, splicing was detected in other deletion variants. D1919-2416 showed one splicing pattern, from nt 2,445 to 487 (Figure 4B). D2018-2515 had two splicing patterns: one from nt 2,983 to 487, and the other spliced twice, from nt 2,983 to 280 and from nt 456 to 1,383 (Figures 4C,D). However, the pgRNAs of these two variants cannot have been 100% spliced, because both variants supported the formation of intact RC DNA (Figure 2C, lanes 3-6), which must be derived from complete pgRNA. Notably, the pgRNA of D3099-396 was spliced to a lesser extent but via a more complex mechanism. At least five splicing patterns, from nt 2,445 to 487, from nt 2,065 to 487, from nt 2,445 to 1,383, from nt 2,065 to 1,306, and from nt 2,065 to 1,383, respectively, were revealed (Figures 4G-K). Some of these spliced RNAs are potential sources of the additional bands of D3099-396 detected via Southern blotting ("b" and "d" in Figure 2C, lane 15). Analysis of the splicing of pgRNA of D99-596, D300-800, D499-996, and D700-1198 is challenging because of the complicated bands. In addition, the amount of pgRNA of the deletion variants (except D1919-2416, D2018-2515, D2899-197, and D499-996) was significantly higher than that of Pch9-G2016T (p < 0.05) (Figure 5C). This observation is consistent with the relatively higher amount of RC DNA produced by these constructs (Figures 2D, 5C).

Deletions of 2,118-2,814 and 99-1,198 Produce More Relaxed Circular DNA Than Other Variants
Regions with the least impact on the formation of RC DNA were identified by comparing the relative amount of RC DNA among these variants. Southern blotting was repeated five times, and the relative intensity of RC DNA was analyzed with ImageJ. The relative intensities of RC DNA of D2118-2617, D2217-2723, D99-596, and D300-800 were significantly higher than that of Pch9-G2016T (Figure 2D), whereas those of D1919-2416 and D2899-197 were significantly lower. D2315-2814, D3099-396, D499-996, and D700-1198 produced a similar amount of RC DNA to Pch9-G2016T (Figure 2D). These results demonstrate that two regions, 2,118-2,814 and 99-1,198, are suitable for recombination.

Influence of Recombination of Exogenous Genes at 2120 on Relaxed Circular DNA Formation
Given the data above, two positions, 2120 and 155, located in the ORF of HBC and HBS, respectively, were selected to insert foreign genes. First, three selection genes (Pac, BSD, and Neo), two luciferase genes (Gluc and Nluc), four fluorescent genes (copGFP, mCherry, UnaG, and eGFP), and one transactivating gene (tTA1) were inserted right after the valine residue at the 74th amino acid of HBC (HBC 74V, site 2120) (Table 2). Each gene was fused via a T2A peptide at the N-terminus to separate its expression from the HBV genome. The focus was on the formation of RC DNA as it is the precursor of functional cccDNA.

Splicing of pgRNAs of the 2120 Recombinants
Different genes exert different effects on the formation of RC DNA. To pursue the underlying reasons, we analyzed the RNAs from some of the recombinants. All the recombinants tested showed some pgRNA splicing (the bands right under the pgRNA bands), although intact pgRNA was observed in all samples (Figure 6 and Table 2). The spliced pgRNAs possibly explain the small RC DNAs of 2120-T2A-BSD-2583, 2120-T2A-Gluc-2742, and 2120-T2A-Nluc-2700 (Figure 3A). For 2120-T2A-tTA1-2820, however, approximately 80% of the pgRNA was spliced (band "b" in Figure 6, lanes 21 and 22). Sequencing results demonstrated that this RNA was spliced between HBV nt 2,065 and tetR nt 428 (the sequence from HBV nt 2,065 to nt 428 of tetR was missing) (Figure 4L), which potentially explains why the RC DNA of 2120-T2A-tTA1-2820 electrophoresed faster than the RC DNA of Pch9-G2016T (Figure 3A, lanes 1 and 23). The splicing sites of 2120-T2A-tTA1-2820 were removed by deleting nt 2,064-2,072 and mutating A426 of tetR to G426. However, this manipulation did not rescue RC DNA formation (Figure 3C, lanes 5 and 6).

Sequence Optimization Improves Relaxed Circular DNA Formation for Some Recombinant Constructs
Insertion of UnaG aborted RC DNA formation at nt 2,120 but supported it at nt 155. Furthermore, recombination of UnaG maintained DL and SS DNA formation, suggesting a blockade of plus-strand primer translocation or of circularization that depends on where the foreign gene is inserted. These results implied that the insertion of UnaG at nt 2,120 might induce an undesirable secondary or higher-order structure of minus-strand DNA, which possibly affects either the translocation of the plus-strand primer or the circularization step. Guided by this hypothesis, we optimized the sequence of the inserted genes according to the DNA sequence of HBC, expecting that a DNA sequence resembling HBC would reduce the influence on replication. Five genes (UnaG, Pac, Gluc, Nluc, and eGFP) were optimized (Table 3) and inserted into the same positions as the unoptimized genes, respectively. Of note, the construct with optimized UnaG (UnaGco) formed RC DNA in a similar amount to that of the positive control (Pch9-G2016T). This band migrated to the same position as the RC DNA of Pch9-G2016T (Figure 9A, lanes 17 and 18). EcoRI digestion generated a fragment that electrophoresed faster than undigested DL DNA but slower than SS DNA of 2120-T2A-UnaGco-2607 (Figure 9A, lanes 17 and 18). These findings affirmed the RC DNA identity because there is an EcoRI recognition site in UnaGco. EcoRI digestion is expected to produce a detectable band of 2,145 bp (the other should be 759 bp, undetectable by our probe). In contrast, 2120-T2A-UnaG-2607 formed only DL and SS DNA (Figure 9A, lanes 15 and 16). The weak DL DNA could be revealed solely via EcoRI digestion, which cut DL DNA to a detectable 1,825-bp fragment.
Northern blotting of HBV RNA demonstrated a significantly lower relative amount of intact pgRNA for 2120-T2A-UnaGco-2607 than for 2120-T2A-UnaG-2607, whereas part of the pgRNA of both was spliced (Figures 10A,B). This indicates that the improvement in RC DNA formation by sequence optimization must be explained by improvement in the steps after pgRNA production. To address this, we analyzed the secondary structure of the minus-strand DNA sequences by using DNAMAN 8. The secondary structures of three sequences, corresponding to wild-type HBV (nt 2,121-2,868), chimeric UnaG-HBV, and optimized UnaG-HBV, were predicted (Figure 11). The wild-type UnaG sequence profoundly alters the overall structure of the HBV sequence (Figures 11A,B). In particular, the structure of the cis-element hM differs between the two. In contrast, optimized UnaG does not significantly influence the structure of the fused HBV DNA (Figure 11C), with hM showing a structure similar to that of wild-type HBV DNA. The influence of sequence optimization of Pac (Pacco) on HBV DNA replication seemed different. The SS DNA of 2120-T2A-Pac-2784 electrophoresed faster than the SS DNA of D3099-397EcoRI (Figure 3A, lanes 3 and 5), demonstrating that the SS DNA of 2120-T2A-Pac-2784 was at least 487 bp shorter than the SS DNA of Pch9-G2016T. The shorter SS DNA is possibly an immature product caused by pausing at putative secondary RNA structures formed by the Pac gene, which has a high G + C content (72.8%). In line with this hypothesis, optimized Pac (Pacco, 2120-T2A-Pacco-2784) promoted SS DNA maturation (Figure 9A, lanes 3-6). Pac optimization enhanced the RC DNA intensity only slightly and not significantly (p > 0.05). 2120-T2A-Glucco-2742 and 2120-T2A-Nlucco-2700 formed RC, DL, and SS DNA (Figure 9A, lanes 7-14). At the same time, sequence optimization of Nluc and Gluc did not augment the amount of RC DNA further.
Also, the pgRNA amounts of 2120-T2A-Glucco-2742 and 2120-T2A-Nlucco-2700 were lower than those of their unoptimized counterparts (Figure 10).
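The G + C content quoted for Pac (72.8%) is a simple base count, and a windowed version of the same count can help locate GC-rich stretches that might favor stable secondary structure. The sketch below is illustrative only: the function names, window size, and step are assumptions, and the printed example uses a toy sequence rather than the Pac gene.

```python
def gc_content(seq):
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def windowed_gc(seq, window=50, step=10):
    """GC percentage in sliding windows, as (start_index, percent) pairs."""
    return [
        (i, gc_content(seq[i:i + window]))
        for i in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    toy = "ATGACCGAGTACAAGCCCACGGTGCGCCTCGCC"  # toy sequence for illustration
    print(round(gc_content(toy), 1))
    print(windowed_gc(toy, window=12, step=6))
```

Comparing the windowed profile of a gene before and after optimization is one inexpensive way to see whether recoding has lowered local GC content near the insertion site.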

DISCUSSION
Previous studies indicate that recombinant HBV expressing foreign genes can be constructed successfully. However, a systematic exploration of the influence of different engineering strategies on HBV replication is lacking. In the present study, the whole genome of HBV was scanned for regions suitable for engineering. "Suitable" here means supporting the formation of RC DNA, the precursor of functional cccDNA. This criterion allowed for the identification of two regions, 2118-2814 and 99-1198. Region 2118-2814 covers the C-terminus of HBC and the N-terminus of the TP domain of Pol, and its deletion efficiently supports RC DNA formation. Notably, this region partially overlaps the region (nt 2,124-2,712) previously utilized to recombine NanoLuc into an HBV of genotype C (accession number AB246345) (Nishitsuji et al., 2015). Region 2118-2814 can hardly be extended further because the hM (2820-2868) region must be retained. Previously, researchers successfully inserted a 52-aa polypeptide into 1982-2312 (Wang et al., 2002, 2014; Deng et al., 2009). Even so, we think that extending the recombination region toward the N-terminus of HBC is undesirable. In the present analysis, deletion of 1919-2515 formed only weak RC and DL DNA, and the relatively lower level of core DNA was potentially associated with a low amount of intact pgRNA. Intriguingly, deletion of 1919-2515 resulted in almost complete pgRNA splicing. A similar phenomenon was evident for D2899-197, which formed only weak SS DNA and showed splicing of a large part of its pgRNA. Elsewhere, a study revealed that 55% of the pgRNA of wild-type HBV was spliced in HepG2 cells (Abraham et al., 2008). Taken together, the significantly higher splicing of D1919-2515 and D2899-197 implies that the two regions function to protect authentic pgRNA from splicing. It is speculated that such protection is associated with secondary or higher-order RNA structures that can hide the splicing sites.
Region 99-1198 covers the C-terminus of the spacer and the RT and RNase H domains of Pol, overlapping the preS2 and S ORFs. In a previous study, BSD was inserted into the preS2 region, between XhoI (C^TCGAG, nt 125) and BsrGI (T^GTACA, nt 766) (Liu et al., 2009). Moreover, Untergasser et al. inserted GFP between nt 1,446 and 2,347 (Untergasser and Protzer, 2004). This region contains h5E (nt 1511-1568), which has been reported to play a crucial role in RC DNA formation (Lewellyn and Loeb, 2007). The deletion of h5E (nt 1,511-1,568) was responsible for the poor RC DNA formation in this construction of HBV. Preserving h5E
FIGURE 10 | HBV RNA assay of the recombinants before and after sequence optimization. (A) Representative Northern blotting of the recombinants before and after sequence optimization. (B) The relative intensity of pgRNA of the recombinants before and after sequence optimization. The ratio of each pgRNA/18S rRNA was calculated, and this ratio was then divided by the sum of pgRNA/18S rRNA of all samples on the same membrane to calculate the relative intensity of pgRNA of each sample. N = 3.
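The normalization described in the Figure 10 caption, in which each pgRNA signal is divided by its 18S rRNA signal and the resulting ratio is then expressed as a share of the summed ratios on the same membrane, can be sketched as follows. The function name and the band intensities in the example are hypothetical and are not measurements from this study.

```python
def relative_pgrna_intensity(pgrna, rrna_18s):
    """Relative pgRNA intensity per sample on one membrane.

    pgrna and rrna_18s map sample names to raw band intensities
    (for example, values exported from ImageJ).
    """
    ratios = {sample: pgrna[sample] / rrna_18s[sample] for sample in pgrna}
    total = sum(ratios.values())
    return {sample: ratio / total for sample, ratio in ratios.items()}


if __name__ == "__main__":
    # Hypothetical intensities for two lanes on the same membrane.
    pgrna = {"UnaG": 200.0, "UnaGco": 100.0}
    rrna = {"UnaG": 100.0, "UnaGco": 100.0}
    print(relative_pgrna_intensity(pgrna, rrna))
```

Because the values on one membrane sum to 1, they are comparable across repeated membranes even when exposure differs, which is presumably why this form of normalization was chosen.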
Herein, in D3099-396, an additional band was detected between the RC and DL DNA. This band was likely pseudo-RC DNA, reverse transcribed from one of the spliced pgRNAs of D3099-396, because only spliced RC DNA could electrophorese faster than the unspliced RC DNA and slower than the unspliced DL DNA. Of note, all spliced pgRNAs of D3099-396 lacked hM (2820-2868), which was reported to be crucial for RC DNA formation (Lewellyn and Loeb, 2007). Furthermore, we fused 10 foreign genes, via T2A peptides, into the ORFs of HBC and HBS, at 2120 and 155, respectively. This in-frame arrangement allowed for the expression of foreign genes from two RNAs, preC RNA and pgRNA, or preS1 mRNA and S mRNA. The T2A peptide fused at the N-terminus of the foreign genes reduced the potential impact of the fused HBV peptides on the function of the foreign genes. The findings suggest that site 155 is more tolerant of recombination than site 2120. Of the 10 genes, 9 were successfully inserted at site 155 without abolishing RC DNA formation, whereas 5 of the 10 genes inserted at site 2120 abrogated it. Besides, the deletion of HBV 2121-2819 did not abort RC DNA formation. Therefore, the failure of RC DNA formation after insertion of Neo, copGFP, mCherry, UnaG, and eGFP at 2120 was unlikely to be associated with the deletion of any cis-elements. Notably, the recombination of these genes still allowed for DL DNA formation.
FIGURE 11 | Prediction of the secondary structure of minus-strand DNA of the recombinants. Secondary structures of the minus-strand DNA of wild-type HBV (A), unoptimized UnaG recombinant (B), and optimized UnaG (C) were predicted by using DNAMAN 8. The structure of cis-element hM (nt 2820-2868) is indicated. Unoptimized UnaG profoundly affects the structure of the HBV sequence, including hM. In contrast, optimized UnaG showed no significant impact on the structure of the HBV sequence (nt 2607-2868).
As such, the genes must interfere with plus-strand primer translocation or the circularization step. Because the foreign genes were inserted close to hM (nt 2,820-2,868), this arrangement could interfere with the base pairing between hM and h3E. This base pairing is crucial for the translocation of the plus-strand primer (Lewellyn and Loeb, 2007), so its disruption would contribute to the abortion of RC DNA formation.
The most intriguing finding of this work is that optimization of the sequence of recombinant genes may improve RC DNA formation. Usually, codon optimization is used to improve the expression of proteins, as this approach accommodates the codon bias of the host organism. However, we do not think that altered expression of the UnaG protein improved RC DNA formation in the present case. First, HBC and Pol were provided by transcomplementation. Second, the expression of UnaG or UnaGco was not essential for RC DNA formation. Third, 2120-T2A-UnaG-2607 produced a higher level of pgRNA and SS DNA than 2120-T2A-UnaGco-2607. These findings provide evidence that UnaG recombination has no adverse effects on the steps preceding the synthesis of minus-strand DNA. However, it is plausible that UnaG impeded RC DNA formation by limiting the step of plus-strand synthesis or circularization at the DNA sequence level, for example, through the formation of some secondary structures. In line with this, structure prediction of the minus-strand DNA showed that the wild-type UnaG sequence disrupts the original structure of minus-strand DNA, whereas UnaGco largely relieves this impact (Figure 11). An interesting implication of our findings is that the primary sequence of minus-strand DNA may itself be subject to selection pressure during evolution. That is, mutations adversely impacting the secondary structure of minus-strand DNA would have lower fitness and be weeded out.

CONCLUSION
In conclusion, the current study provides an informative basis and a valuable method for constructing and optimizing recombinant HBV. Efforts are being taken to obtain and characterize reporter HBV based on the recombinants constructed in the study.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding author/s.