An Engineered Pathway for Production of Terminally Sialylated N-glycoproteins in the Periplasm of Escherichia coli.

Terminally sialylated N-glycoproteins are of great interest in therapeutic applications. Due to the inability of prokaryotes to carry out this post-translational modification, they are currently predominantly produced in eukaryotic host cells. In this study, we report a synthetic pathway to produce a terminally sialylated N-glycoprotein in the periplasm of Escherichia coli, mimicking the sialylated moiety (Neu5Ac-α-2,6-Gal-β-1,4-GlcNAc-) of human glycans. A sialylated pentasaccharide, Neu5Ac-α-2,6-Gal-β-1,4-GlcNAc-β-1,3-Gal-β-1,3-GlcNAc-, was synthesized through the activity of co-expressed glycosyltransferases LsgCDEF from Haemophilus influenzae, Campylobacter jejuni NeuBCA enzymes, and Photobacterium leiognathi α-2,6-sialyltransferase in an engineered E. coli strain which produces CMP-Neu5Ac. C. jejuni oligosaccharyltransferase PglB was used to transfer the terminally sialylated glycan onto a glyco-recognition sequence in the tenth type III cell adhesion module of human fibronectin. Sialylation of the target protein was confirmed by lectin blotting and mass spectrometry. This proof-of-concept study demonstrates the successful production of terminally sialylated, homogeneous N-glycoproteins with α-2,6-linkages in the periplasm of E. coli and will facilitate the construction of E. coli strains capable of producing terminally sialylated N-glycoproteins in high yield.


INTRODUCTION
Escherichia coli is a commonly used host for the production of biotherapeutic and other high-value proteins. A favored expression strategy is to export the protein of interest to the periplasm to simplify downstream processing and facilitate disulfide bond formation (Karyolaimos et al., 2019; Tripathi and Shrivastava, 2019). Yields of several g/L of active human growth hormone, containing two disulfide bonds and exported to the periplasm using the Tat pathway, were recently demonstrated in E. coli "TatExpress" strains (Guerrero et al., 2019). The lack of a natural pathway to achieve N-glycosylation remains a major limitation of E. coli as an expression host, however. Furthermore, as E. coli does not naturally produce sialic acids, it remains a particular challenge to produce terminally sialylated N-glycans, characteristic of many human N-glycoproteins, in the organism.
Terminal sialic acid has a key impact on the properties of N-glycoproteins. Owing to their strong electronegativity, sialic acids can increase the solubility of a glycoprotein or its resistance to proteolytic degradation, as well as enhancing its residence time in blood and promoting transport of drugs and ions into cells (Aquino et al., 1980; Raju et al., 2001; Bork et al., 2009; Meuris et al., 2014; Cuccui and Wren, 2015; Gupta and Shukla, 2018; Thi Sam et al., 2018). It is unsurprising, therefore, that extensive research efforts have focused on overcoming the traditional bottlenecks associated with recombinant production of terminally sialylated glycoproteins of biopharmaceutical importance.
Bacterial N-linked glycosylation systems have also been explored as a means of producing sialylated N-glycoproteins because of the long timescales and high costs associated with producing heterologous proteins in mammalian cells. The successful functional transfer of the N-glycosylation pathway from Campylobacter jejuni to E. coli in 2002 raised the possibility of producing N-glycoproteins with customized glycans in the E. coli periplasm (Wacker et al., 2002). Since then, some successes have been reported in engineering N-glycoproteins with more human-like glycan motifs than the original C. jejuni pattern, such as the Lewis x (Leˣ) glycan epitope and the eukaryotic Man3GlcNAc2 core glycan (Hug et al., 2011; Valderrama-Rincon et al., 2012). Sialylation is critical to enhancing the circulatory residence time of glycoproteins (Meuris et al., 2014), however, and the reported systems would require multiple additional engineering steps to generate humanized terminally sialylated N-glycoproteins (Cuccui and Wren, 2015). The characterization of an N-linking glycosyltransferase (NGTase) in the respiratory swine pathogen Actinobacillus pleuropneumoniae has provided an alternative approach to produce N-glycoproteins in E. coli. Keys and co-workers reported a biosynthetic pathway, based on NGTase, for site-specific polysialylation of recombinant proteins with α-2,8-linked polysialic acid (polySia) chains in the E. coli cytoplasm, albeit with only approximately 20% of target molecules modified with polySia (Keys et al., 2017). Other workers have achieved approximately 62% glycosylation of N-glycoproteins with an α-2,3-linked, terminally sialylated N-glycan trisaccharide (Neu5Ac-α-2,3-Gal-β-1,4-GlcNAc-) (Tytgat et al., 2019). Scale-up of either approach is limited by the need to supply the expensive and unstable donor Neu5Ac, however. Meanwhile, no terminally sialylated homogeneous N-glycoproteins with α-2,6-linkages have yet been produced in the periplasm of E. coli.
In this work, we describe the production of a terminally sialylated homogeneous N-glycoprotein in the periplasm of an engineered E. coli host. The E. coli DH5α strain was engineered to synthesize CMP-Neu5Ac and assemble a terminally sialylated glycan on an undecaprenyl-pyrophosphate lipid carrier by combining biosynthetic pathways for CMP-Neu5Ac and a tetrasaccharide human glycan mimic. Sialylation was achieved using the α-2,6-STase from Photobacterium leiognathi JT-SHIZ-145 (pl-ST6) (Yamamoto et al., 2007), and the sialylated glycan was transferred onto a glyco-tagged acceptor protein by the OTase PglB from C. jejuni. This approach offers a pathway to economical production of terminally sialylated homogeneous N-glycoproteins.

Production and Purification of Sialylated Proteins
Bacterial strains used are listed in Table 1. Primers used in this study are listed in Supplementary Table 1. E. coli JM109 was used for maintenance and propagation of plasmid DNA. To facilitate in vivo production of CMP-Neu5Ac, a nanKETA cluster knockout was created in the E. coli K12 derivative DH5α by λ Red homologous recombination (Fierfort and Samain, 2008). The resultant DH5α nanKETA:kan strain, confirmed by sequencing and termed DKK601, lacks the β-galactosidase activity that could otherwise cleave N-linked lactose produced during glycosylation.

pC15-plsg: "N-glycosylation pathway," includes rbs -pglB -rbs1 -wecA -wzzE rbs2 -pglK -ompA rbs -lsgCDEF -rrnB terminator, all under control of an arabinose promoter, CmR; vector backbone is pC15. rbs1 and rbs2 were both artificially synthesized sequences based on the rbs (core sequence: AGGA) of pET28a (Novagen). This study; MK353498.
pIG6-Sia: "sialylation pathway," includes Lac promoter -ompA leader peptide -FN3 -DQNAT sequon -T7 terminator -P regulatory region -pl-ST6 -ompA rbs -neuBCA, AmpR; vector backbone is pIG6. P regulatory region: the upstream sequence of the N-glycosylation pgl locus from Campylobacter jejuni. This study; MN721873.
Frontiers in Bioengineering and Biotechnology | www.frontiersin.org
For production of terminally sialylated N-glycoprotein, a single, freshly transformed colony of E. coli DKK601 containing the pIG6-Sia and pC15-plsg plasmids was inoculated from a Luria-Bertani (LB) agar plate containing 100 µg ml⁻¹ ampicillin and 70 µg ml⁻¹ chloramphenicol into 5 ml LB with the same antibiotics and grown overnight with shaking at 37 °C. This culture was used to inoculate 500 ml LB containing the same antibiotics, followed by shaking at 28 °C and the addition of 200 µg ml⁻¹ L-arabinose and 1 mM isopropyl β-D-thiogalactoside (IPTG) when the OD600 reached 0.6. Expression was allowed to continue for 5 h. Bacterial growth rate was monitored by recording the optical density of the culture at 600 nm over time using a spectrophotometer. The expression and sialylation of target proteins were monitored by western blot from 0 to 5 h of induction. After harvesting of cells by centrifugation, FN3 acceptor proteins were purified using 1 ml HisTrap columns and desalted using a PD-10 column (Hu et al., 2013). Unglycosylated FN3 and glycosylated FN3 with the tetrasaccharide glycan (Gal-β-1,4-GlcNAc-β-1,3-Gal-β-1,3-GlcNAc-) were produced and purified as previously reported (Ding et al., 2017). Protein concentrations were determined by BCA assay (Pierce).

Immunoblot Analysis
Purified FN3 proteins (sialylated, un-sialylated and unglycosylated formats) were separated on 15% SDS-PAGE gels and analyzed by Coomassie Blue staining and immunodetection using a monoclonal anti-FLAG M1 antibody (Sigma-Aldrich). HRP-conjugated rabbit anti-mouse IgG (Invitrogen) was used as the secondary antibody. Proteins of varying glycosylation status were also detected with ECA-HRP and SNA-I-HRP lectins (EY Laboratories). Blots were visualized using a ChemiDoc™ XRS+ system (Bio-Rad) and analyzed by Image Lab software. The data are reported from at least three replicate experiments.

LC-MS/MS Analysis
Approximately 50 µg of purified protein was loaded onto 7-kD Spin Desalting Columns and eluted with 50 mM ammonium bicarbonate (Sigma-Aldrich, pH 7.5) in a 20 µl volume, followed by incubation at 56 °C for 1 h. Samples were cooled to room temperature and digested with 1 µg of trypsin at 37 °C for 14 h, followed by inactivation of trypsin at 80 °C for 10 min. After cooling to room temperature, 1 µg of Glu-C endoproteinase was added and samples were incubated at 37 °C for 14 h, and dried. Following re-solubilization with 0.1% TFA, samples were quantified using a Nanodrop 2000, and 500 ng of peptides with glycans were analyzed on a Thermo Orbitrap Exactive HF mass spectrometer (Thermo Fisher). The mobile phase comprised 0.1% formic acid in water (eluent A) and 0.1% formic acid in 99.9% acetonitrile solution (eluent B). Elution was at a flow rate of 0.6 µl min⁻¹ using three linear gradient steps: from 6 to 30% acetonitrile in 34 min, from 30 to 40% acetonitrile in 7 min, and from 40 to 95% acetonitrile in 4 min, with constant 0.1% formic acid. For exact mass measurement with a lock spray, the capillary voltage was set at 2.2 kV, the temperature at 320 °C and the normalized collision energy at 50%. The AGC target of full scan MS (200-2500 m/z) was 3 × 10⁶ and data acquisition was in the Q-Orbitrap at a resolution of 120,000.
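The three-step elution gradient described above can be written as a piecewise-linear function of time. The following is a minimal sketch only: it assumes the three steps run back-to-back starting at t = 0, and the names `GRADIENT` and `percent_acetonitrile` are illustrative rather than part of any instrument software.

```python
# Breakpoints (time in min, % acetonitrile) taken from the three gradient steps
# stated in the text. Assumption: the steps follow one another from t = 0.
GRADIENT = [(0.0, 6.0), (34.0, 30.0), (41.0, 40.0), (45.0, 95.0)]


def percent_acetonitrile(t_min: float) -> float:
    """Return the % acetonitrile at time t_min by linear interpolation."""
    # Before the first breakpoint, hold the starting composition.
    if t_min <= GRADIENT[0][0]:
        return GRADIENT[0][1]
    # Find the segment containing t_min and interpolate within it.
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    # After the last breakpoint, hold the final composition.
    return GRADIENT[-1][1]
```

For example, under these assumptions `percent_acetonitrile(17.0)` falls halfway through the first step, midway between 6% and 30%.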

Database Analysis and Identification of Modified Residues
Spectra of the digested glycopeptides were searched with the Byonic software, accepting peptide-spectrum matches with a score (defined as the absolute quality of the match) over 300. Both the full MS and MS/MS scans were searched with a mass tolerance of 15 ppm. The presence of oxonium ions for NeuAc (292.10) and NeuAc-H2O (274.09) in MS/MS spectra was used to scout for sialylated glycopeptides. Indexed databases for semi-cut trypsin and Glu-C digests were created, allowing for up to three missed cleavages, and the sequencing of the peptide was performed manually. Assignments of all spectra in samples were validated by manual inspection for the precursor isotope pattern and expected glycan fragments.
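The oxonium-ion screen described above amounts to checking each MS/MS peak list for the two diagnostic m/z values within a mass tolerance. The sketch below is illustrative only and is not the Byonic workflow: it assumes the 15 ppm tolerance also applies to this screen, it uses the four-decimal theoretical values (292.1027 and 274.0921) corresponding to the two-decimal figures quoted in the text, and the function names are hypothetical.

```python
# Diagnostic oxonium ions for sialic acid (singly protonated, monoisotopic m/z).
# The text quotes these to two decimals (292.10 and 274.09).
OXONIUM_MZ = {
    "NeuAc": 292.1027,
    "NeuAc-H2O": 274.0921,
}


def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed mass error of an observed peak in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def find_sialic_oxonium_ions(peaks_mz, tol_ppm: float = 15.0) -> dict:
    """Report, per oxonium ion, whether any peak lies within tol_ppm of it."""
    return {
        name: any(abs(ppm_error(mz, target)) <= tol_ppm for mz in peaks_mz)
        for name, target in OXONIUM_MZ.items()
    }


def looks_sialylated(peaks_mz, tol_ppm: float = 15.0) -> bool:
    """Flag a spectrum as a sialylated candidate if either ion is present."""
    return any(find_sialic_oxonium_ions(peaks_mz, tol_ppm).values())
```

A spectrum flagged this way is only a candidate; as stated in the text, assignments were still validated by manual inspection.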

Production of Terminally Sialylated, Homogeneous N-glycosylated FN3 in the E. coli Periplasm
FIGURE 1 | The proposed biosynthetic pathway for the production of terminally sialylated proteins in the periplasm of E. coli, and associated plasmid constructs. Synthesis of CMP-Neu5Ac is achieved by the sub-pathway constructed with overexpressed neuBCA genes from C. jejuni (1). Synthesis of the tetrasaccharide and pentasaccharide LLOs Gal-β-1,4-GlcNAc-β-1,3-Gal-β-1,3-GlcNAc-β-pp-undecaprenol and Neu5Ac-α-2,6-Gal-β-1,4-GlcNAc-β-1,3-Gal-β-1,3-GlcNAc- is achieved by the sub-pathway constructed with the GlcNAc-1-phosphate glycosyltransferase (WecA) enzyme (2), glycosyltransferases (LsgCDEF) (3), and the α-2,6-STase (pl-ST6) enzyme (4). The sialylated glycan is flipped by PglK (or Wzx) flippases from the cytoplasmic to the periplasmic side of the membrane (5). The final modification of the target protein with the synthesized glycan is achieved by PglB (6).
To explore the production of a terminally sialylated homogeneous N-glycoprotein, we utilized FN3 with an N-terminal ompA leader peptide, FLAG and hexahistidine tags and a single engineered glycosylation site (DQNAT motif) (Lizak et al., 2011) as an acceptor protein. The FN3 was expressed and exported to the periplasm through the Sec pathway. To date, monobodies based on FN3 have been generated with nanomolar and picomolar range affinities for use in a number of emerging therapeutic applications (Sullivan et al., 2013). The E. coli DH5α nanKETA:kan (DKK601) strain was generated and utilized to produce terminally sialylated FN3. Growth of the strain in shake flasks was determined to be unaffected by the manipulation compared to the parental E. coli DH5α (Supplementary Figure 2). Western blot analysis of initial co-expression of sialylation pathway genes with the FN3 acceptor protein indicated that almost 100% of FN3 molecules exhibited an increase in molecular weight after 3 h of induction (Figure 2A, upper panel), while FN3 molecules produced in the presence of pIG6-FN3-Gly-1 alone exhibited no molecular weight shift after up to 5 h of induction (Figure 2A, lower panel). The highest yields of putatively sialylated FN3 and unmodified FN3 were 1.49 ± 0.15 mg/L and 15 ± 0.88 mg/L, respectively, after 4 h of induction with IPTG and arabinose (Figure 2B).
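The two reported yields imply roughly a tenfold difference. As a rough illustration, the sketch below computes that fold difference together with a propagated uncertainty. It assumes the two ± values are independent uncertainties that can be combined in quadrature, which the text does not state, and `fold_difference` is a hypothetical helper name.

```python
import math


def fold_difference(high: float, err_high: float, low: float, err_low: float):
    """Return (ratio, uncertainty) for high / low.

    Assumption: the two uncertainties are independent, so their relative
    errors add in quadrature.
    """
    ratio = high / low
    relative_error = math.sqrt((err_high / high) ** 2 + (err_low / low) ** 2)
    return ratio, ratio * relative_error


# Yields from the text: 15 ± 0.88 mg/L (FN3) vs. 1.49 ± 0.15 mg/L (sialylated FN3).
ratio, uncertainty = fold_difference(15.0, 0.88, 1.49, 0.15)
print(f"fold difference: {ratio:.1f} ± {uncertainty:.1f}")
```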
To determine the reason for the low level of production of the putatively sialylated FN3, the growth rate of the E. coli DKK601/pIG6-Sia + pC15-plsg and DKK601/pIG6-FN3-Gly-1 cells was investigated. Growth rates of the two strains were similar before induction but cells harboring the pIG6-Sia and pC15-plsg plasmids slowed relative to those containing only the pIG6-FN3-Gly-1 plasmid after induction of the full sialylation pathway (Figure 2C), suggesting that a higher metabolic burden was associated with the sialylation procedures.
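Growth-rate comparisons of this kind are commonly summarized as a specific growth rate derived from OD600 readings. The sketch below shows that standard calculation under the assumption of exponential growth between two time points. The function names and the example readings in the usage note are hypothetical and are not data from this study.

```python
import math


def specific_growth_rate(od_start: float, od_end: float, hours: float) -> float:
    """Specific growth rate (per hour), assuming exponential growth:
    mu = ln(OD_end / OD_start) / elapsed time.
    """
    return math.log(od_end / od_start) / hours


def doubling_time(mu: float) -> float:
    """Doubling time in hours for a given specific growth rate."""
    return math.log(2) / mu
```

For instance, hypothetical readings of OD600 0.2 and 0.8 taken 2 h apart correspond to two doublings, so `doubling_time(specific_growth_rate(0.2, 0.8, 2.0))` is 1 h.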
Gal-β-1,4-GlcNAc-specific ECA (Figure 2F, lower panel), while the glycosylated FN3 containing the tetrasaccharide glycan (Ding et al., 2017) was strongly detected by ECA (Figure 2F, lower panel) but not by SNA-I (Figure 2F, upper panel). These results indicate the presence of α-2,6-linked sialylation in the glycoengineered FN3. Bands detected in lectin blots were consistently less intense than those in antibody-based Western blots, indicating a possibly lower sensitivity or affinity of the lectin for its glycan target compared to the anti-FLAG M1 antibody.

LC-MS/MS Analysis of Sialylated Glycoproteins
The sialylated FN3 was further analyzed by intact protein MS. This analysis indicated that 55 ± 5.8% (mean ± SD, n = 3) of the putative sialylated FN3 was modified with terminally sialylated glycan (Supplementary Figure 6). Because sialic acid moieties in glycans can be dissociated by the MALDI ionization process, it is difficult to determine the concentration of sialylated glycans correctly by MALDI-TOF (Fukuyama et al., 2014). Accordingly, we inferred that the proportion of sialylated glycoproteins lies between the values measured by the MS (55%) and lectin (90%) methods.
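When interpreting such spectra, it helps to know the mass a given glycan adds to the acceptor protein. The sketch below sums standard monoisotopic residue masses for the tetrasaccharide and the sialylated pentasaccharide described in this study, treating Gal as a hexose (Hex) and GlcNAc as an N-acetylhexosamine (HexNAc). It is an illustration only: intact-protein measurements are often interpreted with average rather than monoisotopic masses, and the names used are hypothetical.

```python
# Monoisotopic residue masses (Da): the mass each monosaccharide contributes
# within a glycan, i.e. after loss of water on glycosidic bond formation.
RESIDUE_MASS = {"Hex": 162.0528, "HexNAc": 203.0794, "Neu5Ac": 291.0954}

# Compositions of the two glycans described in the text (Gal = Hex, GlcNAc = HexNAc).
TETRASACCHARIDE = {"Hex": 2, "HexNAc": 2}
PENTASACCHARIDE = {"Hex": 2, "HexNAc": 2, "Neu5Ac": 1}


def glycan_mass_shift(composition: dict) -> float:
    """Mass (Da) added to the protein by an N-linked glycan of this composition."""
    return sum(RESIDUE_MASS[unit] * count for unit, count in composition.items())


# The difference between the two glycans is one Neu5Ac residue.
sialic_acid_shift = glycan_mass_shift(PENTASACCHARIDE) - glycan_mass_shift(TETRASACCHARIDE)
```

Under these assumptions, the sialylated and unsialylated glycoforms differ by the mass of a single Neu5Ac residue, about 291.1 Da.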

DISCUSSION
In this study, we successfully engineered a pathway to produce a terminally sialylated, homogeneous N-glycosylated protein with α-2,6-linkages in the periplasm of E. coli, and without the requirement to supply sialic acid in the medium. To our knowledge, this is the first demonstration of the production of such a terminally sialylated N-glycoprotein, with a sialic acid cap characteristic of human N-glycans (Neu5Ac-α-2,6-Gal-β-1,4-GlcNAc-), in E. coli. The molecular properties of the sialylated FN3 protein are currently being investigated, including its susceptibility to proteolytic degradation, solubility, persistence in the circulation, and immunogenicity.
Although the sialylated glycoprotein was successfully produced in E. coli, induction of the full sialylation pathway resulted in a detrimental effect on host cell growth and a greatly reduced yield of the sialylated protein, from 15 mg/L of unmodified FN3 to a maximum purified yield of 1.5 mg/L of the sialylated FN3. Similar observations have been noted in other studies of recombinant protein glycosylation (Lizak et al., 2011; Hu et al., 2013; Keys et al., 2017) and this effect is likely to be due to the increased metabolic burden associated with expression of the multiple additional genes required for glycosylation, particularly those under the control of strong promoters (Glasscock et al., 2018). Optimization of gene expression levels in the pathway through engineering of promoters and ribosome-binding sites will be important to better balance the protein expression, LLO production and glycosylation processes, thereby reducing the metabolic burden. Plasmid loss has also been identified as a basis for reduced yields (Lizak et al., 2011; Hu et al., 2013), with host cells required to maintain two or three compatible plasmids with different selectable markers to accommodate the multiple genes necessary for protein expression and glyco-modification. Previous results from our group indicated that FN3 could be produced in considerably higher amounts in E. coli CLM37 cells (Ding et al., 2017) than in the present E. coli DKK601 cells, and so further strain screening will be carried out to increase yields of the sialylated N-glycoprotein. Future work will focus on strain screening to improve stability during protein expression, and on combining glycoengineering genes onto a single plasmid or into the bacterial genome to increase sialylated protein yields (Strutton et al., 2018; Yates et al., 2019).
Evaluation of the sialylation efficiencies of other well characterized bacterial α-2,6-STases is also underway in our laboratory to further increase sialylation efficiency and build on the present breakthrough. The strategy reported in this study will be valuable for constructing E. coli strains capable of producing high yields of terminally sialylated N-glycoproteins.

CONCLUSION
The work presents the first pathway for the production of terminally sialylated, α-2,6-linked N-glycoproteins in E. coli.
As the engineered E. coli strain also synthesizes the CMP-Neu5Ac required for sialylation, this further eliminates the requirement to supply exogenous Neu5Ac in the medium and renders the process amenable to scale-up. The work constitutes an important step in efforts toward creating humanlike recombinant glycoproteins in E. coli.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusion of this manuscript will be made available by the authors, without undue reservation, to any qualified researcher.

AUTHOR CONTRIBUTIONS
XH and ND conceived the project. JZ, YR, XF, LZ, GG, TZ, and YZ performed the experiments and analyzed the data. JW and ND analyzed the data and revised the manuscript. XH, ND, and JZ wrote the manuscript. All authors contributed to manuscript revision, and read and approved the submitted version.