# PROCEEDINGS OF ISPMF 2018 - PLANT MOLECULAR FARMING

Edited by: Anneli Ritala, Heiko Rischer, Suvi Tuulikki Häkkinen, Jussi Joonas Joensuu and Kirsi-Marja Oksman-Caldentey

Published in: Frontiers in Plant Science

#### Frontiers eBook Copyright Statement

The copyright in the text of individual articles in this eBook is the property of their respective authors or their respective institutions or funders. The copyright in graphics and images within each article may be subject to copyright of other parties. In both cases this is subject to a license granted to Frontiers. The compilation of articles constituting this eBook is the property of Frontiers.

Each article within this eBook, and the eBook itself, are published under the most recent version of the Creative Commons CC-BY licence. The version current at the date of publication of this eBook is CC-BY 4.0. If the CC-BY licence is updated, the licence granted by Frontiers is automatically updated to the new version.

When exercising any right under the CC-BY licence, Frontiers must be attributed as the original publisher of the article or eBook, as applicable.

Authors have the responsibility of ensuring that any graphics or other materials which are the property of others may be included in the CC-BY licence, but this should be checked before relying on the CC-BY licence to reproduce those materials. Any copyright notices relating to those materials must be complied with.

Copyright and source acknowledgement notices may not be removed and must be displayed in any copy, derivative work or partial copy which includes the elements in question.

All copyright, and all rights therein, are protected by national and international copyright laws. The above represents a summary only. For further information please read Frontiers' Conditions for Website Use and Copyright Statement, and the applicable CC-BY licence.

ISSN 1664-8714 ISBN 978-2-88963-794-2 DOI 10.3389/978-2-88963-794-2


Topic Editors:

Anneli Ritala, VTT Technical Research Centre of Finland Ltd, Finland; Heiko Rischer, VTT Technical Research Centre of Finland Ltd, Finland; Suvi Tuulikki Häkkinen, VTT Technical Research Centre of Finland Ltd, Finland; Jussi Joonas Joensuu, VTT Technical Research Centre of Finland Ltd, Finland; Kirsi-Marja Oksman-Caldentey, VTT Technical Research Centre of Finland Ltd, Finland

Citation: Ritala, A., Rischer, H., Häkkinen, S. T., Joensuu, J. J., Oksman-Caldentey, K.-M., eds. (2020). Proceedings of ISPMF 2018 - Plant Molecular Farming. Lausanne: Frontiers Media SA. doi: 10.3389/978-2-88963-794-2

# Table of Contents


*Small, Smaller, Nano: New Applications for Potato Virus X in Nanotechnology*

Juliane Röder, Christina Dickmeis and Ulrich Commandeur

*Critical Evaluation of Strategies for the Production of Blood Coagulation Factors in Plant-Based Systems*

Oguz Top, Ulrich Geisen, Eva L. Decker and Ralf Reski

*Plant-Produced Chimeric VHH-sIgA Against Enterohemorrhagic* E. coli *Intimin Shows Cross-Serotype Inhibition of Bacterial Adhesion to Epithelial Cells*

Reza Saberianfar, Adam Chin-Fatt, Andrew Scott, Kevin A. Henry, Edward Topp and Rima Menassa

*Recombinant Production of MFHR1, A Novel Synthetic Multitarget Complement Inhibitor, in Moss Bioreactors*

Oguz Top, Juliana Parsons, Lennard L. Bohlender, Stefan Michelfelder, Phillipp Kopp, Christian Busch-Steenberg, Sebastian N. W. Hoernstein, Peter F. Zipfel, Karsten Häffner, Ralf Reski and Eva L. Decker

*Colicins and Salmocins – New Classes of Plant-Made Non-antibiotic Food Antibacterials*

Simone Hahn-Löbmann, Anett Stephan, Steve Schulz, Tobias Schneider, Anton Shaverskyi, Daniel Tusé, Anatoli Giritch and Yuri Gleba

*Epitope Presentation of Dengue Viral Envelope Glycoprotein Domain III on Hepatitis B Core Protein Virus-Like Particles Produced in* Nicotiana benthamiana

Ee Leen Pang, Hadrien Peyret, Alex Ramirez, Hwei-San Loh, Kok-Song Lai, Chee-Mun Fang, William M. Rosenberg and George P. Lomonossoff

*Rapid and Scalable Plant-Based Production of a Potent Plasmin Inhibitor Peptide*

Mark A. Jackson, Kuok Yap, Aaron G. Poth, Edward K. Gilding, Joakim E. Swedberg, Simon Poon, Haiou Qu, Thomas Durek, Karen Harris, Marilyn A. Anderson and David J. Craik

*Characterization of a GDP-Fucose Transporter and a Fucosyltransferase Involved in the Fucosylation of Glycoproteins in the Diatom* Phaeodactylum tricornutum

Peiqing Zhang, Carole Burel, Carole Plasson, Marie-Christine Kiefer-Meyer, Clément Ovide, Bruno Gügi, Corrine Wan, Gavin Teo, Amelia Mak, Zhiwei Song, Azeddine Driouich, Patrice Lerouge and Muriel Bardor

*Butterfly Pea (*Clitoria ternatea*), a Cyclotide-Bearing Plant With Applications in Agriculture and Medicine*

Georgianna K. Oguis, Edward K. Gilding, Mark A. Jackson and David J. Craik

*Rice Seeds as Biofactories of Rationally Designed and Cell-Penetrating Antifungal PAF Peptides*

Mireia Bundó, Xiaoqing Shi, Mar Vernet, Jose F. Marcos, Belén López-García and María Coca

*Critical Analysis of the Commercial Potential of Plants for the Production of Recombinant Proteins*

Stefan Schillberg, Nicole Raven, Holger Spiegel, Stefan Rasche and Matthias Buntru

*Production of Biopharmaceuticals in* Nicotiana benthamiana*—Axillary Stem Growth as a Key Determinant of Total Protein Yield*

Marie-Claire Goulet, Linda Gaudreau, Marielle Gagné, Anne-Marie Maltais, Ann-Catherine Laliberté, Gilbert Éthier, Nicole Bechtold, Michèle Martel, Marc-André D'Aoust, André Gosselin, Steeve Pepin and Dominique Michaud

*Plant and Microalgae Derived Peptides are Advantageously Employed as Bioactive Compounds in Cosmetics*

Fabio Apone, Ani Barbulova and Maria Gabriella Colucci

Aleyo Chabeda, Albertha R. van Zyl, Edward P. Rybicki and Inga I. Hitzeroth

*Effects of N-Glycosylation on the Structure, Function, and Stability of a Plant-Made Fc-Fusion Anthrax Decoy Protein*

Yongao Xiong, Kalimuthu Karuppanan, Austen Bernardi, Qiongyu Li, Vally Kommineni, Abhaya M. Dandekar, Carlito B. Lebrilla, Roland Faller, Karen A. McDonald and Somen Nandi

*Non-target Effects of Hyperthermostable α-Amylase Transgenic* Nicotiana tabacum *in the Laboratory and the Field*

Ian Melville Scott, Hong Zhu, Katherine Schieck, Amanda Follick, L. Bruce Reynolds and Rima Menassa

*Plant-Made Nervous Necrosis Virus-Like Particles Protect Fish Against Disease*

Johanna Marsian, Daniel L. Hurdiss, Neil A. Ranson, Anneli Ritala, Richard Paley, Irene Cano and George P. Lomonossoff

*Production and Immunogenicity of Soluble Plant-Produced HIV-1 Subtype C Envelope gp140 Immunogens*

Emmanuel Margolin, Rosamund Chapman, Ann E. Meyers, Michiel T. van Diepen, Phindile Ximba, Tandile Hermanus, Carol Crowther, Brandon Weber, Lynn Morris, Anna-Lise Williamson and Edward P. Rybicki

*Hairy Root Cultures—A Versatile Tool With Multiple Applications*

Noemi Gutierrez-Valdes, Suvi T. Häkkinen, Camille Lemasson, Marina Guillet, Kirsi-Marja Oksman-Caldentey, Anneli Ritala and Florian Cardon

# Editorial: Proceedings of ISPMF 2018 - Plant Molecular Farming

Anneli Ritala\*, Heiko Rischer, Suvi Tuulikki Häkkinen, Jussi Joonas Joensuu and Kirsi-Marja Oksman-Caldentey

VTT Technical Research Centre of Finland Ltd., Espoo, Finland

Keywords: plant molecular farming, recombinant protein, biopharmaceutical, Nicotiana benthamiana, glycosylation

**Editorial on the Research Topic**

#### **Proceedings of ISPMF 2018 - Plant Molecular Farming**

The Plant Molecular Farming Research Topic was launched at the 3rd Conference of the International Society of Plant Molecular Farming (ISPMF) in Helsinki, Finland, in June 2018. Altogether, the Research Topic attracted 31 manuscripts, of which 23 were accepted and published. The articles cover recent outcomes and success stories in Plant Molecular Farming, and some of the highlights are summarized below.

#### Edited by:

Angelos K. Kanellis, Aristotle University of Thessaloniki, Greece

#### Reviewed by:

Johannes Felix Buyel, Fraunhofer Institute for Molecular Biology and Applied Ecology, Fraunhofer Society (FHG), Germany; Antonio Granell, Consejo Superior de Investigaciones Científicas (CSIC), Spain

\*Correspondence: Anneli Ritala, anneli.ritala@vtt.fi

#### Specialty section:

This article was submitted to Plant Metabolism and Chemodiversity, a section of the journal Frontiers in Plant Science

Received: 18 February 2020; Accepted: 01 April 2020; Published: 28 April 2020

#### Citation:

Ritala A, Rischer H, Häkkinen ST, Joensuu JJ and Oksman-Caldentey K-M (2020) Editorial: Proceedings of ISPMF 2018 - Plant Molecular Farming. Front. Plant Sci. 11:492. doi: 10.3389/fpls.2020.00492

Knödler et al. reported an important finding related to the production of recombinant proteins in tobacco plants under greenhouse conditions. Their data indicate that high temperatures (>28°C) and intense illuminance (>45 klx h<sup>−1</sup>) may cause substantial losses in target protein yield due to stability problems. They showed that up to 90% of the product can be lost under such extreme conditions. This instability caused by environmental factors needs to be taken into consideration when designing contained production facilities for recombinant proteins. A fully controlled indoor farm might even turn out to be the most cost-effective choice.

The paper by Goulet et al. describes a largely neglected phenomenon: how basic culture conditions can substantially influence the growth and overall performance of plants producing recombinant pharmaceuticals. They used a transient protein expression system in N. benthamiana with a promising vaccine antigen, influenza hemagglutinin H1, as a model. They demonstrated that the H1 antigen is not evenly distributed in the plant. Production yields were strongly influenced by leaf age, with young leaves being better producers than older ones. The axillary stem leaves contributed more than 50% of the total antigen yield even though they represented less than 30% of the total biomass.
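To make the last figure concrete, the two bounds quoted above (more than 50% of the yield from less than 30% of the biomass) imply a minimum per-gram productivity advantage for axillary stem leaves over the rest of the plant. The short Python sketch below works through that arithmetic; the function name is hypothetical and the inputs are simply the boundary values from the summary, not measured data from Goulet et al.

```python
def productivity_ratio(yield_share: float, biomass_share: float) -> float:
    """Per-unit-biomass yield of a tissue relative to the remainder of the plant.

    yield_share:   fraction of total antigen yield contributed by the tissue
    biomass_share: fraction of total biomass represented by the tissue
    """
    if not (0 < yield_share < 1 and 0 < biomass_share < 1):
        raise ValueError("shares must lie strictly between 0 and 1")
    tissue = yield_share / biomass_share            # yield per unit biomass, tissue
    rest = (1 - yield_share) / (1 - biomass_share)  # yield per unit biomass, remainder
    return tissue / rest


# Boundary values from the summary: >50% of yield, <30% of biomass.
# A larger yield share or a smaller biomass share only increases the ratio,
# so this value is a lower bound under those two statements.
ratio = productivity_ratio(0.50, 0.30)
print(f"Axillary stem leaves: at least {ratio:.2f}x the per-gram yield of other tissue")
```

Under these assumptions the axillary stem leaves are at least about 2.3 times as productive per unit biomass as the rest of the plant.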

Protein properties such as folding, structure, and function are substantially affected by N-glycosylation. Glycoprotein-based therapeutics in particular therefore rely on optimal N-glycosylation patterns. Xiong et al. produced three variants of the anthrax decoy protein rCMG2-Fc, an antitoxin resulting from the fusion of receptor Capillary Morphogenesis Gene 2 protein with the salvage neonatal Fc-receptor, in Nicotiana benthamiana plants. Notably, the authors show that all variants were able to bind the protective antigen of the anthrax toxin. Expression, integrity and thermostability were, however, differentially affected by glycosylation.

Soluble envelope (Env) glycoproteins constitute promising antigens for human immunodeficiency virus type 1 (HIV-1) vaccine development. Margolin et al. report the development of an Agrobacterium-mediated transient expression system for the production of cognate soluble HIV-1 subtype C gp140 antigens in N. benthamiana as an alternative to conventional production in mammalian cells, which is costly and limited in scalability. They present the successful production of trimeric soluble HIV-1 Env protein, although with low yields of ∼5–6 mg/kg fresh weight of purified protein, and demonstrate promising immunogenicity in rabbits.

Nanotechnology is a rapidly advancing field that applies nanostructured materials from various organic and inorganic sources. Applications in vaccination, electronics, and bioimaging have been developed, and especially for the latter, Potato virus X (PVX) particles show unforeseen potential. They are able to carry large payloads and, owing to their filamentous structure, their tumor homing and retention properties are better than those of spherical structures. In addition, being protein-based nanoparticles, they are more suitable for biomedical applications than their synthetic competitors. A review by Röder et al. gathers technologies applying PVX nanoparticles and describes future opportunities and challenges for the utilization of PVX nanoparticles in various scientific fields.

Food-related bacterial outbreaks are occurring with increasing frequency and severity, enhanced by the globalization of food production and the active transportation of food ingredients and products. In addition, increasing consumer interest in "organic" foods results in the avoidance or reduction of chemicals and antibiotics by farmers and the processing industry, leading to practices that may introduce additional risks of bacterial contamination. Hahn-Löbmann et al. describe the use of plant-made recombinant proteins, namely colicins and salmocins, in food processing applications. Nomad Bioscience, using the GRAS (Generally Recognized As Safe) regulatory process in the United States, has obtained favorable regulatory review and marketing allowance from the FDA for its Escherichia-derived antibacterial proteins, colicins, for food use. Salmocins, colicin-type proteins derived from Salmonella, are currently in the GRAS approval process. The research and development, as well as the regulatory and economic aspects, of both compounds as food antimicrobials are discussed.

Human papilloma virus (HPV) tumor disease causes major health risks, especially in developing countries. Massa et al. used tomato hairy roots to produce therapeutic vaccines against HPV, expressing a harmless form of the HPV type 16 E7 protein fused to a non-cytotoxic form of the saporin protein. Hairy root clones were obtained by infecting leaves of Solanum lycopersicum, yielding approximately 35.5 µg of expressed protein per g of fresh biomass. An immunological response associated with anticancer activity was shown; in particular, the synergistic combination of DNA as prime and hairy root extract as boost demonstrated the highest efficacy. This work describes the possibilities of using hairy root technology for plant-based biofactories and shows its great potential for biomedical applications.

Schillberg et al. presented a critical review of plant-based expression systems, including the highlights and bottlenecks of the platform. The most obvious obstacles are persistently low yields and the challenges of downstream processing, i.e., purification of the product from the plant matrix. The most prominent opportunities of plant expression systems are improved or altered protein functionality and the speed of transient expression systems. Increasing consumer demand for animal-free and more "natural" products can also be seen as an asset of the system. The healthy realism presented in the review is most welcome as we prepare ourselves for the future. More commercially viable examples are needed to drive plant-based production further.

The importance of the Plant Molecular Farming Research Topic is clearly seen in the statistics. Within only a year of the first accepted paper, the Research Topic has gained more than 50,000 views and 7,000 downloads. The 4th ISPMF conference was planned to take place in Rome, Italy on June 8–10, 2020. Unfortunately, the COVID-19 outbreak forced us to cancel the meeting and postpone it to the future. We are convinced that we will get together again, and that we will then have a great 4th ISPMF conference with excellent presentations showcasing outstanding data on the latest breakthroughs in Plant Molecular Farming. Stay tuned at http://www.ispmf.org/.

#### AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

**Conflict of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2020 Ritala, Rischer, Häkkinen, Joensuu and Oksman-Caldentey. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Small, Smaller, Nano: New Applications for Potato Virus X in Nanotechnology

#### Juliane Röder†, Christina Dickmeis† and Ulrich Commandeur\*

Institute for Molecular Biotechnology, RWTH Aachen University, Aachen, Germany

Nanotechnology is an expanding interdisciplinary field concerning the development and application of nanostructured materials derived from inorganic compounds or organic polymers and peptides. Among these latter materials, proteinaceous plant virus nanoparticles have emerged as a key platform for the introduction of tailored functionalities by genetic engineering and conjugation chemistry. Tobacco mosaic virus and Cowpea mosaic virus have already been developed for bioimaging, vaccination and electronics applications, but the flexible and filamentous Potato virus X (PVX) has received comparatively little attention. The filamentous structure of PVX particles allows them to carry large payloads, which are advantageous for applications such as biomedical imaging in which multi-functional scaffolds with a high aspect ratio are required. In this context, PVX achieves superior tumor homing and retention properties compared to spherical nanoparticles. Because PVX is a protein-based nanoparticle, its unique functional properties are combined with enhanced biocompatibility, making it much more suitable for biomedical applications than synthetic nanomaterials. Moreover, PVX nanoparticles have very low toxicity in vivo, and superior pharmacokinetic profiles. This review focuses on the production of PVX nanoparticles engineered using chemical and/or biological techniques, and describes current and future opportunities and challenges for the application of PVX nanoparticles in medicine, diagnostics, materials science, and biocatalysis.

Keywords: plant virus, genetic engineering, chemical conjugation, nanoparticles, imaging, drug delivery, bioinspired materials

## WHAT A WONDERFUL WORLD: PLANT VIRUS NANOPARTICLES

In the rapidly evolving interdisciplinary field of nanotechnology, VNPs are receiving more and more attention due to their outstanding structural characteristics and ease of functionalization compared to synthetic nanoparticles. Plant VNPs are particularly attractive because they are non-infectious in humans and thus inherently safe. Numerous copies of one or more identical CP subunits self-assemble into a defined spherical or rod-shaped particle (depending on the virus species), many of which have been characterized to atomic resolution (Lin et al., 1999; Clare and Orlova, 2010; Lee et al., 2012; Kendall et al., 2013; Agirrezabala et al., 2015). Although different viruses have distinct surface properties, these can easily be tailored to achieve a desired function by genetic engineering or chemical conjugation, or a combination of both, allowing the precise nanoscale control of VNP structure and function. Large quantities of plant VNPs can be produced in the laboratory by molecular farming, in which plants are used as a virus production factory. The resulting VNPs are highly stable under a wide range of conditions.

#### Edited by:

Suvi Tuulikki Häkkinen, VTT Technical Research Centre of Finland Ltd., Finland

#### Reviewed by:

Chiara Lico, Italian National Agency for New Technologies, Energy and Sustainable Economic Development (ENEA), Italy; Tomas Moravec, Institute of Experimental Botany (ASCR), Czechia

\*Correspondence: Ulrich Commandeur, Ulrich.Commandeur@molbiotech.rwth-aachen.de

†These authors have contributed equally to this work

#### Specialty section:

This article was submitted to Plant Biotechnology, a section of the journal Frontiers in Plant Science

Received: 23 November 2018; Accepted: 29 January 2019; Published: 19 February 2019

#### Citation:

Röder J, Dickmeis C and Commandeur U (2019) Small, Smaller, Nano: New Applications for Potato Virus X in Nanotechnology. Front. Plant Sci. 10:158. doi: 10.3389/fpls.2019.00158

**Abbreviations:** CALB, Candida antarctica lipase B; CP, coat protein; dpi, days post-inoculation; ELISA, enzyme-linked immunosorbent assay; FMDV, Foot-and-mouth disease virus; GFP, green fluorescent protein; MIP, mineralization-inducing peptide; ORF, open reading frame; pI, isoelectric point; PEG, polyethylene glycol; PVX, Potato virus X; SC, SpyCatcher; sgRNA, subgenomic RNA; ST, SpyTag; TGB, triple gene block; TMV, Tobacco mosaic virus; VNP, virus nanoparticle.

The two most popular plant VNP platforms are Cowpea mosaic virus and TMV, and their applications have been extensively reviewed (McCormick and Palmer, 2008; Soto and Ratna, 2010; Pokorski and Steinmetz, 2011; Alonso et al., 2013; Love et al., 2014; Wen and Steinmetz, 2016). In contrast, the development of PVX for pharmaceutical and imaging applications has only been discussed in a single review article thus far, even though PVX-based VNPs are unique in their ability to offer multi-functional flexible scaffolds with a high aspect ratio (Lico et al., 2015). In this article, we therefore focus exclusively on the modification of PVX and its applications in medicine, diagnostics, materials science and biocatalysis.

## COME AS YOU ARE: POTATO VIRUS X

#### The Way I Tend to Be: Characteristics

Potato virus X belongs to the family Alphaflexiviridae and is the type member of the genus Potexvirus (Adams et al., 2004). It is considered important among plant pathogens that infect agricultural plants of the family Solanaceae, especially potato, tomato and tobacco, and is transmitted via mechanical contact.

Potato virus X has a 6.4-kb positive-stranded RNA genome containing five ORFs, with a 5′-methylguanosine cap and a polyadenylated 3′-end (Koonin, 1991; Kim and Hemenway, 1997). The first ORF encodes the 166-kDa RNA-dependent RNA polymerase which is required for virus replication, whereas cell-to-cell movement is mediated by p25, p12 and p8, the products of three overlapping ORFs known as the TGB (Angell et al., 1996; Verchot et al., 1998; Draghici et al., 2009). In addition, p25 is also a silencing inhibitor (Bayne et al., 2005; Chiu et al., 2010). The fifth ORF encodes the CP, multiple copies of which assemble to form the capsid around the genomic RNA. The CP is also important for cell-to-cell and long-distance (systemic) transport through the plant (Tollin and Wilson, 1988; Chapman et al., 1992; Fedorkin et al., 2001; Betti et al., 2012). The virus proteins are translated from three sgRNAs: sgRNA1 (2.1 kb) expresses TGB p25; sgRNA2 (1.4 kb) expresses TGB p12 and p8, the latter by leaky scanning (Dolja et al., 1987; Verchot et al., 1998); and sgRNA3 (0.9 kb) expresses the CP (Dolja et al., 1987).
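The genome organization described above can be summarized as a small lookup structure. The following Python sketch is only an illustrative encoding of the statements in this paragraph: the class name, the key `RdRp`, and the helper `sgrna_for` are hypothetical labels chosen here, and no sgRNA is assigned to the polymerase because the text does not attribute one to it.

```python
from dataclasses import dataclass

GENOME_LENGTH_KB = 6.4  # positive-stranded RNA genome, as stated in the text


@dataclass(frozen=True)
class SubgenomicRNA:
    name: str
    length_kb: float
    products: tuple  # proteins expressed from this sgRNA


# The five ORF products in genome order, with the roles given in the text.
# "RdRp" is a shorthand key introduced here for the RNA-dependent RNA polymerase.
ORF_PRODUCTS = {
    "RdRp": "166-kDa RNA-dependent RNA polymerase; virus replication",
    "p25": "TGB protein; cell-to-cell movement and silencing inhibition",
    "p12": "TGB protein; cell-to-cell movement",
    "p8": "TGB protein; cell-to-cell movement (translated by leaky scanning)",
    "CP": "coat protein; capsid formation, cell-to-cell and systemic transport",
}

SGRNAS = (
    SubgenomicRNA("sgRNA1", 2.1, ("p25",)),
    SubgenomicRNA("sgRNA2", 1.4, ("p12", "p8")),
    SubgenomicRNA("sgRNA3", 0.9, ("CP",)),
)


def sgrna_for(product: str):
    """Return the sgRNA name that expresses `product`, or None if the text assigns none."""
    for sgrna in SGRNAS:
        if product in sgrna.products:
            return sgrna.name
    return None


for product in ORF_PRODUCTS:
    print(product, "->", sgrna_for(product))
```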

The high-resolution structure of isolated PVX CP subunits has not yet been solved, although models have been proposed (Nemykh et al., 2008; Kendall et al., 2013). The 515 × 14.5 nm flexuous rod-shaped particle (**Figure 1**) comprises 1270 CP subunits with 8.90 ± 0.01 subunits per turn, forming a 3.45 nm helical pitch (Tollin et al., 1980; Parker et al., 2002). Each CP subunit is thought to contain seven α-helices and six β-strands, with the C-terminus located inside the assembled particle and the N-terminus projected externally (Sober et al., 1988; Baratova et al., 1992; Nemykh et al., 2008). The N-terminus therefore provides an excellent site for the presentation of recombinant peptides, which is achieved by introducing the corresponding sequence at the 5′-end of the cp gene (see further discussion in Section "Change the World: Genetic Engineering").
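The helical parameters quoted above can be cross-checked against the reported particle length with simple arithmetic. The Python sketch below is a consistency check only, assuming an ideal, uniform helix; the variable names are chosen here for illustration, and the values are taken directly from the preceding paragraph.

```python
# Values quoted in the text
SUBUNITS = 1270              # CP subunits per particle
SUBUNITS_PER_TURN = 8.90     # helical symmetry
PITCH_NM = 3.45              # axial advance per helical turn
REPORTED_LENGTH_NM = 515.0   # reported particle length

# Number of helical turns needed to accommodate all subunits
turns = SUBUNITS / SUBUNITS_PER_TURN

# Particle length predicted for an ideal, uniform helix
predicted_length_nm = turns * PITCH_NM

# Axial rise contributed by each subunit
rise_per_subunit_nm = PITCH_NM / SUBUNITS_PER_TURN

# Relative difference between reported and predicted length
relative_gap = (REPORTED_LENGTH_NM - predicted_length_nm) / REPORTED_LENGTH_NM

print(f"turns: {turns:.1f}")
print(f"predicted length: {predicted_length_nm:.1f} nm")
print(f"rise per subunit: {rise_per_subunit_nm:.3f} nm")
print(f"gap to reported length: {relative_gap:.1%}")
```

Under this idealization, the predicted length is roughly 492 nm, within about 5% of the reported 515 nm, so the quoted figures are mutually consistent to within the precision one would expect from values taken from different studies.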

In-frame deletions of the cp 5′-end that affect PVX infectivity and particle morphology were first described by Chapman et al. (1992). These deletions were shown to produce intact virions capable of systemic infection, but electron microscopy revealed an atypical twisted morphology similar to that of particles exposed to trypsin (Tremaine and Agrawal, 1972). These data suggested that the N-terminus influences intramolecular and/or intermolecular interactions that stabilize the virus structure.

The apparent molecular weight of the PVX CP as determined by sodium dodecylsulfate polyacrylamide gel electrophoresis changes when the CP is exposed to trypsin, which removes an N-terminal segment. The latter comprises a highly conserved cluster of serine and threonine residues representing potential glycosylation sites. When these are replaced with alanine or glycine, the glycans are not added (Kozlovsky et al., 2003; Baratova et al., 2004). The PVX CP contains a single O-linked hexose monosaccharide (galactose or fucose) joined to the acetylated N-terminal serine residue (NAcSer1). This carbohydrate alters the electrophoretic mobility of the CP and induces the formation of a columnar shell of bound water molecules (Tozzini et al., 1994; Kozlovsky et al., 2003; Baratova et al., 2004). Moreover, sequence analysis revealed in-frame deletions affecting the first 29 residues of the CP (Δ29CP) in late infection passages of recombinant PVX particles displaying two Beet necrotic yellow vein virus epitopes (Uhde-Holzem et al., 2007). The N-terminus of the CP is therefore thought to help maintain the helical structure of the virus during assembly, and may influence the assembly and stability of PVX particles. These phenomena must be taken into consideration when creating PVX mutants (see further discussion in Section "Change the World: Genetic Engineering").

#### All the Small Things: Virus Assembly and Virus-Like Particles

In contrast to the Potexvirus Papaya mosaic virus (Erickson and Bancroft, 1978), PVX CP subunits have not yet been shown to assemble into filamentous virus-like particles in the absence of RNA either in vivo or in vitro. This is likely to reflect the specific recognition of the virus genomic RNA by the CP, which plays a key role during the assembly of the virion (Kwon et al., 2005). The genomic RNA region which interacts specifically with the CP is known as the origin of assembly and similar structures have been identified in other plant viruses, e.g., TMV, Brome mosaic virus and Turnip crinkle virus (Butler, 1984; Sit et al., 1994; Miller et al., 1998; Choi and Rao, 2003; Arkhipenko et al., 2011). In the case of PVX, the origin of assembly is at the end of the 5′-region of the RNA, defined as stem loop 1 (Miller et al., 1998; Cho et al., 2012). This secondary structure forms within the nucleotide sequence spanning positions 32–106 of the 5′-region, and consists of four stems (SA, SB, SC, SD), three internal asymmetric loops (LA, LB, LC), and a terminal tetraloop (Park et al., 2008). A portion of the stem loop 1 region comprising nucleotides 32–47 and 86–106 (SA, SB, LA, LB) is likely to adopt multiple conformations (Miller et al., 1998). This secondary structure contributes to functions such as virus replication, translation and cell-to-cell transport, as well as influencing the virion composition (Kim and Hemenway, 1997; Kwon et al., 2005; Lough et al., 2006). It is likely that the functional properties of the origin of assembly are conferred by its structure rather than a particular nucleotide sequence, and this is important for the assembly of PVX with (heterologous) RNAs into VNPs (Kwon et al., 2005; Park et al., 2008; Arkhipenko et al., 2011).

When heated to 70°C, filamentous PVX particles begin to swell at one or both ends (Nikitin et al., 2016). Increasing the temperature to 90°C for 10 s resulted in the formation of spherical PVX virus-like particles (Nikitin et al., 2016), similar to TMV structures formed at higher temperatures, as reported by Atabekov et al. (2011). The average diameter of these spherical PVX particles was 48 and 77 nm at concentrations of 0.1 and 1.0 mg ml<sup>−1</sup>, respectively. However, increasing the virus concentration to 10 mg ml<sup>−1</sup> did not cause any further change in the diameter. These particles did not contain RNA and were not resistant to detergents such as 0.15% sodium dodecylsulfate. Analysis of the predicted secondary structure of the denatured CP revealed some differences compared to the filamentous PVX particles (DiMaio et al., 2015). The α-helical content was 14–19% and the β-sheet content was 28–33%, with 53% of the protein remaining unordered. However, most native epitopes were retained on the particle surface so these atypical particles may still be suitable for the presentation of antigens.

#### Don't Stop 'Til You Get Enough: Production of VNPs

PVX VNPs are typically produced by the infection of tobacco (Nicotiana) species, including N. benthamiana, N. tabacum Xanthi nc or Samsun NN, N. clevelandii or N. glutinosa. To propagate the (recombinant) virus, plants are inoculated with a PVX-derived vector comprising a cDNA copy of the viral genome under control of the Cauliflower mosaic virus 35S promoter. The vector contains either the wild-type cp gene or a corresponding gene fusion allowing the external display of a peptide. Plants can also be infected with the cDNA copy using Agrobacterium tumefaciens. However, this approach is mostly used to express recombinant proteins from deconstructed PVX vectors rather than to produce VNPs (Peyret and Lomonossoff, 2015). For direct plasmid inoculation, the leaf surface of 4-week-old plants is gently treated with Celite 545 or a similar abrasive and three leaves are inoculated with 10 µg plasmid DNA (Lee et al., 2014). Infected plants should be harvested 14–21 dpi, which is sufficient time for the establishment of a systemic infection. Particles are usually purified according to a modified protocol from the International Potato Center (Lima, Peru). Detailed protocols have been published (Lee et al., 2014; Lauria et al., 2017; Shukla et al., 2018).
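For planning purposes, the 14–21 dpi harvest window mentioned above translates directly into calendar dates. The Python sketch below shows one way to compute it; the function name and the example inoculation date are hypothetical, and only the 14 and 21 day bounds come from the text.

```python
from datetime import date, timedelta

# Harvest window in days post-inoculation (dpi), as given in the text
HARVEST_WINDOW_DPI = (14, 21)


def harvest_window(inoculation: date, window=HARVEST_WINDOW_DPI):
    """Return the earliest and latest harvest dates for a given inoculation date."""
    first_dpi, last_dpi = window
    return (
        inoculation + timedelta(days=first_dpi),
        inoculation + timedelta(days=last_dpi),
    )


# Hypothetical inoculation date, used only to illustrate the calculation
start, end = harvest_window(date(2024, 3, 1))
print(f"Harvest between {start.isoformat()} and {end.isoformat()}")
```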

Once successful infection is established, the VNPs can easily be propagated by using plant extracts or purified VNPs for the direct inoculation of uninfected plants (Uhde-Holzem et al., 2007). However, the genetic instability of the recombinant RNA genomes over serial passages of infection is a major limitation (Avesani et al., 2007; Dickmeis et al., 2014). Vector DNA or plant extracts from the first round of infection should therefore be used to ensure the reproducible production of VNPs.

## MASTER OF PUPPETS: STRATEGIES FOR THE CREATION OF MODIFIED PVX NANOPARTICLES

Modified PVX nanoparticles can be produced by methods that result in either a permanent or reversible functionalization introduced by genetic engineering and/or chemical conjugation. The choice of production approach depends on which properties are required in the VNP. Thus far, PVX nanoparticles have been used as scaffolds for external peptide presentation. Given that particles can only form in the presence of genomic RNA, the steric limitations of the virus morphology make it difficult to display peptides in the internal channel or to use this channel to carry a payload of drugs or imaging molecules. Exceptionally, hydrophobic substances such as the drug doxorubicin have been used for traceless deposition by spontaneous attachment to the surface grooves of the virus (Le et al., 2017b). However, the major drawback of this method is the need for a high molar excess of the drug and a long reaction time. **Figure 2** summarizes the types of modifications used thus far for the production of PVX-based VNPs.

#### Change the World: Genetic Engineering

Genetic engineering is the preferred strategy to modify VNPs when the aim is to display single amino acids or small peptides. Recombinant PVX particles can be created by adding the target sequence in frame at the 5′-end of the cp gene, whereas insertions at the 3′-end tend to be detrimental, probably because they inhibit virus replication and assembly (Chapman et al., 1992; Hoffmeisterova et al., 2012). Target sequences introduced at the 5′-end must meet certain criteria to allow the virus to assemble into functional particles and move locally and systemically within the plant. A major limitation in this context is the size of the target peptide, which generally must be no longer than 60 amino acids (Uhde-Holzem et al., 2016). This is because PVX avoids extra genetic load by selecting against larger insertions. However, we recently found that the fluorescent protein iLOV (113 amino acids) could be fused directly to the CP without impairing the assembly and systemic movement of the virus (Röder et al., 2018), as discussed further in Section "Light Me Up: Imaging With PVX VNPs". Virion assembly is sensitive to steric hindrance (Dawson et al., 1989; Cruz et al., 1996), and cell-to-cell movement is inhibited by the presence of too many tryptophan residues (Lico et al., 2006). Furthermore, the absence of serine and threonine (Lico et al., 2006; Betti et al., 2012) or the presence of too many positively charged amino acids (Uhde-Holzem et al., 2007) can make the virus unstable, resulting in the selection of compensatory deletion mutants over several serial passages. In fact, serine and threonine residues are essential for phosphorylation and glycosylation, thereby stabilizing the particles by creating a surrounding water shell (Baratova et al., 2004; Lico et al., 2006). The pI is another factor to consider when designing a CP fusion protein. If the pI of the CP fusion is in the range 5.2–9.2, the assembled particle can move systemically (Lico et al., 2006; Uhde-Holzem et al., 2007). Otherwise, the pI must be adjusted by introducing a compensatory sequence such as the acidic DEADDAED peptide (Röder et al., 2017). Certain peptides favor an additional flexible glycine/serine-rich linker, including mineralization-inducing peptide 3 (MIP3) as discussed in Section "Material Girl: PVX for Biomaterial Applications" (Lauria et al., 2017).
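The design criteria for N-terminal inserts (length, serine/threonine content, tryptophan content and pI) lend themselves to a quick pre-screen before cloning. The following Python sketch is illustrative only. The pKa table is one approximate textbook-style set and is not taken from the cited studies, the function names are hypothetical, and the pI check is only evaluated when the caller supplies the full CP fusion sequence, which is not given here. Because no numerical cutoff for tryptophan is stated, the residue count is simply reported.

```python
# Approximate pKa values (assumed; other published scales shift the pI slightly).
PKA_POSITIVE = {"K": 10.5, "R": 12.5, "H": 6.0}
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}
PKA_N_TERM, PKA_C_TERM = 9.0, 2.0


def net_charge(seq, ph):
    """Henderson-Hasselbalch net charge of a linear peptide at the given pH."""
    charge = 1.0 / (1.0 + 10 ** (ph - PKA_N_TERM))
    charge -= 1.0 / (1.0 + 10 ** (PKA_C_TERM - ph))
    for aa in seq.upper():
        if aa in PKA_POSITIVE:
            charge += 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE[aa]))
        elif aa in PKA_NEGATIVE:
            charge -= 1.0 / (1.0 + 10 ** (PKA_NEGATIVE[aa] - ph))
    return charge


def isoelectric_point(seq):
    """pH at which the net charge is zero, found by bisection on [0, 14]."""
    low, high = 0.0, 14.0
    for _ in range(60):
        mid = (low + high) / 2.0
        if net_charge(seq, mid) > 0:
            low = mid   # still positive: the pI lies at a higher pH
        else:
            high = mid
    return (low + high) / 2.0


def screen_insert(insert, fusion_seq=None):
    """Report the criteria named in the text for an N-terminal CP insert."""
    insert = insert.upper()
    report = {
        "length_ok": len(insert) <= 60,          # size guideline from the text
        "has_ser_or_thr": any(aa in "ST" for aa in insert),
        "trp_count": insert.count("W"),          # no cutoff given, count only
        "pI_ok": None,                           # needs the full fusion sequence
    }
    if fusion_seq is not None:
        report["pI_ok"] = 5.2 <= isoelectric_point(fusion_seq) <= 9.2
    return report


if __name__ == "__main__":
    # The acidic compensatory peptide named in the text, screened on its own.
    print(screen_insert("DEADDAED"))
    print(round(isoelectric_point("DEADDAED"), 2))
```

Such a screen can only flag candidates for closer inspection. The 60-residue guideline is not absolute, as the directly fused 113-residue iLOV protein shows, and a computed pI depends on the pKa scale chosen.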

The assembly of particles comprising CP fusion proteins containing more than 60 additional amino acids can be facilitated by mixing recombinant and wild-type CPs, thus overcoming the steric hindrance between recombinant CPs in homogeneous recombinant particles. This can be achieved by introducing the ribosomal skip sequence from Foot-and-mouth disease virus (FMDV), known as the 2A sequence, between the 3′-end of the inserted sequence and the 5′-end of the wild-type cp gene (Cruz et al., 1996; Donnelly et al., 2001). This overcoat strategy allows entire proteins to be presented on recombinant virus particles, including fluorescent proteins (Cruz et al., 1996; Shukla et al., 2014a), enzymes such as lipase (Carette et al., 2007), epitopes (Marconi et al., 2006; Zelada et al., 2006; Uhde-Holzem et al., 2010), the rotavirus VP6 protein (O'Brien et al., 2000), and a single-chain antibody (Smolenska et al., 1998), as discussed further in Section "Knowing Me, Knowing You: Biosensing." However, the 2A sequence does not ameliorate the general instability of vectors carrying large inserts, and selection pressure still tends to favor their deletion (Scholthof et al., 1996). For example, the GFP sequence was deleted from the vector PVX-GFP-2A-CP after 28 days (Shukla et al., 2014a).

Vaculik et al. (2015a) identified four promising internal transgene insertion positions within the surface loops of the PVX CP. Of the epitope insertion mutants tested, only the one carrying the insertion after amino acid 23 was infectious and produced particles (Vaculik et al., 2015b). However, this position still belongs to the N-terminal intrinsically disordered domain of potexviruses (Solovyev and Makarov, 2016) and is therefore not essential for particle assembly. This was shown by Chapman et al. (1992), who removed codons 7–31 of the PVX CP and obtained virions with atypical morphology. In fact, spontaneous deletions in this region occur during infection with recombinant particles displaying epitopes (Lico et al., 2006; Uhde-Holzem et al., 2007). Furthermore, deletions up to residue 29 still produced particles, although only in low amounts (Lico et al., 2006; Uhde-Holzem et al., 2007; Dickmeis et al., 2014). An epitope insertion immediately after amino acid 23 might therefore be beneficial for virus stability and yield.

#### Catch Me If You Can: Sticky Particles

In addition to the limitations conferred by the size constraints and instability of inserted sequences (see Change the World: Genetic Engineering), another issue is the inability to present large peptides and proteins that require posttranslational modifications for functionality, because virus replication and assembly occur exclusively in the cytoplasm. This problem can be addressed using chemical conjugation methods, as demonstrated by the conjugation of the heavy chain of the breast cancer drug Trastuzumab/Herceptin (Esfandiari et al., 2016). However, chemical methods for the attachment of proteins require a 1000-fold molar excess of the protein and very long reaction times, yet still result in poor conjugation efficiencies (typically 21–86%) depending on the conjugation strategy and the size of the target molecule (Schlick et al., 2005; Holder et al., 2010; Venter et al., 2011; Wen et al., 2012). These cases clearly reveal the need for a rapid and site-specific covalent immobilization method. We recently demonstrated the stable attachment of a functional endoglucanase to PVX using the SpyTag/SpyCatcher (ST/SC) system (Zakeri et al., 2012; Röder et al., 2017). We modified PVX VNPs to display the short ST (see A Little More Action, Please: Catalysis), allowing the rapid, specific and irreversible attachment of an SC fusion protein with, in this case, a ∼70% coupling efficiency. Problems resulting from chemical coupling or genetic engineering methods, including size constraints and inappropriate amino acid compositions, can be overcome using this approach. PVX-ST VNPs may therefore provide a widely applicable platform for the display of large or posttranslationally modified proteins.

# The Chemistry Between Us: Chemical Addressability

Potato virus X can be modified not only by genetic engineering but also by chemical conjugation, which is advantageous when the functionalization is conferred not by small peptides but by whole proteins, polymers or small molecules such as fluorescent dyes. Each PVX CP bears numerous amine and carboxylate groups among its 11 lysine, 10 aspartic acid, 10 glutamic acid and 3 cysteine residues, although only a single lysine residue and a single cysteine residue are exposed to the solvent, making them addressable using N-hydroxysuccinimide and maleimide chemistry, respectively (Pierpoint, 1974; Gres et al., 2012; Le et al., 2017a). PVX can be made more amenable to conjugation reactions by inserting additional amino acids carrying suitable exposed side chains, using the genetic engineering methods described in Section "Change the World: Genetic Engineering" (Wang et al., 2002; Geiger et al., 2013). Further potential targets for chemical modification are the glycans present in some strains of PVX, but earlier studies showed that they are not addressable (Gres et al., 2012). Various conjugation methods including click chemistry have been comprehensively reviewed (Pokorski and Steinmetz, 2011). However, these methods suffer from poor conjugation efficiencies and a large molar excess of the target molecule is generally required (Schlick et al., 2005; Holder et al., 2010; Venter et al., 2011). The fluorescent dye OregonGreen 488 was conjugated to PVX particles using both Lys-N-hydroxysuccinimide and Cys-maleimide chemistry, with the former achieving the best performance resulting in the modification of up to 15% of the CPs (Le et al., 2017a). This may reflect the low accessibility of the Cys residue, which is thought to be located within a surface groove.

# Two Princes: PVX Coat Protein Expression Using a TMV Co-vector

For the construction of peptide vaccines, it is often advantageous to present several different epitopes on a single scaffold to induce a strong immune response (Sette and Peters, 2007). This can be challenging when using plant viruses because viruses of the same species with different CP modifications cannot achieve simultaneous infections due to the phenomenon of super-infection exclusion (Folimonova, 2012; Julve et al., 2013; Zhang et al., 2018). The presentation of several epitopes on a plant virus can be achieved by (separate) heterologous expression followed by in vitro assembly (Eiben et al., 2014; Tyler et al., 2014; Jin et al., 2016). However, as stated in Section "All the Small Things: Virus Assembly and Virus-Like Particles", PVX cannot assemble without its genomic RNA and in vitro assembly does not achieve high yields of VNPs.

To address these issues, we developed an expression system for the construction of chimeric PVX particles displaying different proteins as CP fusions. We used combinations of PVX and TMV expression vectors, each expressing a different PVX CP fusion, and achieved proof of principle using fusion proteins containing GFP and mCherry as well as split-mCherry (Dickmeis et al., 2015). We also reported the first co-expression of CP<sub>PVX</sub> using a full-sized PVX expression vector, a remarkable achievement given that the expression of a virus CP often leads to cross-protection against other strains of the same virus (Gal-On and Shiboleth, 2006; Lin et al., 2007). The expression of CP<sub>TMV</sub> by PVX prevents co-infection with a TMV vector, leading to CP-mediated resistance in N. benthamiana (Lu et al., 1998). In contrast, the expression of CP<sub>PVX</sub> by TMV does not have this effect, allowing robust co-expression with the PVX vector. The TMV-derived CP<sub>PVX</sub> did not appear to inhibit PVX infection, although the TMV infection process is faster and CP<sub>PVX</sub> is expressed in the cells before the PVX vector gains entry. On the contrary, we found that PVX infection was enhanced by the TMV vector, yielding brighter fluorescence for the PVX-expressed fluorescent proteins. The enhancement of PVX in TMV/PVX co-expression systems is well known (Goodman and Ross, 1974) and was also observed in our combination, whereas no CP-mediated resistance against PVX was detected. Our system was therefore able to achieve the co-presentation of GFP and mCherry on PVX particles as well as the reconstruction of split-mCherry on the particle surface.

# LEARN TO FLY: APPLICATIONS (FIGURE 3)

## What Will Become of Us: In vivo Fate and Cytotoxicity of PVX Nanoparticles

Virus nanoparticles have many advantages as carrier systems for drugs and imaging molecules, but their repetitive proteinaceous structures (reflecting the assembly of multiple identical CPs) can induce an immune response, which is a barrier to clinical translation (Lico et al., 2009). The ordered and multivalent structures formed by both helical and icosahedral capsids also appear to function as pathogen-associated molecular patterns, which can be recognized by the innate immune system and elicit robust cellular and humoral immune responses (Lizotte et al., 2015). Thus, VNPs intended for medical applications must be evaluated for risks, and their fate and potential immunogenic or cytotoxic effects must be determined.

The bioavailability of VNPs in vivo can be controlled by modifying their surface chemistry, and the introduction of targeting ligands can facilitate their interactions with specific cell types. For example, VNPs carrying ligands recognized by receptors on cancer cells can be used to deliver toxic drug payloads to tumors, resulting in the accumulation of drugs inside the tumor while reducing systemic side effects. In addition to these active targeting mechanisms, the accumulation of nanoparticles in tumors also occurs via passive processes, which are enhanced by the higher aspect ratio of PVX (see Killing in the Name: Tumor Homing and Drug Delivery). Nanoparticle-mediated drug delivery can be achieved using lipid-based micelles, carbon nanotubes, metal nanoparticles, polymeric capsules, iron oxide nanoparticles, or protein-based particles and nanocages (Bhattacharya et al., 2014; Jin et al., 2018; Khan et al., 2018; Singh et al., 2018; Vallabani and Singh, 2018; Xiao et al., 2019). The shape of the nanoparticles has a huge impact on their in vivo behavior, particularly tissue accumulation and clearance from the circulatory system (Lee et al., 2015). The filamentous shape of PVX offers several advantages in this context because elongated materials evade the immune system more effectively, reducing the quantity of particles lost by macrophage uptake (Arnida et al., 2011; Vácha et al., 2011).

Furthermore, coating VNPs with the uncharged, hydrophilic polymer PEG can also reduce their immunogenicity by preventing undesirable non-specific cell interactions, thus prolonging the plasma circulation time and increasing their stability (Lewis et al., 2006; Steinmetz and Manchester, 2009; Shukla et al., 2013). Many functionalized PEG monomers and chains are available for the modification of nanomaterials (Raja et al., 2003; Bruckman et al., 2014; Lee et al., 2015). PEG coating creates a hydrophilic shield, which inhibits serum protein adsorption and confers stealth properties that increase the circulation time and also reduce the tendency of VNPs to accumulate in the liver and spleen (Owens and Peppas, 2006; Soo Choi et al., 2007; Ruggiero et al., 2010; Huang et al., 2011).

In healthy mice, PEGylated PVX particles accumulate in the white pulp regions of the spleen, and to a lesser extent in the liver and kidneys, 2–6 h after intravenous administration. This indicates that PVX is mainly sequestered by the mononuclear phagocyte system in the spleen and liver (Shukla et al., 2014b). Non-PEGylated PVX particles were also shown to adhere to red blood cells and penetrate the white pulp of the spleen (Lico et al., 2016). PEGylated PVX particles colocalize with F4/80-positive macrophages, probably Kupffer cells, in the liver (Shukla et al., 2014b). Filamentous nanomaterials in the same size range as PVX are usually cleared by the mononuclear phagocyte system, but are not excreted by the renal system (Bartneck et al., 2012; Sa et al., 2012; Raza et al., 2017). However, renal clearance cannot be ruled out given the small dimensions along the PVX short axis, and the potential presence of part-digested or broken VNP fragments. Strong PVX fluorescence signals were also observed in the stools of the injected mice, suggesting that some particles are also cleared through the hepatobiliary system. More detailed analysis revealed the accumulation of PVX in B-cells and a higher number of T-cells in the spleen, which may reflect the immunogenicity of PVX and accordingly the induction of humoral and cellular immune responses (Shukla et al., 2014b). PVX was cleared from tumors and other tissues after 5 days, and the strong fluorescence signal from the kidney indicated PVX degradation followed by renal filtration to the bladder.

Blandino et al. (2015) studied the fate and cytotoxicity of filamentous PVX particles and icosahedral Tomato bushy stunt virus particles in hemolysis assays and early embryo assays. Their data showed that the virus particles were very robust and were still able to infect plants after serum incubation for up to 24 h. The hemolysis assay revealed that 10 µg of PVX particles had no effect on erythrocytes in vitro, whereas 100–200 µg caused slight, dose-dependent hemolysis. However, the rate of hemolysis (1.8–2.7%) was far lower than the 5% threshold mandated for biomaterials under ISO/TR 7406 (Li et al., 2011, 2012), even at a very high VNP/erythrocyte ratio (Blandino et al., 2015). The early embryo assay is used to determine the teratogenic potential of substances during the first week of embryonic development (Henshel et al., 2003). PVX showed no signs of toxicity or teratogenicity at doses ranging from 1 to 10 µg per embryo, whereas carbon nanotubes induced up to 50% mortality as well as embryo malformations (Blandino et al., 2015). Furthermore, we observed no evidence of apoptosis when we seeded human mesenchymal stem cells onto a PVX-coated surface (Lauria et al., 2017).
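For orientation, the percentage of hemolysis in such assays is commonly calculated by normalizing the sample absorbance to a buffer-only negative control and a fully lysed positive control. The sketch below assumes that convention, because the exact formula used by Blandino et al. (2015) is not stated here. The function name and the absorbance readings are hypothetical, and only the 5% threshold is taken from the text.

```python
HEMOLYSIS_THRESHOLD_PERCENT = 5.0  # threshold for biomaterials quoted in the text


def hemolysis_percent(a_sample, a_negative, a_positive):
    """Hemolysis relative to 0% (buffer) and 100% (full lysis) controls."""
    span = a_positive - a_negative
    if span <= 0:
        raise ValueError("positive control must exceed negative control")
    return 100.0 * (a_sample - a_negative) / span


if __name__ == "__main__":
    # Made-up absorbance readings, chosen only to illustrate the calculation.
    value = hemolysis_percent(a_sample=0.077, a_negative=0.05, a_positive=1.05)
    verdict = "below" if value < HEMOLYSIS_THRESHOLD_PERCENT else "at or above"
    print(f"{value:.1f}% hemolysis, {verdict} the threshold")
```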

Potato virus X nanoparticles are much safer than mammalian viruses for clinical use because they neither infect nor replicate in mammals (Manchester and Singh, 2006). Plant VNPs at doses of up to 100 mg (10<sup>16</sup> particles) per kg body weight showed no sign of clinical toxicity, which indicates that high concentrations could be used for the targeted destruction of tumors (Kaiser et al., 2007; Singh et al., 2007).

## Killing in the Name: Tumor Homing and Drug Delivery

Non-spherical materials achieve better tumor homing and margination toward vessel walls than spherical particles (Cai et al., 2007; Geng et al., 2007; Gentile et al., 2008; Christian et al., 2009; Lee et al., 2009; Decuzzi et al., 2010; Magee, 2012) and present ligands more efficiently to target cells as well as the larger and flatter vessel wall (Lee et al., 2009; Doshi et al., 2010; Tan et al., 2013). They also achieve more efficient tumor penetration than spherical particles (Nederman et al., 1983; Netti et al., 2000; Thurber et al., 2008; Stylianopoulos et al., 2010a,b) and positively charged materials (Dellian et al., 2000; Stylianopoulos et al., 2010a,b; Ma and Hidalgo, 2013; He and Pistorius, 2017).

Potato virus X accumulates passively in tumors due to the enhanced permeability and retention effect (Shukla et al., 2013). The tumor homing of PEGylated PVX has been demonstrated in several rodent models, including human tumor xenografts of fibrosarcoma, squamous cell sarcoma, colon cancer, and breast cancer (Shukla et al., 2013, 2014b). Successful delivery requires PVX to enter the tumor microcirculation followed by extravasation into the tumor tissue. Filamentous particles show enhanced penetration behavior and better retention because they are transported across membranes more efficiently (Gentile et al., 2008; Lee et al., 2009; Toy et al., 2011). PEGylated PVX particles injected into mice also accumulate in the liver and spleen because these organs are part of the reticuloendothelial system, which removes proteinaceous antigens from circulation (Peiser et al., 2002).

Potato virus X can be loaded with doxorubicin through spontaneous hydrophobic interactions and π-π stacking between the planar drug molecules and polar amino acids. Approximately 850–1000 drug molecules are carried by an unmodified PVX particle, indicating that 70–80% of the CPs become stably attached to the drug (Le et al., 2017b; Lee et al., 2017). Doxorubicin remains cytotoxic when loaded onto PVX but its efficacy is lower than that of the free drug, as previously reported for synthetic nanoparticles (Yoo and Park, 2000) and other VNPs (Ren et al., 2007; Lockney et al., 2011). This probably reflects the different cellular uptake and processing pathways used for nanoparticles and small molecules, with the free drug more likely to enter the cell by diffusion across the cell membrane whereas VNPs are taken up by endocytosis or macropinocytosis. No statistically significant differences in the tumor growth rate or survival time were observed when the PVX formulation was compared to the free drug, although the tumor volume was slightly lower in mice treated with the PVX formulation. Thus, the PVX formulation did not improve the treatment, but the cytotoxic efficacy was maintained. The PEGylation of PVX increased its ability to carry doxorubicin, allowing the attachment of 1000–1500 drug molecules per particle (Le et al., 2017b). As a topical treatment, such VNPs achieve excellent blood and tissue compatibility (Bruckman et al., 2014; Lee et al., 2015), thus opening the door for possible intravenous, systemic administration.
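The relationship between drug molecules per particle and CP occupancy is simple arithmetic. The sketch below assumes roughly 1270 CP subunits per PVX particle, a commonly cited figure that is not stated in this section, and uses a hypothetical helper name. Under that assumption, 850–1000 molecules correspond to about 67–79% of the CPs, in line with the 70–80% quoted above, whereas 1500 molecules on a PEGylated particle would exceed one drug molecule per CP.

```python
CP_PER_PARTICLE = 1270  # assumed copy number, not taken from this section


def occupancy_percent(molecules_per_particle, cp_per_particle=CP_PER_PARTICLE):
    """Drug molecules per particle expressed as a percentage of CP subunits."""
    return 100.0 * molecules_per_particle / cp_per_particle


if __name__ == "__main__":
    for load in (850, 1000, 1500):
        print(f"{load} molecules -> {occupancy_percent(load):.1f}% of CPs")
```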

#### Light Me Up: Imaging With PVX VNPs

Plant viruses labeled with fluorescent proteins are often used to follow infections in host plants and to unravel the function of viral proteins (Tilsner and Oparka, 2010; Barón et al., 2016; Folimonova and Tilsner, 2018). PVX can be used as a tool for optical imaging by preparing mCherry and GFP overcoat structures using the FMDV 2A sequence (Shukla et al., 2014a). These particles allow the infection of plants to be visualized clearly (Baulcombe et al., 1995; Shukla et al., 2014a).

Green fluorescent protein and mCherry have been fused to the TGB proteins and the CP to determine the structure of PVX intracellular replication complexes (Tilsner and Oparka, 2010; Tilsner et al., 2012, 2013; Linnik et al., 2013) (**Figure 4A**). PVX cell-to-cell movement can usually be observed in leaves 6–10 dpi by the appearance of fluorescent spots, which slowly undergo radial expansion from the inoculation site. When the infected zone reaches the veins, particles are transferred to the vascular bundles enabling long-distance movement. The characteristics of the CP fusion protein can influence the time taken to achieve local and systemic movement. For the fusion protein GFP-2A-CP, fluorescent spots in non-inoculated leaves appear 12–16 dpi (Baulcombe et al., 1995; Shukla et al., 2014a) whereas the smaller iLOV-2A-CP fusion protein spreads more rapidly, with systemic infection appearing as early as 6 dpi (Röder et al., 2018). Later during infection, the virus exclusively spreads from photosynthetically active source tissues to developing sink leaves on the shoot of the plant (**Figure 4C**). The 2A sequence used in the GFP and mCherry fusion constructs produces a 1:3 ratio of fusion proteins to wild-type CP (Shukla et al., 2014a, 2018). Confocal laser scanning microscopy revealed that the labeled 2A-CP<sub>PVX</sub> particles were able to move between epidermal cells, as indicated by the presence of a fluorescent signal in the plasmodesmata (Oparka et al., 1996; Cruz et al., 1998; Chapman et al., 2008; Tilsner et al., 2013; Röder et al., 2018) (**Figure 4D**). One large fluorescent viral replication complex per infected cell is often observed in established infections (Tilsner et al., 2012, 2013; Linnik et al., 2013). These so-called virus factories coordinate the infection processes (Linnik et al., 2013). Additional diffuse fluorescence can be observed in epidermal cells, representing the presence of free fluorescent proteins. This leads to a relatively high background of free fluorescent protein in the cells, preventing the detailed analysis of CP localization. The major disadvantage of the overcoat principle is the unpredictable ratio of fusion protein to wild-type CP<sub>PVX</sub>, but this can be adjusted by using different variants of the FMDV 2A sequence (Luke et al., 2009; Minskaia and Ryan, 2013). Interestingly, we were able to create a direct fusion of the 113-amino-acid iLOV protein to the CP that was still able to achieve systemic infection, which is the largest direct CP fusion reported thus far (Röder et al., 2018). As a tool for the imaging of viral cell-to-cell movement, fluorescent proteins should be densely arrayed on the virus surface to achieve a bright signal, as shown for the iLOV-CP<sub>PVX</sub> direct fusion (**Figure 4D**).

Fluorescent VNPs have advantages in other imaging applications compared to inorganic templates such as gold particles and carbon nanotubes because they are biocompatible and do not aggregate under physiological conditions or persist in tissues, which can lead to cell damage (Liu et al., 2008, 2013; Semmler-Behnke et al., 2008; Gad et al., 2012; Jaganathan and Godin, 2012). Filamentous VNPs not only offer a large surface area for the presentation of fluorescent proteins or dyes without quenching (Brunel et al., 2010) but also undergo a two-step clearance process with a plasma circulation half-life of ∼100 min, whereas spherical VNPs have a half-life of 4–7 min (Singh et al., 2007; Bruckman et al., 2014). PVX particles labeled with fluorescent proteins can be produced in and purified from plants and used directly for further applications. For example, we used mCherry-2A-CP PVX nanoparticles to easily determine the biodistribution of PVX in C57BL/6 mice. This revealed that mCherry-PVX is cleared via the reticuloendothelial system and deposited in the liver, resulting in tissue clearance 7 days after administration (see What Will Become of Us: In vivo Fate and Cytotoxicity of PVX Nanoparticles). We were also able to show that mCherry-PVX particles are taken up by human HT-29 (colon) tumor cells and localized in the perinuclear region, which was consistent with previous experiments involving PVX labeled with organic dyes (Shukla et al., 2013, 2014a,b, 2015) (**Figure 4B**). Given that plant VNPs are non-toxic in humans, long-term imaging is also possible. PEG can be used to reduce non-specific interactions and evade the immune system, thereby prolonging circulation times, e.g., for the visualization of blood flow in vivo (see What Will Become of Us: In vivo Fate and Cytotoxicity of PVX Nanoparticles) (Lewis et al., 2006; Leong et al., 2010; Bruckman et al., 2014; Lee et al., 2015).
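To put the quoted half-lives into perspective, the fraction of particles still circulating can be estimated with first-order decay. This is a deliberate simplification: the text describes a two-step clearance for filamentous VNPs, which a single exponential ignores. The function name is hypothetical and the time points are arbitrary.

```python
def fraction_remaining(minutes, half_life_min):
    """Fraction left in plasma after `minutes`, assuming first-order decay."""
    if half_life_min <= 0:
        raise ValueError("half-life must be positive")
    return 0.5 ** (minutes / half_life_min)


if __name__ == "__main__":
    # Half-lives quoted in the text: ~100 min (filamentous), 4-7 min (spherical).
    for label, t_half in (("filamentous", 100.0), ("spherical", 7.0)):
        left = fraction_remaining(60.0, t_half)
        print(f"{label}: {100 * left:.1f}% remaining after 60 min")
```

Under this simplified model, most filamentous particles would still be circulating after one hour, whereas spherical particles would be almost completely cleared.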

#### Staying Alive: Vaccination Applications

Plants are considered a promising alternative production system for pharmaceuticals and have been extensively studied for this purpose (Fischer et al., 2012; Lico et al., 2012; Melnik and Stoger, 2013). Plant virus particles or CPs are ideal for the presentation of epitopes (Porta and Lomonossoff, 1998; Yusibov et al., 2006; Rosenthal et al., 2014) and thus can serve as carrier molecules, enhancing the immunogenicity of peptides by presenting them robustly to the immune system (Lomonossoff and Evans, 2014). Immune responses against single pathogen epitopes are in most cases insufficient to provide protection against an infection (Sette et al., 2001; Awram et al., 2002; Sette and Peters, 2007). The presentation of several different epitopes from the same pathogen or numerous different pathogens is ideal for the construction of efficient vaccines. Many pathogens exist as different genotypes or subtypes, for example in the case of Hepatitis C virus (Simmonds et al., 2005). This makes vaccine development more challenging, and is further complicated by the degree of heterogeneity in infected individuals due to the pathogen mutation rate (Hayashi et al., 1999). An advantage of PVX nanoparticles as vaccine candidates is that the presence of the plant virus RNA may trigger Toll-like receptor 7 on antigen-presenting cells, hence boosting the immune response like an adjuvant (Jobsri et al., 2015). Several epitopes have been presented as PVX CP fusions for the production of vaccines (**Table 1**). The epitope fusions that were tested in immunization studies promoted a robust immune response in different animal models. The development of PVX-based vaccines has been comprehensively reviewed (Lico et al., 2015). For the presentation of epitopes on PVX, the major goal is very dense particle coverage by the selected peptides, which favors a strong immune response. Therefore, most of the constructs tested thus far in mice have been direct fusions (Brennan et al., 1999; Marusic et al., 2001; Lico et al., 2009; Cerovska et al., 2012). However, as discussed in Section "Change the World: Genetic Engineering," not all peptide sequences are suitable for direct fusion to the CP, and constructs including the 2A sequence, which result in the less-dense presentation of peptides, have been used successfully for the immunization of rabbits (Marconi et al., 2006) and mice (Uhde-Holzem et al., 2010). In one study, the B-cell epitope from the extracellular domain of the human epidermal growth factor receptor 2 was chemically coupled to the CP for presentation on the surface of PVX, followed by the successful immunization of mice (Shukla et al., 2014c, 2017).

TABLE 1 | Epitopes presented on PVX-based VNPs for vaccination applications.

## Material Girl: PVX for Biomaterial Applications

Potato virus X is also promising as a building block for hybrid organic–inorganic materials. Inspired by natural protein-based biomineralization systems such as silaffins (Kröger et al., 1999; Foo et al., 2004), we used PVX as a means to induce the deposition of silica, which could allow the development of new biomaterials with combined surface properties. Silica deposition on templates often involves the use of alkoxysilane precursors such as tetraethyl orthosilicate, tetramethyl orthosilicate or (3-aminopropyl)triethoxysilane. Genetically modified PVX particles presenting the amino acid sequence YSDQPTQSSQRP fused to the N-terminus of the CP were able to promote mineralization with tetraethyl orthosilicate at room temperature, allowing the development of hybrid materials with two or even three components designed using immunogold labeling (Van Rijn et al., 2015). Several VNPs have been shown to arrange themselves around a central core of mesoporous silicon dioxide, extending the virus–silica morphology up to 1–2 µm in diameter and forming higher-order structures. Drygin et al. (2013) reported the selective electroless deposition of platinum ions on one end of PVX particles with nucleation centers 1–2 nm in diameter, although the reason for the unipolar deposition remains unexplained.

Plant viruses also offer new solutions for the biomedical application of biomaterials. For example, biomimetic nanocomposites for the replacement and regeneration of defective bone tissue must achieve biocompatibility while promoting cell adhesion and proliferation. Biological interactions between the implanted biomaterial and the surrounding tissue can only occur if the appropriate physical and cellular signals are present. In a natural context, MIPs are required for the hydroxyapatite mineralization of collagen and they control apatite nucleation and growth (Fisher et al., 2001; George and Veis, 2008). These non-collagenous proteins in the dentin extracellular matrix mainly consist of polar and charged amino acids. PVX displaying similar peptide sequences was able to attract calcium phosphate derivatives when incubated in hydroxyapatite or simulated body fluid (Lauria et al., 2017). Small nucleation centers formed along the longitudinal axis of the particles. Due to the unique atomic precision of the particle assembly, some aspects of the extracellular matrix were mimicked, including the mineral phase of human bone. Hydrogels are widely used as biocompatible scaffolds for tissue engineering, but often lack the signals required for cell interactions (Lee et al., 2016; Naahidi et al., 2017). Therefore, PVX was engineered to display an arginine, glycine and aspartic acid (RGD) peptide, a fibronectin-derived motif that promotes cell adhesion, either alone or in combination with a MIP, which led to improved cell binding (Lauria et al., 2017). In these studies, the mineralization capability was comparable among different peptide modifications, and scanning electron microscopy coupled with energy dispersive X-ray spectroscopy confirmed the presence of calcium and phosphate. 
Recombinant PVX particles embedded in agarose hydrogels served as biomimetic nanocomposites building up filamentous and network-like nanostructures and stimulating osteogenic differentiation in vitro in human bone marrow-derived mesenchymal stromal cells. For tissue engineering applications, it is not only important to control the size and shape of hydroxyapatite crystals but also to ensure the scaffold is compatible with the cells (Klein et al., 1994). Mineralized recombinant PVX particles may therefore be useful in bone tissue engineering, regeneration and restoration by mimicking certain aspects of the bone extracellular matrix. PVX-MIP particles offer a promising biomimetic composite for the synthesis of such bone-like materials.

## Knowing Me, Knowing You: Biosensing

Potato virus X has been fused to the B domain of Staphylococcus aureus Protein A to achieve efficient antibody capture and presentation on the particle surface. Protein A binds to the Fc region of immunoglobulins (especially IgG) from many species, and is therefore routinely used for antibody purification and immunoprecipitation. The Protein A fragment retained its ability to immobilize antibodies when exposed on the PVX surface as a direct CP fusion (Uhde-Holzem et al., 2016). The particles were then immobilized on gold chips and used for quartz crystal microbalance detection. The modified PVX particles were able to capture 300–500 antibodies per particle, which enhanced the available antibodies on the chip surface and allowed the sensitive detection of Cowpea mosaic virus. In addition to sensing applications, the arrays could be used in the future to capture pollutants for cleanup or detoxification. Furthermore, when combined with medical payloads, such as contrast agents or drugs, the particles could be used for molecular imaging and drug delivery.

Potato virus X has also been used to improve an ELISA for the diagnosis of primary Sjögren's syndrome (Tinazzi et al., 2015). PVX was genetically modified to display the immunodominant lipo-peptide from lipocalin, which is involved in the pathogenesis of this autoimmune disease. The modified particles were used to coat ELISA plates for the analysis of patient serum samples and were compared to plates coated with the lipocalin peptide alone. The ELISA achieved a sensitivity of 86.8% with the synthetic peptide but 98.8% with the PVX-lipocalin particles, a remarkable improvement. The ELISA plates could be stored for 60 days with no loss of diagnostic sensitivity.
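
To put the two sensitivities in perspective, the short sketch below converts them into false-negative rates. It uses only the two percentages reported above and assumes the usual definition of sensitivity (true positives divided by all diseased samples); the function name is illustrative and not taken from the cited study.

```python
def false_negative_rate(sensitivity_percent: float) -> float:
    """Return the percentage of true cases missed, given sensitivity in percent."""
    if not 0.0 <= sensitivity_percent <= 100.0:
        raise ValueError("sensitivity must be between 0 and 100")
    return 100.0 - sensitivity_percent


# Sensitivities reported for the two plate coatings (Tinazzi et al., 2015)
fnr_peptide = false_negative_rate(86.8)  # synthetic lipocalin peptide alone
fnr_pvx = false_negative_rate(98.8)      # PVX-lipocalin particles

# How many times fewer true cases are missed with the PVX-based coating
fold_reduction = fnr_peptide / fnr_pvx

print(f"missed cases: {fnr_peptide:.1f}% (peptide) vs {fnr_pvx:.1f}% (PVX)")
print(f"fold reduction in missed cases: {fold_reduction:.1f}")
```

Read this way, the gain of 12 percentage points in sensitivity corresponds to roughly an eleven-fold drop in the proportion of patients who would be missed.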

Potato virus X has also been engineered for the selective attachment of target proteins or molecules via non-covalent interactions. A hybrid in which 50% of the CPs were fused to a single-chain antibody was created using the FMDV 2A sequence 20 years ago (Smolenska et al., 1998). Among the many potential applications of this platform, the authors produced a single-chain antibody against the herbicide 3-(3,4-dichlorophenyl)-1,1-dimethylurea, and proposed that the PVX-antibody particles would be suitable for the remediation of contaminated soil and waterways. However, the recombinant virus remained infective, so careful precautions would be necessary before releasing it into the environment.

#### A Little More Action, Please: Catalysis

Many industrial processes require enzymes or enzyme cascades that survive harsh process conditions such as high temperatures or extreme pH for efficient substrate conversion. Additionally, the enzymes should be stable and reusable to prolong their retention and minimize process costs. The latter can be achieved by immobilization on solid supports, which in some cases even improves the enzyme stability (Rodrigues et al., 2013).

With its high aspect ratio, PVX is an ideal scaffold for the presentation of multiple copies of small peptides and proteins, including enzymes. Despite the constraints of genetic modification, this is the preferred method to develop new PVX-based biocatalysts because large quantities can be produced by molecular farming within 2–3 weeks (see Don't Stop 'Til You Get Enough: Production of VNPs). Carette et al. (2007) took advantage of this one-step production system to create stable PVX nanoparticles presenting a commercially important lipase from Candida antarctica (CALB). To circumvent the size limitation during particle assembly, they used the overcoat strategy to produce hybrid particles containing recombinant and wild-type CPs. CALB retained its activity when immobilized on the particle surface, as shown by in situ single enzyme studies with the profluorescent substrate 5(6)-carboxyfluorescein diacetate. However, the catalytic activity of the hybrid virus particle against the substrate p-nitrophenyl caproate was 2 µmol min<sup>−1</sup> mg<sup>−1</sup>. This is 45 times lower than that of the free enzyme, possibly because the CP fusion hindered substrate access to the active site. As stated earlier, the overcoat strategy is also incompatible with enzymes that require post-translational modification, such as the predominantly glycosylated enzymes from Trichoderma reesei. Klose et al. (2015) demonstrated that the expression and activity of these enzymes are also dependent on subcellular targeting, so a site-specific attachment system is desirable. One example is ST/SC (see Catch Me If You Can: Sticky Particles), which is based on the CnaB2 domain of the fibronectin-binding protein of Streptococcus pyogenes (Zakeri et al., 2012). These components rapidly form an irreversible and specific covalent bond, allowing for positional control across a broad range of buffers and temperatures. We recently engineered PVX nanoparticles displaying the shorter ST and expressed the T. reesei endoglucanase Cel12A-SC fusion protein in a different plant cell compartment to facilitate glycosylation (Röder et al., 2017). This system achieved a three-fold higher coupling efficiency than the CALB overcoat strategy even though the lipase is smaller than Cel12A-SC. The resulting VNP displayed ∼850 enzymes per PVX particle, and the retention of catalytic activity was confirmed by measuring kinetic parameters in the presence of different concentrations of 4-methylumbelliferyl-β-D-cellobioside. The affinity of the PVX-ST/Cel12A-SC nanoparticles for the substrate was ∼3.5-fold lower than that of the free enzyme, indicating that the scaffold interferes with substrate binding to some degree. This issue could be addressed by adding a carbohydrate binding module to anchor the cellulose chains. However, the turnover rate kcat and Vmax of the PVX-ST/Cel12A-SC particles were ∼2.9-fold higher than those of the free enzyme, which may reflect the ability of closely spaced enzymes on the 515-nm scaffold to facilitate hydrolysis.
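
The net effect of a lower substrate affinity combined with a higher turnover rate can be illustrated with the Michaelis–Menten equation. The sketch below is a rough illustration under stated assumptions, not a reanalysis of the published data: the free enzyme is normalized to Vmax = 1 and Km = 1 (arbitrary units), the ∼3.5-fold lower affinity is read as a 3.5-fold higher Km, and the ∼2.9-fold higher kcat/Vmax is applied at equal enzyme concentration. All variable names are illustrative.

```python
def michaelis_menten(s: float, vmax: float, km: float) -> float:
    """Initial rate v = Vmax * [S] / (Km + [S])."""
    return vmax * s / (km + s)


# Free enzyme, normalized to arbitrary units (assumption for illustration)
FREE_VMAX, FREE_KM = 1.0, 1.0

# Assumption: "3.5-fold lower affinity" is modeled as a 3.5-fold higher Km,
# and "2.9-fold higher kcat/Vmax" as a 2.9-fold higher Vmax.
PVX_VMAX, PVX_KM = 2.9 * FREE_VMAX, 3.5 * FREE_KM

# Catalytic efficiency (Vmax/Km, proportional to kcat/Km) relative to free enzyme
relative_efficiency = (PVX_VMAX / PVX_KM) / (FREE_VMAX / FREE_KM)

# Substrate concentration at which both forms convert substrate equally fast:
# Vp*S/(Kp+S) = Vf*S/(Kf+S)  =>  S = (Vf*Kp - Vp*Kf) / (Vp - Vf)
crossover = (FREE_VMAX * PVX_KM - PVX_VMAX * FREE_KM) / (PVX_VMAX - FREE_VMAX)

print(f"relative catalytic efficiency: {relative_efficiency:.2f}")
print(f"crossover at [S] = {crossover:.2f} x Km(free)")
for s in (0.1, 1.0, 10.0):
    v_free = michaelis_menten(s, FREE_VMAX, FREE_KM)
    v_pvx = michaelis_menten(s, PVX_VMAX, PVX_KM)
    print(f"[S] = {s:>4}: free = {v_free:.3f}, PVX-bound = {v_pvx:.3f}")
```

Under these assumptions the immobilized enzyme is slightly less efficient at very low substrate concentrations but overtakes the free enzyme once the substrate concentration exceeds about one third of the free enzyme's Km.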

Cellulose can be broken down into single glucose molecules by the synergistic activity of exoglucanases, endoglucanases and β-glucosidases. Using the ST/SC approach, a system has been designed to allow the future development of stoichiometric multi-enzyme cascades immobilized on PVX VNPs. Moreover, the ST-engineered PVX is a promising universal platform for the attachment of single or multiple proteins that cannot be fused to the CP by genetic engineering.

## HOW FAR WE'VE COME: PERSPECTIVES

This article highlights the great potential of PVX as a platform for the development of novel VNPs. However, certain challenges remain to be overcome. One limitation for the commercial development of PVX and plant viruses in general is the purification process. The current purification protocol involves several ultracentrifugation steps (see Don't Stop 'Til You Get Enough: Production of VNPs), which are unsuitable for large-scale processes. A scalable purification protocol that might also be compatible with good manufacturing practices therefore needs to be established. We have also observed that the particle yield in planta and throughout the purification process depends on the peptide fusion. During the first steps of the purification process, the precipitation steps can easily be improved by adjusting the pH of the extraction buffer according to the pI of the CP fusion protein. We found that peptide fusions with lower pI values often cause severe infections in host plants but produce fewer particles (Lauria et al., 2017). Peptide fusions with pI values near or slightly higher than that of the wild-type CP most likely produce higher particle yields. Although CP fusion via the 2A sequence can enlarge the spectrum of potential fusion partners, the yields in planta and after purification are even lower than those achieved using the direct fusion strategy. Moreover, CP fusions tend to become less stable at the genomic and protein levels as the insert size increases. The use of different 2A sequences with different processivities may improve the stability and optimize the ratio of fusion protein to free CP on the particle surface.

The limitations of the VNP approach include the tendency of RNA viruses to delete non-essential foreign sequences and their incompatibility with target proteins that require post-translational modification. One possible solution is the use of ST/SC chemistry to covalently attach proteins that are not encoded in the virus genome and that can be modified post-translationally in other subcellular compartments before attachment, a strategy that also increased the coupling efficiency by about three-fold compared to the FMDV 2A approach (Röder et al., 2017). By using this configuration, it might be possible to stably immobilize entire enzyme cascades on a single PVX scaffold, thus taking advantage of proximity effects. To control the enzyme stoichiometry, sticky proteins with different coupling efficiencies can be combined, such as ST/KTag/SpyLigase or SnoopTag/SnoopCatcher (Fierer et al., 2014; Siegmund et al., 2016; Veggiani et al., 2016; Brune et al., 2017; Bao et al., 2018; van den Berg van Saparoea et al., 2018). These new fusion strategies are currently only possible using small batch production methods and it will be necessary to develop innovative approaches to increase the production scale. However, as these sophisticated approaches become more widespread, we will see an ever-increasing spectrum of potential applications for engineered PVX particles, spanning the fields of medicine, biomedical engineering, material science and industrial biocatalysis.

#### REFERENCES


# AUTHOR CONTRIBUTIONS

JR and CD contributed equally to this work. All authors read and approved the final manuscript.

# FUNDING

This research was funded by the Excellence Initiative of the German federal and state governments (Gefördert durch Mittel der Exzellenzinitiative des Bundes und der Länder).

## ACKNOWLEDGMENTS

We would like to thank Dr. Richard M. Twyman for critically reading the manuscript.






**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Röder, Dickmeis and Commandeur. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Critical Evaluation of Strategies for the Production of Blood Coagulation Factors in Plant-Based Systems

#### Oguz Top<sup>1,2†</sup>, Ulrich Geisen<sup>3</sup>, Eva L. Decker<sup>1‡</sup> and Ralf Reski<sup>1,2,4\*</sup>

<sup>1</sup> Plant Biotechnology, Faculty of Biology, University of Freiburg, Freiburg im Breisgau, Germany, <sup>2</sup> Spemann Graduate School of Biology and Medicine, University of Freiburg, Freiburg im Breisgau, Germany, <sup>3</sup> Faculty of Medicine, Institute for Clinical Chemistry and Laboratory Medicine, University of Freiburg, Freiburg im Breisgau, Germany, <sup>4</sup> Signalling Research Centres BIOSS and CIBSS, University of Freiburg, Freiburg im Breisgau, Germany

The use of plants as production platforms for pharmaceutical proteins has been on the rise for the past two decades. The first marketed plant-made pharmaceutical, taliglucerase alfa against Gaucher's disease produced in carrot cells by Pfizer/Protalix Biotherapeutics, was approved by the US Food and Drug Administration (FDA) in 2012. The advantages of plant systems are low cost and highly scalable biomass production compared to fermentation systems, safety compared with other expression systems, as plant-based systems do not produce endotoxins, and the ability to perform complex eukaryotic post-translational modifications, e.g., N-glycosylation that can be further engineered to achieve humanized N-glycan structures. Although bleeding disorders affect only a small portion of the world population, the costs of clotting factor concentrates impose a high financial burden on patients and healthcare systems. The majority of patients, ∼75% in the case of hemophilia, have no access to adequate treatment. The necessity of large-scale and less expensive production of human blood coagulation factors, particularly factors associated with rare bleeding disorders, may be an important area for plant-based systems, as coagulation factors do not fit into the industry-favored production models. In this review, we explore previous studies on recombinant production of coagulation Factor II, VIII, IX, and XIII in different plant species. Production of bioactive FII and FIX in plants has not yet been achieved due to complex post-translational modifications, including vitamin K-dependent γ-carboxylation and propeptide removal. Although plant-made FVIII and FXIII showed specific activities, there are no follow-up studies such as pre-clinical or clinical trials. Significant progress has been achieved in oral delivery of bioencapsulated FVIII and FIX to induce immune tolerance in murine models of hemophilia A and B, respectively. Potential strategies to overcome bottlenecks in the production systems are also addressed in this review.

Keywords: plant molecular farming, biopharmaceuticals, blood coagulation factors, factor II, factor VIII, factor IX, factor XIII

#### Edited by:

Anneli Marjut Ritala, VTT Technical Research Centre of Finland Ltd., Finland

#### Reviewed by:

Muriel Bardor, Université de Rouen, France

Arjen Schots, Wageningen University & Research, Netherlands

#### \*Correspondence:

Ralf Reski ralf.reski@biologie.uni-freiburg.de orcid.org/0000-0002-5496-6711

†orcid.org/0000-0003-2820-6505 ‡orcid.org/0000-0002-9151-1361

#### Specialty section:

This article was submitted to Plant Biotechnology, a section of the journal Frontiers in Plant Science

Received: 30 November 2018; Accepted: 19 February 2019; Published: 07 March 2019

#### Citation:

Top O, Geisen U, Decker EL and Reski R (2019) Critical Evaluation of Strategies for the Production of Blood Coagulation Factors in Plant-Based Systems. Front. Plant Sci. 10:261. doi: 10.3389/fpls.2019.00261

# BLOOD COAGULATION CASCADE

Hemostasis is the complex physiological process responsible for stopping bleeding (hemorrhages). It depends on a delicate balance of pro- and anticoagulant forces. The main task of the human blood coagulation system is to prevent excessive blood loss after vascular injury. This is fulfilled by the concerted action of many players including coagulation factors that act in two major pathways, known as the extrinsic pathway (tissue factor pathway) and the intrinsic pathway (contact pathway) (Smith et al., 2015). The coagulation factors are mostly serine proteases, except tissue factor (TF), Factor V (FV), Factor VIII (FVIII), and Factor XIII (FXIII). One common feature of the coagulation factors is that they mostly circulate in blood in their inactive zymogen form to maintain homeostasis. They are activated via limited proteolysis upon blood loss from damaged vessels to catalyze the next reaction in the cascade (Smith et al., 2015). The extrinsic pathway is triggered at the site of injury by the release of TF. TF is a co-factor of Factor VIIa (FVIIa) and the TF:FVIIa complex catalyzes two downstream reactions: the proteolytic conversion of Factor X (FX) and Factor IX (FIX) to FXa and FIXa, respectively. The intrinsic pathway, on the other hand, begins with Factor XII (FXII), high molecular weight kininogen, prekallikrein, and Factor XI (FXI) (Silverberg et al., 1980; Tankersley and Finlayson, 1984; Palta et al., 2014). FXIa and the TF:FVIIa complex further activate FIX, which acts with FVIII to form the tenase complex that activates FX. FXa propagates the cascade into the final common pathway by activating FV (**Figure 1**). Subsequently, prothrombin (Factor II) is activated by the prothrombinase complex (FXa and FVa) to thrombin, which processes fibrinogen into fibrin and activates platelets. Fibrin forms a mesh over the wound and the resulting blood clot stops the bleeding (Palta et al., 2014).
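
The order of activation described above can be summarized as a small directed graph. The following sketch is only a simplified illustration of the steps named in this section, not a complete or quantitative model of hemostasis; the dictionary layout and the function name are assumptions made for the example, and cofactor requirements (for example FVIII in the tenase complex) are noted in comments rather than modeled.

```python
from collections import deque

# Each key activates or generates the species listed as its values.
# Edges follow the simplified description in the text.
ACTIVATES = {
    "TF:FVIIa": ["FXa", "FIXa"],   # extrinsic pathway
    "FXIa": ["FIXa"],              # intrinsic pathway
    "FIXa": ["FXa"],               # as part of the tenase complex with FVIII
    "FXa": ["FVa", "thrombin"],    # FXa activates FV and is part of prothrombinase
    "FVa": ["thrombin"],           # prothrombinase = FXa + FVa
    "thrombin": ["fibrin"],        # fibrinogen is processed into fibrin
}


def downstream(start: str) -> list[str]:
    """Return all species reachable from `start`, in breadth-first order."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for target in ACTIVATES.get(node, []):
            if target not in seen:
                seen.add(target)
                order.append(target)
                queue.append(target)
    return order


print("extrinsic:", downstream("TF:FVIIa"))
print("intrinsic:", downstream("FXIa"))
```

Both entry points converge on FXa, which is why the steps from FXa onward are referred to as the common pathway.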

Coagulation factor deficiency or dysfunction due to inherited or acquired coagulation disorders impairs hemostasis, which can result in life-threatening spontaneous bleeding. The incidence and severity of these disorders depend on the amount of clotting factor that is missing in the body. In addition to the well-known FVIII and FIX deficiencies, which lead to hemophilia A and B, respectively, there are also other rare bleeding disorders due to deficiencies in Factor I (fibrinogen, FI), FV, FVII, FX, FXI, FXIII, and prothrombin (FII). There are 315,423 people suffering from bleeding disorders [based on data from 116 countries in the Annual Global Survey 2017 by World Federation of Hemophilia (2018)]: 196,706 hemophilia patients, 76,144 patients with von Willebrand disease (VWD), and 42,573 patients with other bleeding disorders. The World Federation of Hemophilia estimates that 400,000 people worldwide are suffering from hemophilia and only 25% receive adequate treatment.

Hemophilia A, the most common bleeding disorder, is caused by defects in the FVIII gene, which is located on the X chromosome (Mannucci and Tuddenham, 2001). Hemophilia A occurs in 1 in 10,000 live births (1 in 5,000 male births) (Ponder, 2011). Patients are grouped into three classes based on the severity of the disease, which is associated with the level of FVIII circulating in the blood. Patients with severe hemophilia A have 1% or less FVIII, moderate hemophilia A patients have between 1 and 5% FVIII and mild hemophilia A patients between 5 and 25% FVIII in circulation (Pool and Shannon, 1965). Mild hemophilia A patients experience reduced hemostasis upon bleeding after major trauma or surgery. In contrast, patients suffering from more severe hemophilia A do not stop bleeding after a minor trauma or start bleeding spontaneously, especially in joint spaces and soft tissues. Hemophilia B, also known as Christmas disease, is an X chromosome-linked disorder caused by defects in the FIX gene (Soucie et al., 1998). It occurs in 1 in 30,000 male births and patients can be grouped into three classes, as in hemophilia A, based on the severity of the disease, which corresponds to the level of FIX circulating in the blood (Soucie et al., 1998). Deficiencies in FI, FV, FVII, FX, FXI, FXIII, and prothrombin also cause improper clotting, resulting in symptoms similar to those of hemophilia.
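
The severity classes for hemophilia A can be written as a simple threshold rule. The sketch below uses the cut-offs quoted in this section (Pool and Shannon, 1965); how the exact boundary values of 1% and 5% are assigned is an assumption made here for illustration, and the function name is hypothetical. It is not intended for clinical use.

```python
def classify_hemophilia_a(fviii_percent: float) -> str:
    """Classify severity from circulating FVIII, given as percent of normal."""
    if fviii_percent < 0:
        raise ValueError("FVIII level cannot be negative")
    if fviii_percent <= 1:
        return "severe"
    if fviii_percent <= 5:   # assumption: exactly 5% is counted as moderate
        return "moderate"
    if fviii_percent <= 25:
        return "mild"
    return "above the mild range"


for level in (0.5, 3, 12, 60):
    print(f"{level}% FVIII -> {classify_hemophilia_a(level)}")
```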

There is currently no cure for bleeding disorders and treatment is restricted to protein replacement therapy with either plasma-derived or recombinant products. Therapy can be given on demand or as prophylaxis, i.e., the regular supplementation of clotting factor concentrates to keep concentrations above a certain level to prevent bleeding. The average annual per-person medical costs were \$85,852 for mild/moderate hemophilia B and \$198,733 for severe hemophilia B in the United States (Chen et al., 2017). Although prophylaxis provides a better quality of life, costs increase dramatically. The average medical costs for patients with severe hemophilia A receiving on-demand treatment in the United States were \$184,518 p.a., and for those receiving prophylaxis \$292,525 p.a. Factor VIII concentrates are the major burden within these costs (\$170,037 and \$289,172 p.a., respectively) (Chen, 2016). The global market value of recombinant coagulation factors was approximately \$8.5 billion in 2017 (Walsh, 2018).
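
The share of factor concentrates in the total cost of care follows directly from the figures quoted above for severe hemophilia A. The short calculation below uses only those four numbers; the variable and function names are illustrative.

```python
# Average annual cost per patient with severe hemophilia A in the United States
# (US dollars), as quoted in the text: (total medical cost, FVIII concentrate cost)
COSTS = {
    "on-demand": (184_518, 170_037),
    "prophylaxis": (292_525, 289_172),
}


def concentrate_share(total: float, concentrate: float) -> float:
    """Percentage of the total medical cost spent on FVIII concentrate."""
    return 100.0 * concentrate / total


for regimen, (total, concentrate) in COSTS.items():
    share = concentrate_share(total, concentrate)
    print(f"{regimen}: {share:.1f}% of costs are FVIII concentrate")

# Additional annual cost of prophylaxis compared with on-demand treatment
extra_total = COSTS["prophylaxis"][0] - COSTS["on-demand"][0]
extra_concentrate = COSTS["prophylaxis"][1] - COSTS["on-demand"][1]
print(f"extra total cost: ${extra_total:,}")
print(f"extra concentrate cost: ${extra_concentrate:,}")
```

On these figures, the additional spending on concentrate under prophylaxis is larger than the increase in total cost, which means that the remaining medical costs are lower under prophylaxis than under on-demand treatment.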

The cloning of the FVIII (Gitschier et al., 1984) and FIX (Choo et al., 1982) genes not only promoted recombinant production of clotting factors but also instigated gene therapy attempts for hemophilia (Mannucci and Tuddenham, 2001). Yet these attempts are still far from offering standardized solutions for patients due to manufacturing and safety concerns about gene therapy vectors and the immune responses they trigger in patients (Doshi and Arruda, 2018). Moreover, gene therapy trials are concentrated on hemophilia but not on other bleeding disorders, mainly because these are low-incidence diseases, with frequencies varying from 1 in 500,000 to 1 in 2–3 million (Palla et al., 2015).

Major advantages of plant molecular farming are lower costs and highly scalable biomass production compared to typical fermentation systems (Edgue et al., 2017; Margolin et al., 2018). Plant-based systems are generally safer because they do not bear the risks of product contamination with endotoxins, infectious viruses, and prions (Reski et al., 2015). Plants can also perform complex post-translational modifications, e.g., Asn-linked (N-) glycosylation, that can be further engineered to achieve humanized N-glycan structures (Castilho et al., 2010; Castilho and Steinkellner, 2012; Decker and Reski, 2012). Attempts to prevent plant-specific O-glycosylation and to establish de novo humanized O-glycosylation, despite current limitations, show another advantage of plant systems (Schoberer and Strasser, 2018). The necessity of large-scale and less expensive production of human blood coagulation factors, particularly factors associated with rare bleeding disorders, may be an important area for plant-based systems, as coagulation factors do not fit into the industry-favored production models.

In this review, we summarize the previous studies on recombinant production of FII, FVIII, FIX, and FXIII in plant systems. We also discuss the current challenges and provide possible solutions to overcome bottlenecks.

# FACTOR VIII

The FVIII gene spans a genomic region of 186 kb, which contains 26 exons and 25 introns (Oldenburg et al., 2004), and is transcribed into a 9-kb mRNA that encodes a single-chain protein with 2,322 amino acids (Gitschier et al., 1984). FVIII consists of six major domains (A1-A2-B-A3-C1-C2, **Figure 2**) and is mainly produced by liver sinusoidal endothelial cells (Do et al., 1999). It circulates in an inactive form bound to von Willebrand factor (vWF) until injury (Fay, 2006) and is activated by the cleavage and release of the B domain, which yields the light chain (A3-C1-C2 domains) of 80 kDa and the heavy chain (A1-A2 domains) of variable size (90–200 kDa) (Thorelli et al., 1998). The deletion of the B-domain increases the product yield but does not affect in vitro and in vivo activity (Kessler et al., 2005).

Factor VIII is heavily glycosylated, by both N- and O-glycosylation, and bears sulphated tyrosine residues (Vehar et al., 1984; Kaufman and Pipe, 1999). N-glycosylation was validated for 19 of 25 putative sites in plasma-derived FVIII (pdFVIII) (Canis et al., 2018). The MALDI-MS profiling of pdFVIII revealed that high mannose and complex-type N-glycans predominated: 16% of the population with high mannose, 67% with terminal sialic acid (both α2,3 and α2,6 linkages), and 10% with ABO(H) blood group antigens. A loss of nearly 40% of activity after deglycosylation of FVIII illustrated that glycosylation is important for biological activity (Kosloski et al., 2009). Furthermore, terminal sialic acids increase the half-life of glycoproteins in plasma by preventing the exposure of galactose or N-acetylgalactosamine (a component of O-glycans), which are recognized by the asialoglycoprotein receptor in the liver, leading to clearance from circulation (Ashwell and Harford, 1982). Due to the importance of complex post-translational modifications for FVIII activity, prokaryotic and yeast expression systems are not suitable production hosts (Pipe, 2008).

In addition to the attachment of N-linked oligosaccharides to the corresponding sites and the initial processing within the ER, FVIII binds to the chaperone proteins calnexin (CNX), calreticulin (CRT), or binding immunoglobulin protein (BiP) for proper folding (Kaufman, 1998; Molinari et al., 2003; Oda et al., 2003). After the quality control by CNX/CRT, FVIII detaches from these chaperones and can proceed in the secretory pathway (Kaufman, 1998). Here, N-linked oligosaccharides are processed and O-glycosylation and sulphation of tyrosine residues occur in the Golgi apparatus (Thim et al., 2010; Orlova et al., 2013). On the other hand, FVIII forms a stable complex with BiP, an additional key component in the unfolded protein response (UPR) pathway, in the ER lumen and this interaction is the limiting step in FVIII secretion in Chinese Hamster Ovary (CHO) cells (Dorner et al., 1988; Dorner and Kaufman, 1994). The overexpression and incorrect folding of FVIII lead to the upregulation of BiP and hence the activation of the UPR (Soukharev et al., 2002). Deletion of the B-domain, which corresponds to 38% of the sequence and contains the majority of N-linked oligosaccharides, and of a putative BiP binding site on FVIII resulted in increased FVIII secretion (Dorner et al., 1987; Saenko et al., 2003). On the other hand, mutating BiP in CHO cell lines expressing B-domain-deleted FVIII adversely affected FVIII secretion (Morris et al., 1997). These aspects have not been investigated thoroughly in plant systems.

Additionally, there are factors other than the BiP binding site and the FVIII B-domain that limit FVIII expression in human cell lines and should be taken into account to achieve efficient expression in plant-based systems. It was shown that the 1.2-kb cDNA coding for the FVIII A2 domain had detrimental effects on RNA accumulation in human cell lines (Lynch et al., 1993). Furthermore, FVIII transcription could be inhibited due to the presence of a 305-bp transcriptional silencer originating from exons 9–11 (Hoeben et al., 1995). Although the majority of the transcriptional regulation mechanisms are shared between plants and animals, there are subtle differences (Macrae and Long, 2012) and one might test whether these limiting factors in human cell lines will also be limiting in plants or plant-based systems. Another limiting factor in human cell lines is that heavy and light chains are rapidly degraded in cell culture media. This problem was overcome by the addition of vWF to the cell culture medium and/or co-expression of vWF together with FVIII, as vWF assists FVIII secretion and the association of heavy and light chains (Fay, 1988; Saenko et al., 1999). Accumulation of FVIII in plant apoplasts might prevent degradation of recombinant FVIII, and plant-based systems may therefore be advantageous for FVIII production.

The first recombinant FVIII, Recombinate®, was launched in 1992 by Genetics Institute and Baxter Healthcare Corporation. With the research on FVIII over the years, there are now more than ten recombinant FVIII concentrates on the market compared to over 40 plasma-derived FVIII concentrates [Annual Global Survey 2017 by World Federation of Hemophilia (2018)]. These are produced in CHO cells (seven products), human embryonic kidney (HEK-293) cells (two products), and baby hamster kidney (BHK) cells (two products) (Swiech et al., 2017). Although production is restricted to these three cell types, the characteristics of rFVIII, purification, and the use of animal-derived proteins as stabilizers do differ. Due to the potential risk of exposure to transmissible agents, human albumin is no longer used in the majority of products on the market except Recombinate® and Kogenate® (Swiech et al., 2017). In addition to full-length rFVIII products (Recombinate®, Kogenate®, Kogenate® FS, Advate®, Adynovate®, Kovaltry®, Jivi®), there are also B-domain truncated (Novoeight® and Afstyla®) and B-domain deleted (Refacto®, Refacto AF®, Eloctate®, and Nuwiq®) rFVIII products on the market (Swiech et al., 2017). Moreover, to achieve a longer half-life and reduce the frequency of FVIII administration, FVIII was PEGylated (Adynovate® and Jivi®) and even fused to the IgG1 Fc domain (Eloctate®) (Swiech et al., 2017). Recently, Hemlibra® (emicizumab-kxwh), a humanized bispecific monoclonal antibody that restores FVIII function by bridging FIXa and FX, has been approved by the FDA for hemophilia A without FVIII inhibitors (Oldenburg et al., 2017; Scott and Kim, 2018).

The expression of full-length FVIII was achieved in transgenic tobacco lines (Hooker et al., 1999). FVIII levels reached up to 0.002% of soluble leaf protein as confirmed by Western blot analysis. The activity of tobacco-made FVIII in chromogenic assays was 14.85 IU/mg soluble leaf protein (Hooker et al., 1999). Typical levels of FVIII production in CHO cell lines are 0.5–2 IU/ml culture medium (Orlova et al., 2017). Full-length FVIII, as well as B-domain deleted FVIII and A2 domain exchanged FVIII (the human A2 sequence was replaced with the porcine sequence due to the adverse effect of the human A2 domain on RNA accumulation), were produced in tobacco protoplasts, whole plants and calli (Hooker et al., 1999). In addition to the production in tobacco, FVIII was produced in potato (Hooker et al., 1999). In 2014, FVIII domains were produced in tobacco chloroplasts and bioencapsulated in plant cells for oral delivery to patients suffering from hemophilia A with a FVIII inhibitor. FVIII inhibitor development is a helper-T-cell dependent response upon treatment with FVIII concentrates and causes increased morbidity and mortality (Verma et al., 2010; Sherman et al., 2014). Inhibitors, which occur in 20–30% of hemophilia A patients, can be eliminated with immune tolerance-induction (ITI) therapy, in which the immune system is trained to tolerate FVIII concentrate by the repeated and frequent administration of FVIII over several months, or sometimes years (Schep et al., 2018). Orally administered tobacco cells expressing FVIII domains in a murine model of hemophilia A suppressed inhibitor formation by inducing specific populations of regulatory T cells (CD4<sup>+</sup>CD25<sup>+</sup> and CD4<sup>+</sup>CD25<sup>−</sup>; Sherman et al., 2014). Recently, full-length FVIII produced in lettuce chloroplasts reached levels up to 852 µg/g in lyophilized plant cells and its oral delivery within lettuce cells suppressed inhibitor formation in a hemophilia A mouse model (Kwon et al., 2018). Although it was shown that tobacco and lettuce chloroplast-derived foreign proteins were folded properly (Boyhan and Daniell, 2011; Zhang et al., 2017), N-glycosylation of proteins in chloroplasts is not possible. Still, the production of FVIII in different plant systems, and especially its expression and bioencapsulation in tobacco and/or lettuce cells for oral delivery to lower inhibitor formation, has a promising future with additional benefits. These benefits include the elimination of expensive cell growth and purification costs, and the suitability of freeze-dried plant cells containing the proteins for long-term storage and oral delivery without adverse effects on folding and function (Kwon et al., 2018).

#### FACTOR IX

FIX is one of the serine proteases in the blood coagulation cascade and its deficiency causes hemophilia B. FIX is produced in the liver and is a smaller and less complex protein than FVIII (Swiech et al., 2017). However, it undergoes complex posttranslational modifications (PTMs), including γ-carboxylation by γ-glutamyl carboxylase (GGCX) in the ER and proteolytic processing by paired basic amino acid cleaving enzyme (PACE, also called furin) in the Golgi apparatus (Pipe, 2008; **Figure 2**). Human pdFIX has two sites for N-glycosylation (Asn-157 and Asn-167, located in the activation peptide) and four sites for O-glycosylation: Ser-53 and Ser-61, which are fully glycosylated, and Thr-159 and Thr-169, which are only partially glycosylated (Agarwala et al., 1994; Kaufman, 1998). After these extensive PTMs, FIX is secreted into the bloodstream as an inactive zymogen (∼57 kDa). In the case of bleeding, it is activated by FXIa or FVIIa (Orlova et al., 2012).

Although there are promising attempts to cure hemophilia B via gene therapy (in July 2018, Pfizer announced the initiation of the phase III program for an investigational hemophilia B gene therapy, Identifier NCT03587116), current treatments are restricted to protein replacement therapies, as in hemophilia A. BeneFIX®, the first recombinant FIX product, was introduced to the market by Pfizer in 1997, and for a long time it was the only recombinant product on the market (Swiech et al., 2017). Currently, there are additional recombinant products on the market (Rixubis®, Alprolix®, Ixinity®, Idelvion®, and Rebinyn®). Four of these products are produced in CHO cells and one in HEK-293 cells. Ixinity is the only product with the Thr-148 polymorphism, while the rest have Ala-148 (Swiech et al., 2017). Due to concerns about the half-life of FIX concentrates, several different strategies were employed: three products with improved half-life, Alprolix (FIX-IgG1 Fc fusion), Idelvion (FIX-albumin fusion) and Rebinyn (PEGylated FIX), were approved by the FDA in 2014, 2016, and 2017, respectively (Powell et al., 2013; Santagostino, 2016; Graf, 2018).

FIX, FVII, FX, prothrombin (FII), protein C, protein S, and osteocalcin are vitamin K-dependent (VKD) proteins, and they all contain γ-carboxyglutamate (Gla), formed by the addition of a carboxyl group to a glutamic acid (Glu) residue. During this process, vitamin K hydroquinone is oxidized to vitamin K 2,3-epoxide and CO<sub>2</sub> is added to Glu. This reaction is catalyzed by GGCX, an integral membrane protein located in the ER of hepatocytes (Presnell and Stafford, 2002). GGCX recognizes propeptide regions and performs all modifications at once. In its native form, FIX contains 12 Gla residues in the so-called Gla domain; the first 10 Gla residues are conserved in all VKD proteins, whereas the last two are unique to FIX (Gillis et al., 2008). The Gla domain is crucial for the interaction with phospholipid surfaces and consequently for FIX activity (Pipe, 2008). After γ-carboxylation in the ER, FIX is further processed by PACE in the Golgi apparatus. The removal of the propeptide by PACE influences the formation of Ca<sup>2+</sup>-induced secondary and tertiary structures of the Gla domain and is thus required for normal function of FIX (Pipe, 2008). γ-carboxylation by GGCX and propeptide removal by PACE/furin do not occur in planta. Thus, co-expression of FIX, GGCX, and PACE is required for the production of bioactive FIX in plant-based systems. γ-carboxylation and propeptide removal are rate-limiting steps in FIX production: the overexpression of FIX in CHO cells resulted in 180 µg FIX/ml of culture supernatant, but only 0.8% of it was fully carboxylated (Kaufman et al., 1986). When compared to other VKD proteins, one can speculate that FIX might not be the best substrate for GGCX: in vitro γ-carboxylation analyses showed that the K<sub>m</sub> of FIX was several thousand-fold lower than other VKD proteins that have FLEEL or FLEEV peptides, which affects GGCX binding (Wu et al., 1990). Hence, to achieve increased γ-carboxylation in FIX, the propeptide of FIX might be replaced with a better one from other VKD proteins.
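The practical consequence of the CHO figures quoted above can be made concrete with a short calculation. The following is a minimal sketch; the function name is a hypothetical helper introduced only for illustration, and the two input values are the ones reported in the text (Kaufman et al., 1986).

```python
def fully_carboxylated_titer(total_ug_per_ml: float, carboxylated_percent: float) -> float:
    """Return the titer (µg/ml) of fully γ-carboxylated protein.

    Hypothetical helper for illustration; it simply applies a percentage
    to a total titer.
    """
    if not 0.0 <= carboxylated_percent <= 100.0:
        raise ValueError("carboxylated_percent must be between 0 and 100")
    return total_ug_per_ml * carboxylated_percent / 100.0


# Values quoted in the text for FIX overexpressed in CHO cells
total_fix_ug_per_ml = 180.0   # µg FIX per ml of culture supernatant
carboxylated_percent = 0.8    # percent of FIX that was fully carboxylated

active_fix = fully_carboxylated_titer(total_fix_ug_per_ml, carboxylated_percent)
print(f"Fully carboxylated FIX: {active_fix:.2f} µg/ml")
```

Under these numbers, only about 1.44 µg/ml of the 180 µg/ml secreted FIX would be fully carboxylated, which illustrates why γ-carboxylation rather than expression level is described as rate-limiting.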

Attempts to produce bioactive FIX in plants are challenging due to the absence of GGCX and PACE, and the introduction of these two genes alone might not guarantee the functionality of plant-made FIX. As described previously, GGCX converts Glu to Gla while oxidizing vitamin K hydroquinone to vitamin K epoxide. Plants are the main source of vitamin K and animals are dependent on plants (Shearer and Newman, 2008). Most likely due to the limited availability of vitamin K in animals, vitamin K epoxide has to be converted first to vitamin K quinone and then to vitamin K hydroquinone, which can be used in γ-carboxylation (Stafford, 2005). These reactions are catalyzed by the vitamin K epoxide reductase complex (VKORC) subunit 1, which is also absent in plants. Another challenge is that FIX has to be γ-carboxylated in the ER, but vitamin K is synthesized from shikimate by nine consecutive reactions taking place mainly in chloroplasts and partially in peroxisomes, and the final product, vitamin K phylloquinone, is located in the chloroplast (Reumann, 2013). Even though it was proposed that chloroplasts are metabolically coupled to the ER (Bobik and Burch-Smith, 2015), to our knowledge there is no study on the presence of vitamin K in the ER of plants. Therefore, to achieve at least a limited amount of vitamin K in the ER, one can suggest feeding the plants with vitamin K and introducing VKORC1 to ensure that enough vitamin K hydroquinone, which GGCX needs during the γ-carboxylation process, is generated. Moreover, PACE, the propeptide removal enzyme, has its own propeptide and undergoes a complex self-activation process in animals: the propeptide in PACE first undergoes Ca<sup>2+</sup>-dependent autoproteolysis in the ER and then a second, Ca<sup>2+</sup>- and acidic pH-dependent autoproteolysis in the trans-Golgi network. After these two cleavages, the propeptide is completely removed from PACE and PACE is converted into the active form (Anderson et al., 1997, 2002).
Attempts to produce PACE without the propeptide resulted in non-functional PACE because the propeptide regulates folding of the protein (Ware et al., 1989; Bristol et al., 1993). Consequently, expression of full-length PACE/furin and its Ca<sup>2+</sup>- and acidic pH-dependent autoproteolysis are also important for plant-based systems. During maturation of FIX, GGCX has to act before PACE because if PACE acts first, GGCX cannot bind and modify FIX. Thus, the correct intracellular localization of GGCX and PACE, if introduced into plant-based systems, has to be ensured. There is no report on the introduction of GGCX in plants yet, but one study reported the successful introduction of PACE/furin in Nicotiana benthamiana and confirmed its activity on transforming growth factor-β1 (Wilbers et al., 2016). Based on this report, it can be anticipated that PACE can also activate itself by Ca<sup>2+</sup>- and acidic pH-dependent autoproteolysis in plants, despite the differences in organelle pH between plants and animals (Shen et al., 2013).

There have been several attempts to produce FIX in plant-based systems. The first study aimed to accumulate FIX in tomato fruits and reached up to 0.01584 mg FIX/g fresh weight fruit (Zhang et al., 2007). In the second study, FIX was introduced into soybean and the highest FIX levels were 800 mg/kg of soybean seeds (Cunha et al., 2011). However, plant-made FIX proteins failed to show any activity because only FIX (without GGCX, PACE, and VKORC1) was expressed. In another study, analogous to the bioencapsulation of FVIII in lettuce cells (Kwon et al., 2018), FIX was produced in lettuce chloroplasts, and oral delivery of FIX bioencapsulated in lettuce cells to a murine model of hemophilia B suppressed inhibitor formation (Su et al., 2015).
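Because the two yields above are reported in different units, a like-for-like comparison needs a unit conversion first. The sketch below shows one way to do it; the helper name is hypothetical, and the comparison remains approximate because the tomato value refers to fresh fruit weight whereas the soybean value refers to seed mass.

```python
def mg_per_g_to_mg_per_kg(value_mg_per_g: float) -> float:
    """Convert a yield from mg per g of tissue to mg per kg of tissue."""
    return value_mg_per_g * 1000.0


# Yields quoted in the text
tomato_fix_mg_per_g = 0.01584     # mg FIX per g fresh weight fruit (Zhang et al., 2007)
soybean_fix_mg_per_kg = 800.0     # mg FIX per kg soybean seeds (Cunha et al., 2011)

tomato_fix_mg_per_kg = mg_per_g_to_mg_per_kg(tomato_fix_mg_per_g)
fold_difference = soybean_fix_mg_per_kg / tomato_fix_mg_per_kg

print(f"Tomato:  {tomato_fix_mg_per_kg:.2f} mg/kg fresh weight")
print(f"Soybean: {soybean_fix_mg_per_kg:.2f} mg/kg seeds")
print(f"Soybean/tomato ratio: {fold_difference:.1f}-fold")
```

On a common mg/kg basis the tomato yield is about 15.84 mg/kg, so the soybean yield is roughly 50 times higher, with the caveat about tissue basis noted above.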

It is still possible to achieve in vitro γ-carboxylation (Hubbard et al., 1989), but it has several disadvantages. Isolation of liver microsomes, purification of GGCX from microsomes, control and completeness of the in vitro γ-carboxylation assay, heterogeneity of end products, the necessity for further purification, and the associated costs make in vitro γ-carboxylation inapplicable. The production of bioactive FIX in plant-based systems is currently not possible due to the complex post-translational modifications required. Once the current challenges have been overcome, plant-based systems might become a good alternative production host for this blood coagulation factor as well.

#### FACTOR XIII

FXIII is a transglutaminase that stabilizes the fibrin clot by crosslinking fibrin monomers and protecting the clot from fibrinolytic degradation (Kaufman and Pipe, 1999; Lovejoy et al., 2006). It circulates in the plasma as a tetramer of two A and two B subunits (Komáromi et al., 2011; **Figure 2**). FXIII-B subunits are also present in plasma in free form and in excess (Komáromi et al., 2011). FXIII deficiency causes a rare bleeding disorder, affecting one in 1–5 million people, due mainly to mutations in the FXIII-A subunit (Karimi et al., 2009). Patients suffering from this deficiency require life-long protein replacement therapy (Komáromi et al., 2011).

The A subunit of FXIII has no signal sequence, no N-linked glycosylation, and no disulfide bonds (Mosher, 2014). Thus, in this case yeast systems can be preferred as production hosts over plants and animals. Accordingly, the recombinant product on the market (rFXIII-A<sub>2</sub>) is produced in Saccharomyces cerevisiae by Novo Nordisk. A recent trial showed that yeast-made rFXIII-A<sub>2</sub> prevented bleeding and did not cause the formation of any non-neutralizing or neutralizing antibodies in patients (Carcao et al., 2018). The A subunit was also successfully expressed in tobacco cell suspensions (NT-1) and in tobacco plants (Gao et al., 2004). Tobacco-made FXIII-A reached up to 1.8% of the total extracted soluble leaf protein and showed FXIII-specific activity of 258 U/g of soluble protein, while human plasma-derived FXIII has an activity of 50.2 U/mg of soluble protein (Gao et al., 2004). Moreover, like human plasma-derived FXIII, it crosslinked human fibrin and produced dimers and multimers (Gao et al., 2004). To our knowledge, there was no follow-up study.
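The two specific activities above are given per gram and per milligram of protein, respectively, so they are easier to compare after conversion to a common unit. The following is a minimal sketch with a hypothetical helper name. It assumes both values refer to total soluble protein as stated in the text; it does not correct for the fact that the tobacco value was measured in a crude leaf extract.

```python
def units_per_g_to_units_per_mg(value_u_per_g: float) -> float:
    """Convert a specific activity from U per g protein to U per mg protein."""
    return value_u_per_g / 1000.0


# Specific activities quoted in the text (Gao et al., 2004)
tobacco_fxiii_u_per_g = 258.0    # U per g soluble protein, tobacco-made FXIII-A
plasma_fxiii_u_per_mg = 50.2     # U per mg soluble protein, plasma-derived FXIII

tobacco_fxiii_u_per_mg = units_per_g_to_units_per_mg(tobacco_fxiii_u_per_g)
fold_lower = plasma_fxiii_u_per_mg / tobacco_fxiii_u_per_mg

print(f"Tobacco-made FXIII-A: {tobacco_fxiii_u_per_mg:.3f} U/mg")
print(f"Plasma-derived FXIII: {plasma_fxiii_u_per_mg:.1f} U/mg")
print(f"Plasma-derived/tobacco-made ratio: {fold_lower:.0f}-fold")
```

Expressed in U/mg, the tobacco extract reaches 0.258 U/mg, roughly 195-fold below the plasma-derived reference. Part of this gap is expected, since FXIII-A made up at most 1.8% of the soluble leaf protein.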

#### PROTHROMBIN

Prothrombin (FII), a vitamin K-dependent protein, is a 72 kDa protein composed of 579 amino acids (Mann et al., 2003). It comprises four major domains: a Gla domain, two kringle domains and a catalytic domain (Soriano-Garcia et al., 1992). Like FIX, it is modified by GGCX and PACE. It has four cleavage sites: R<sup>155</sup>, R<sup>271</sup>, R<sup>284</sup>, and R<sup>320</sup> (Suttie and Jackson, 1977; **Figure 2**). Two of these (R<sup>284</sup> and R<sup>320</sup>) are cleaved by FXa and the other two are thrombin-specific autoproteolytic sites. Depending on the cleavages, prethrombin 1, prethrombin 2, meizothrombin, fragment 1, and fragment 1.2 are generated from prothrombin (Galli and Barbui, 1999). The most important of these products is prethrombin 2, an inactive FII intermediate that is further cleaved by FXa to form active thrombin (Boskovic et al., 1990). This active protein can be used in any surgical procedure to control excessive intra-operative bleeding (Croxtall and Scott, 2009). Therefore, rather than producing full-length FII, attempts have concentrated on the production of functional isoforms. In 2008, Recothrom®, an rThrombin produced in CHO cells by Zymogenetics, received United States market approval (Ratner, 2008).

Expression of prothrombin and prethrombin 2 is possible in tobacco plants, but the specific activity of tobacco-made prethrombin 2 was not determined (Hooker et al., 1999). Although GGCX and PACE activities are required for the maturation of prothrombin, active thrombin contains no Gla residue. Moreover, there is only one N-linked glycosylation site in the active form. For these reasons, plants could still be an alternative platform for the production of prothrombin or prethrombin 2, even without expression of GGCX and PACE, as long as the plant-made protein can be activated by factor Xa. However, this suggestion has to be tested, and the costs, especially for in vitro activation, have to be compared with those of other production hosts, especially human cell lines.

#### OUTLOOK

Despite an increasing interest in using plants as alternative expression hosts, studies on plant-made blood clotting factors show that plant-based systems will not be a competitive alternative platform until the current problems are solved. Yields of recombinant plant-made blood clotting factors can be enhanced by modulating many factors such as promoter and terminator activities, codon optimization of the transgene, matrix attachment regions, mRNA stability, translational efficiency, vector size, viral systems, and silencing suppressors. Production of bioactive vitamin K-dependent coagulation factors (FII, FVII, FIX, and FX) has not been achieved in plant-based systems yet. Vitamin K supplementation combined with co-expression of FIX, GGCX, PACE, and VKORC1, which has not been reported yet, could help to overcome current bottlenecks. Improvements in the quality of plant-made pharmaceuticals are as important as maximizing the product yield (Buyel et al., 2015). Optimization of downstream processes and the associated costs have not been addressed in previous plant-made clotting factor studies. To make plant-based systems competitive alternatives, upstream as well as downstream processes have to be performed in an advanced and cost-effective manner compared to traditional expression systems.

Glycosylation can affect the structure and function of a protein, and differences between plant and human glycosylation can induce immune responses in patients (Kosloski et al., 2009). Although the principles of protein Asn-linked (N) glycosylation and N-glycan core structures are identical between plants and humans, there are differences in specific proximal and terminal sugar residues within the glycan structures. Plant-specific α1,3-fucose and β1,2-xylose residues can be immunoreactive in humans, hence it is desirable to produce proteins with humanized glycosylation (Bardor et al., 2003; Gomord et al., 2010; Decker et al., 2014; Shaaltiel and Tekoah, 2016). For this purpose, the first step was to eliminate plant-specific sugar residues by glycoengineering. This was achieved for the first time by knocking out FucT and XylT genes in Arabidopsis thaliana (Strasser et al., 2004) and in Physcomitrella patens (Koprivova et al., 2004). Owing to the high rate of homologous recombination in the moss P. patens, FucT and XylT were knocked out and human β-1,4-galactosyltransferase was stably introduced into the moss genome a year later (Huether et al., 2005). More recently, with the advent of CRISPR/Cas9 technology, six genes responsible for xylose and fucose residues were knocked out in Nicotiana tabacum BY-2 suspension cells (Hanania et al., 2017; Mercx et al., 2017) and in N. benthamiana (Jansing et al., 2018). Moreover, overexpression of mammalian sialic acid biosynthesis pathway genes in N. benthamiana enabled in vivo sialylation (Castilho et al., 2010; Castilho and Steinkellner, 2012).

Unlike N-glycosylation, which is partially conserved between all eukaryotes, plant O-glycosylation is fundamentally different from the typical human mucin-type O-glycosylation. In mucins, O-glycans are attached via an N-acetylgalactosamine to the hydroxyl group of serine or threonine residues in the Golgi apparatus. Neither these O-glycans nor the glycosyltransferases for the human mucin-type O-glycosylation are present in plants (Taylor et al., 2012; Tryfona et al., 2012; Saito et al., 2014). Although a single Gal attachment to Ser residues on specific proteins is observed in plants, mainly arabinose chains and complex arabinogalactans are attached to 4-trans-hydroxyproline (Hyp), whereas no modification on Hyp is observed in mammals (Strasser, 2013). Hence, the elimination of non-human prolyl-hydroxylation, which might potentially cause an immunogenic response in patients, can be a safe strategy to avoid adverse effects of plant-made pharmaceuticals. Addition of the prolyl 4-hydroxylase inhibitor 2,2-dipyridyl to N. tabacum Bright Yellow-2 suspension cultures abolished proline hydroxylation and arabinosylation (Yang et al., 2012). Moreover, targeted knockout of P4H1, the gene responsible for non-human prolyl-hydroxylation of human erythropoietin recombinantly produced in P. patens, eliminated the attachment of plant-specific O-glycans (Parsons et al., 2013). The success of humanized N-glycosylation in plants (Castilho et al., 2010, 2013; Castilho and Steinkellner, 2012) inspires studies on de novo N-acetylgalactosamine (GalNAc) type O-glycosylation in plants. The attachment of a single GalNAc residue was achieved for the first time with the transient expression of UDP-GlcNAc 4-epimerase (Yersinia enterocolitica), a UDP-GlcNAc/UDP-GalNAc transporter (Caenorhabditis elegans), and human GalNAc-T2 in N. benthamiana (Daskalova et al., 2010). Subsequently, transgenic A. thaliana and tobacco BY-2 cells were generated (Yang et al., 2012). Despite the significant progress, there are still limitations to achieving efficient humanized O-glycosylation in plant-made pharmaceuticals, such as heterogeneous plant-produced O-glycan structures, formation of core 2, 3, and 4 structures, and optimization of sub-Golgi targeting of mammalian glycosyltransferases in plants (Strasser, 2013). Taken together, several bioengineering challenges have to be addressed to properly evaluate the competitiveness of plant-made clotting factors and plant-based systems in this field of biopharmaceuticals.

#### CONCLUSION

Plant-based systems are highly scalable, cost-effective, GMP-compliant, and safer than animal systems. They are successful in humanized N-glycosylation, and promising progress in mucin type O-glycans revealed that they are becoming competing alternatives to mammalian cells. Although studies summarized in this review show that plant-based systems are able to produce some bioactive blood coagulation factors, to the best of our knowledge, there were no further studies to fully characterize the potential of plant-made clotting factors in pre-clinical and clinical trials. The production of functional FIX and prothrombin is still challenging, as γ-carboxylation was not yet achieved in plant-based systems. Approaches aiming to acquire plant-made coagulation factors with enhanced pharmacological properties by engineering the glycosylation pathway as well as optimizing plant-based systems to meet the demands of industry norms will reveal the potential of plant systems.

#### AUTHOR CONTRIBUTIONS

OT, ED, and RR wrote the manuscript. UG contributed critical comments to the draft. All authors have read and approved the final manuscript.

#### FUNDING

We acknowledge funding by the Excellence Initiative of the German Federal and State Governments (EXC 294 to RR, GSC 4 to OT).

#### ACKNOWLEDGMENTS

We thank Anne Katrin Prowse for language editing.

#### REFERENCES

cells from endoplasmic reticulum stress but is not required for the secretion of selective proteins. J. Biol. Chem. 272, 4327–4334. doi: 10.1074/jbc.272.7.4327


prevents inhibitor formation and fatal anaphylaxis in hemophilia B mice. Proc. Natl. Acad. Sci. U.S.A. 107, 7101–7106. doi: 10.1073/pnas.0912181107


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Top, Geisen, Decker and Reski. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Plant-Produced Chimeric VHH-sIgA Against Enterohemorrhagic E. coli Intimin Shows Cross-Serotype Inhibition of Bacterial Adhesion to Epithelial Cells

Reza Saberianfar<sup>1,2</sup>, Adam Chin-Fatt<sup>1,2</sup>, Andrew Scott<sup>1</sup>, Kevin A. Henry<sup>3</sup>, Edward Topp<sup>1,2</sup> and Rima Menassa<sup>1,2</sup>\*

<sup>1</sup> Agriculture and Agri-Food Canada, London Research and Development Centre, London, ON, Canada, <sup>2</sup> Department of Biology, University of Western Ontario, London, ON, Canada, <sup>3</sup> Human Health Therapeutics Research Centre, National Research Council Canada, Ottawa, ON, Canada

#### Edited by:

Anneli Marjut Ritala, VTT Technical Research Centre of Finland Ltd., Finland

#### Reviewed by:

Arjen Schots, Wageningen University & Research, Netherlands Hugh S. Mason, Arizona State University, United States

> \*Correspondence: Rima Menassa rima.menassa@canada.ca

#### Specialty section:

This article was submitted to Plant Biotechnology, a section of the journal Frontiers in Plant Science

Received: 04 January 2019 Accepted: 19 February 2019 Published: 12 March 2019

#### Citation:

Saberianfar R, Chin-Fatt A, Scott A, Henry KA, Topp E and Menassa R (2019) Plant-Produced Chimeric VHH-sIgA Against Enterohemorrhagic E. coli Intimin Shows Cross-Serotype Inhibition of Bacterial Adhesion to Epithelial Cells. Front. Plant Sci. 10:270. doi: 10.3389/fpls.2019.00270

Enterohemorrhagic Escherichia coli (EHEC) has consistently been one of the foremost foodborne pathogen threats worldwide based on the past 30 years of surveillance. EHEC primarily colonizes the bovine gastrointestinal (GI) tract from which it can be transmitted to nearby farm environments and remain viable for months. There is an urgent need for effective and easily implemented pre-harvest interventions to curtail EHEC contamination of the food and water supply. In an effort to address this problem, we isolated single-domain antibodies (VHHs) specific for intimin, an EHEC adhesin required for colonization, and designed chimeric VHH fusions with secretory IgA functionality intended for passive immunotherapy at the mucosal GI surface. The antibodies were produced in leaves of Nicotiana benthamiana with production levels ranging between 1 and 3% of total soluble protein. In vivo assembly of all subunits into a hetero-multimeric complex was verified by co-immunoprecipitation. Analysis of multivalent protection across the most prevalent EHEC strains identified one candidate antibody, VHH10-IgA, that binds O145:Hnm, O111:Hnm, O26:H11, and O157:H7. Fluorometric and microscopic analysis also indicated that VHH10-IgA completely neutralizes the capacity of the latter three strains to adhere to epithelial cells in vitro. This study provides proof of concept that a plant-produced chimeric secretory IgA can confer cross-serotype inhibition of bacterial adhesion to epithelial cells.

Keywords: EHEC, SIgA, VHH, chimeric antibody, molecular farming

#### INTRODUCTION

Consumption of enterohemorrhagic Escherichia coli (EHEC) via contaminated food or water is associated with intestinal hemorrhage and osmotic dysregulation (Kandel et al., 1989). Each year, EHEC is estimated to affect approximately 230,000 people in the United States and is the fourth most frequently isolated food-borne pathogen from clinical stool samples (Hale et al., 2012).

Approximately 73,000 EHEC infections are caused by the O157:H7 serotype which has consistently been the most prevalent and virulent EHEC serotype over the approximately 30 years of United States national surveillance (Hale et al., 2012). Six additional EHEC serogroups, O26, O45, O103, O111, O121, and O145, known as the "Big Six," generally comprise >90% of non-O157 infections during any given year and have been traced to at least 22 human disease outbreaks in the United States since 1990 (CDC, 2017).

The gastrointestinal (GI) tract of cattle is considered the primary reservoir of EHEC, which can contaminate various food or water supplies via excreted fecal matter or after slaughter during processing of the carcass (Montenegro et al., 1990; Beutin et al., 1993). Indeed, cattle density has been identified as a primary risk factor for the incidence of local EHEC infections (Brehony et al., 2018). In accord with the "One Health" framework, virtually all strategic interventions to prevent EHEC transmission to humans have focused on minimizing colonization of cattle, thereby reducing the risk of contamination from fecal shedding or at harvest. In cattle, EHEC principally adheres to and colonizes the lymphoid follicle-dense mucosa at the terminal rectum known as the rectoanal junction (Phillips et al., 2000; Naylor et al., 2003; Lim et al., 2007). The adhesin protein known as intimin mediates interaction of the bacteria with uninfected host epithelial cells and is a necessary prerequisite for intimate bacterial adhesion and colonization (Frankel et al., 1994).

The use and efficacy of recombinant secretory immunoglobulin A (sIgA) in passive mucosal immunotherapy is well established (Enriquez and Riggs, 1998; Virdi et al., 2013; Nakanishi et al., 2017; Vanmarsenille et al., 2018). Because sIgA application can impart immediate, albeit transient, protection from a pathogen, it may be of value to beef producers and processors as a pre-harvest intervention for EHEC. In the GI tract, sIgA primarily functions to clear pathogens by immune exclusion: after binding to its target, glycans on the secretory component (SC) facilitate binding to the mucus lining of the GI tract enabling clearance of sIgA–pathogen complexes by peristalsis (Macpherson et al., 2008). A sIgA directed against intimin would thus be expected to prevent luminal EHEC cells from interacting with the host epithelium, clearing them by entrapment in the mucus layer and subsequent fecal shedding.

Structurally, sIgA consists of an IgA dimer linked by two additional chains: a 15-kDa joining chain (JC) that links the IgA Fc end-to-end (Krugmann et al., 1997) and a 70-kDa SC that coils around both Fc chains (Bonner et al., 2007). A plant production platform is currently the most suitable for producing recombinant sIgA because of the requirement for glycosylation and disulfide bond formation for proper folding and assembly of sIgA subunits as well as higher relative yields and the prospect of oral delivery (Wycoff, 2005).

With the intent of blocking the interaction of EHEC with the intestinal mucosa, we immunized a llama with the C-terminal 277 residues of intimin, which extend extracellularly from the bacterial cell and mediate interaction with intestinal epithelial cells via binding to its cognate translocated receptor (Frankel et al., 1994). We produced and panned a phage-displayed library of llama heavy chain only antibody variable domains (VHHs) and identified VHHs that could bind and neutralize intimin γ, the main subtype associated with O157:H7. With passive mucosal immunotherapy and diagnostic development as end goals, we developed a chimeric antibody by fusing each of the isolated VHHs to a bovine IgA Fc and co-expressing them with both JC and SC subunits to enable sIgA functionality. Unlike native mammalian sIgA which consists of four light chains, four heavy chains, one JC, and one SC, the chimeric antibody (VHH-sIgA) is composed of four VHH-Fc heavy chain-only subunits, one SC, and one JC (**Figure 1A**). We demonstrated the recombinant production and correct assembly of the chimeric VHH-sIgA against intimin γ in Nicotiana benthamiana. We also characterized its binding to and neutralization of the O157:H7 serotype as well as the "Big Six" serotypes. This study is notable because of the potential for development of an oral passive mucosal immunotherapeutic capable of multivalent protection, and as a diagnostic tool for detection of four of the seven most prevalent EHEC strains.

## RESULTS

## Camelid VHHs Recognize EHEC O157:H7 Intimin With High Affinity

We selected the C-terminal 277 residues of intimin (Int-277) as the intended VHH-sIgA target because this region has previously been demonstrated to be immunogenic and to elicit IgAs (McKee and O'Brien, 1996; Gansheroff et al., 1999). Int-277 of EHEC O157:H7 strain EDL933 intimin γ was fused to maltose-binding protein (MBP) and the resulting fusion (MBP-Int277) was expressed in E. coli BL21 (DE3) cells by GenScript USA, Inc. (NJ, United States; **Supplementary Figure S1A**). Characterization of the MBP-Int277 protein by SDS-PAGE, western blotting, size exclusion chromatography, and ELISA using an anti-intimin antibody (Carvalho et al., 2005) collectively suggested that MBP-Int277 had the expected size (73.8 kDa), was monodisperse, and was correctly folded (**Supplementary Figures S1B–E**). Because camelids can generate heavy chain-only antibodies, a llama was immunized with MBP-Int277. A competition ELISA using a polyclonal anti-intimin antibody (Carvalho et al., 2005) suggested that the polyclonal antibody response in the llama was directed substantially toward Int277 rather than MBP, based on almost complete knock down of the anti-intimin antibody in the presence of serum from the immunized llama (**Supplementary Figure S1E**). A phage-displayed VHH library was produced and panned for intimin-specific VHH sequences. Of five VHHs, four (VHH1, 3, 9, and 10) showed low-nanomolar monovalent binding affinities for intimin based on surface plasmon resonance (SPR) and had no binding to MBP alone (**Figure 1C**).

## Production of Chimeric sIgA Subunits in N. benthamiana

To produce chimeric VHH-IgAs, each VHH sequence was fused to a bovine IgA Fc (VHH-Fc). Each of the VHH-Fc, SC, and JC subunits was fused to the PR1b signal peptide and KDEL retrieval signal peptide to enable ER targeting and localization, as well as c-Myc, FLAG, and HA tags, respectively, to enable separate detection of the subunits upon co-expression (**Figures 1A,B**). For production in plant leaf tissue, the constructs were codon-optimized for N. benthamiana nuclear expression, cloned separately into pEAQ-DEST-1 (Sainsbury et al., 2009) plant expression vectors, and verified by DNA sequencing. Agrobacterium tumefaciens was transformed with each construct, then N. benthamiana leaves were co-infiltrated with each Agrobacterium strain along with an Agrobacterium strain containing p19, a suppressor of gene silencing from Cymbidium ringspot virus (CymRSV) (Silhavy et al., 2002; Saberianfar et al., 2015). Accumulation of each subunit at 6 days post-infiltration (dpi) was evaluated by western blotting using either anti-c-myc, anti-FLAG, or anti-HA antibodies to detect the respective subunits (**Figures 1E–G**). All subunits appeared to be of slightly higher molecular mass than their respective predicted molecular weights based on amino acid residues only, presumably due to glycosylation (**Figure 1D**). In the fully assembled native sIgA complex, each of the VHH-Fc, SC, and JC chains are predicted to have three, three, and one N-glycosylation sites, respectively (Steentoft et al., 2013). The glycans on native sIgA have been shown to protect the structure from proteolytic degradation in the harsh mucosal environment and may also exhibit some neutralization capacity against some bacterial strains by sterically hindering attachment of sugar-dependent receptors or fimbriae to epithelial cells (Wold et al., 1990; Ruhl et al., 1996; Royle et al., 2003).

FIGURE 1 | […] pathogenesis-related protein 1b signal peptide; VHHx-Fc, fusion of a camelid-derived VHH to a bovine Fc where x is either 1, 3, 9, or 10, corresponding to the isolated VHHs; SC, bovine secretory component; JC, bovine JC; c-Myc, FLAG, HA, detection tags; KDEL, endoplasmic reticulum retrieval tetra-peptide; CPMV 3′UTR, 3′-untranslated region of Cowpea mosaic virus; nos, nopaline synthase terminator sequence; the cassettes were cloned into pEAQ-DEST-1 plant expression vectors. Schematic not drawn to scale. Bold outlines indicate translated regions. (C) Monovalent affinities and kinetics of the interaction between VHHs and MBP-Int277 by SPR (pH 7.4, 25 °C). (D) Predicted protein size and number of glycosylation sites for each subunit. (E–G) Western blots of crude extract from leaves of N. benthamiana harvested at 6 dpi expressing VHH1, 3, 9, and 10-Fc along with p19, a suppressor of gene silencing (E), SC (F), and JC (G). 10 µg of TSP was loaded in each lane.

#### Optimizing Co-expression of Chimeric sIgA Subunits

The correct assembly of the chimeric sIgA into a heteromultimeric protein complex likely requires the nascent polypeptides to be temporally and spatially coordinated in a predicted 4:1:1 stoichiometric ratio of VHH-Fc:SC:JC. To optimize the conditions for producing the assembled complex, we tested a range of Agrobacterium ratios for co-infiltration in N. benthamiana leaves (VHH-Fc:SC:JC of 1:1:1, 4:1:1, 4:1:2). We obtained accumulation levels of the subunits (g/kg) closest to the 4:1:1 present in the assembled sIgA with an Agrobacterium ratio of 4:1:1. Infiltration cultures were prepared by mixing Agrobacterium strains containing VHH3-Fc or VHH9-Fc with Agrobacterium strains containing SC, JC, and p19 at optical densities (OD at A600) of 0.57, 0.14, 0.14, and 0.14, respectively. The accumulation levels of each subunit were measured from 4 to 8 dpi. Accumulation levels of all three subunits in both infiltration mixtures were highest at 8 dpi (**Figures 2A,B**). VHH9-Fc mixtures reached the highest accumulation levels for all three subunits with VHH9-Fc at 0.22 g/kg, SC at 0.08 g/kg, and JC at 0.04 g/kg, resulting in a total of approximately 0.34 g/kg for sIgA subunits, which when converted to molar ratios result in 4.2:1:1.6 (VHH9-Fc:SC:JC). This combination was the closest to the expected 4:1:1 molar ratio required for assembly of sIgA and should allow for in vivo assembly of the subunits into a hetero-multimeric protein complex.
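The conversion from mass accumulation to a molar ratio can be sketched as follows. This is a minimal illustration, not the authors' calculation: the subunit masses are the approximate apparent sizes quoted elsewhere in the text (about 44 kDa for VHH9-Fc, 66 kDa for SC, and 20 kDa for JC), whereas the reported 4.2:1:1.6 presumably rests on the predicted sizes in Figure 1D, so the output is only expected to be close to that value.

```python
# Approximate subunit masses in kDa. These are assumptions taken from the
# apparent band sizes quoted in the text, not the predicted sizes of Figure 1D.
MASS_KDA = {"VHH9-Fc": 44.0, "SC": 66.0, "JC": 20.0}

# Accumulation at 8 dpi in g per kg fresh weight, as reported in the text.
ACCUMULATION_G_PER_KG = {"VHH9-Fc": 0.22, "SC": 0.08, "JC": 0.04}


def molar_ratio(accumulation, mass_kda, reference="SC"):
    """Return molar amounts of each subunit relative to the reference subunit."""
    # g/kg divided by kDa gives mmol per kg fresh weight.
    mmol_per_kg = {name: accumulation[name] / mass_kda[name] for name in accumulation}
    ref = mmol_per_kg[reference]
    return {name: value / ref for name, value in mmol_per_kg.items()}


ratio = molar_ratio(ACCUMULATION_G_PER_KG, MASS_KDA)
for name, value in ratio.items():
    print(f"{name}: {value:.2f}")
```

With these assumed masses the ratio comes out near 4.1:1:1.7, in line with the reported excess of JC over the ideal 4:1:1.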

FIGURE 2 | Optimization of the accumulation levels of VHH-sIgA subunits after transient transformation of N. benthamiana leaves. (A) N. benthamiana leaf tissue was infiltrated with Agrobacterium mixtures containing the following ODs (A600); VHH-Fc: 0.57, SC: 0.14, JC: 0.14, and p19: 0.14. Leaf tissue was collected 4–8 dpi. TSP was extracted from pooled samples from three independent biological replicates. Ten micrograms of TSP was loaded in each well, separated by SDS-PAGE under reducing conditions, and visualized by western blot. Known amounts of c-Myc, HA, and FLAG tagged protein were used as reference (not shown). TSP from p19-infiltrated N. benthamiana leaves was used as a negative control. (B) Quantification of (A) by densitometry. (C) Time-course (6–12 dpi) of VHH10-Fc accumulation in combination with SC and JC.

Since the highest accumulation levels for VHH3-sIgA and VHH9-sIgA were reached at 8 dpi, the accumulation levels of VHH10-sIgA were monitored beyond 8 dpi to examine if higher accumulation could be achieved. We used similar Agrobacterium ODs for infiltration mixtures (0.57:0.14:0.14:0.14) as in the previous experiment, and monitored the accumulation of VHH10-Fc up to 12 dpi (**Figure 2C**). Over the course of the experiment, VHH10-Fc accumulated well up to 12 dpi. The accumulation of VHH10-Fc reached 0.12 g/kg fresh weight (FW).

#### The Chimeric sIgA Subunits Associate in vivo

Subunits of native sIgA are known to be covalently linked by disulfide bonds. To determine if the co-expressed subunits were physically associating, crude extracts of leaves infiltrated with VHH3-Fc/SC/JC were immunoprecipitated with the c-Myc antibody specific to the VHH-Fc subunit. The immunoprecipitated proteins were detected on a western blot with either anti-FLAG antibody specific to the SC subunit (**Figures 3A,C**) or anti-HA antibody specific to the JC subunit (**Figures 3B,D**). When the proteins were separated under reducing PAGE conditions, the ∼70-kDa SC subunit was detected in the extracts containing SC only and those containing all three subunits. However, after co-immunoprecipitation (co-IP), SC was only detected in the treatment containing all three subunits (**Figure 3A**), indicating that it was associated with the VHH-Fc subunit. Similarly, the ∼20-kDa JC was detected in extracts containing JC only and those containing all three subunits, but after co-IP, JC was only detected in the treatment containing all three subunits (**Figure 3B**). When the same samples were separated by non-reducing PAGE, SC expressed alone appeared as a main band at 70-kDa, with a ladder of larger products presumably representing multimerization via non-specific disulfide bond formation. When all three subunits were present in the cell extract (VHH3-Fc/SC/JC), several other intermediate products were detected. After co-IP, bands were only observed in extracts containing all three subunits, including a band running around 250 kDa, the expected size of the fully assembled chimeric sIgA (**Figure 3C**). Similarly, upon detection with anti-HA, we observed several faint bands representing JC multimers, and in the VHH3-Fc/SC/JC lane, JC monomer (shown with an arrow) and several other intermediate products were detected. As expected, after co-IP and detection with anti-HA, the same bands were only observed in the treatment containing all three subunits (**Figure 3D**). 
Co-IP experiments were performed with all constructs and similar results were consistently observed in every case (data not shown).

### Secretory IgA Subunits Assemble Into a Hetero-Multimeric Protein Complex in vivo

Although we determined that VHH-Fc accumulation peaks at 8 dpi, it was not clear how fast the subunits assemble into the chimeric sIgA complex. Therefore, a time-course experiment was performed in which leaf tissue was collected every 2 days from 4 to 12 dpi, separated by SDS-PAGE under nonreducing conditions, and detected with anti-HA (**Figure 4A**), anti-c-Myc (**Figure 4B**), and anti-FLAG (data not shown) antibodies. The results indicated that assembly of the chimeric sIgA and intermediates was gradual and continued through 12 dpi (**Figure 4A**, arrows 1–3; **Figure 4B**, arrows 1–2), while monomeric JC and a 90-kDa intermediate (**Figure 4A**, arrows 4–5) and monomeric VHH9-Fc and an 80-kDa intermediate (**Figure 4B**, arrows 3–4) showed diminishing accumulation across the same period. Taken together, these data suggest that the chimeric sIgA assembled with time, and that a later harvest may be beneficial.

# Vacuum Infiltration and Purification of VHH9-sIgA

To characterize the antigen- and pathogen-binding of chimeric VHH9-sIgA, large quantities of the purified assembled protein complex were required. Therefore, N. benthamiana plants were vacuum-infiltrated, and leaves were collected at 12 dpi. While IgG is routinely purified using protein A or protein G resins, there are no efficient methodologies available for purifying IgA molecules lacking light chains. Therefore, we compared two methods for purifying VHH9-sIgA. The first method took advantage of a peptide derived from a surface protein of Streptococcus pyogenes, peptide M, which binds to the Fc region of bovine IgA. The second purification method used an affinity resin that binds the FLAG tag we fused to SC (**Figure 5**).

In the crude leaf extract, VHH9-sIgA was the main product observed on a western blot detected with the c-Myc antibody (**Figure 5A**, extract lane, arrow 1). When purified with peptide M, unassembled and partially multimerized VHH9-Fc polypeptides were heavily enriched, and several bands were observed (**Figure 5A**). The strongest bands belonged to monomeric (∼44 kDa) and dimeric (∼88 kDa) VHH9-Fc (**Figure 5A**, arrows 4 and 5). In addition, three other bands were observed that correspond to the trimeric (∼132 kDa) and tetrameric (∼176 kDa) VHH9-Fc, and a fainter band representing the fully assembled chimeric sIgA (∼270 kDa) (**Figure 5A**, bands 3, 2, and 1, respectively). This method of purification was efficient and allowed recovery of 0.6 mg/ml of c-Myc-reactive antibody fragments, as estimated by whole lane densitometry against known amounts of a standard protein. However, purification with peptide M preferentially recovered monomeric and dimeric VHH9-Fc compared with fully assembled VHH9-sIgA.

We also used anti-FLAG agarose for purification of VHH9-sIgA in a second attempt to enrich for the fully assembled chimeric sIgA. We hypothesized that this method of purification should allow a higher recovery of the fully assembled sIgA since the FLAG tag is located on the SC subunit, which is wrapped around the fully assembled sIgA complex. After western blotting and detection with anti-FLAG antibody, we observed several bands. The strongest band belonged to free or monomeric SC (∼66 kDa) (**Figure 5B**, arrow 5), but the fully assembled VHH9-sIgA band (∼270 kDa) was much more prominent than following peptide M purification (**Figure 5B**, arrow 1). Three other bands were also recovered, which we speculate belong to sIgA intermediate products such as SC/trimeric VHH9-Fc/JC (∼206 kDa), SC/dimeric VHH9-Fc (∼160 kDa), and SC/VHH9-Fc (∼110 kDa) (**Figure 5B**, bands 2, 3, and 4, respectively). However, whole lane densitometry indicated that the purification with anti-FLAG agarose recovered much less protein (0.014 mg/ml) than peptide M (0.6 mg/ml).
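As a rough check on the band assignments, the expected mass of each species can be summed from its subunits. The sketch below assumes the approximate subunit sizes quoted in the text (44 kDa for VHH9-Fc, 66 kDa for SC, 20 kDa for JC). The sums are only nominal: glycosylation and non-reducing gel migration mean the apparent sizes reported above differ from them by several kDa.

```python
# Approximate subunit masses in kDa (assumptions taken from the text).
VHH_FC_KDA = 44.0
SC_KDA = 66.0
JC_KDA = 20.0


def complex_mass(n_vhh_fc, with_sc=True, with_jc=False):
    """Nominal mass of a species with n VHH-Fc chains and optional SC and JC."""
    mass = n_vhh_fc * VHH_FC_KDA
    if with_sc:
        mass += SC_KDA
    if with_jc:
        mass += JC_KDA
    return mass


species = {
    "SC/VHH9-Fc": complex_mass(1),
    "SC/dimeric VHH9-Fc": complex_mass(2),
    "SC/trimeric VHH9-Fc/JC": complex_mass(3, with_jc=True),
    "assembled VHH9-sIgA": complex_mass(4, with_jc=True),
}

for name, mass in species.items():
    print(f"{name}: ~{mass:.0f} kDa")
```

The nominal sums (110, 154, 218, and 262 kDa) follow the same order as the observed bands, which is the basis for the speculative assignment.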

## Plant-Produced VHH9-sIgA Is Antigen-Binding Competent

Intimin binding by plant-produced VHH9-Fc purified either using peptide M (yielding all VHH9-Fc molecules regardless of the presence of JC and SC) or anti-FLAG antibody (yielding SC as well as secretory VHH9-Fc in complex with SC, and possibly JC) was assessed by SPR and ELISA. No loss of intimin-binding affinity was observed by SPR for peptide M-purified VHH9-Fc produced in planta compared with VHH9 monomer produced in E. coli (**Figure 6A**). Moreover, both peptide M-purified and anti-FLAG-purified VHH9-Fc bound intimin with similar half maximal effective concentrations (EC50s) in ELISAs detected with horseradish peroxidase (HRP)-conjugated anti-bovine IgA antibody (**Figure 6B**). However, no binding of peptide M-purified VHH9-Fc was observed in ELISAs detected with anti-FLAG antibody, suggesting that little SC was present in the purified material.

### Plant-Produced VHH10-sIgA Binds EHEC Strains O26:H11, O145:Hnm, O111:Hnm, and O157:H7

To determine if plant-produced chimeric VHH-sIgA antibodies bind to the seven most prevalent strains of EHEC, bacterial cells of O26:H11, O45:H2, O103:H2, O145:Hnm, O121:H19, O111:Hnm, and O157:H7 were incubated with VHH10-sIgA purified using anti-FLAG (binds the SC), then visualized using a secondary fluorescent antibody (rabbit anti-bovine-FITC) that binds the Fc and 4′,6-diamidino-2-phenylindole (DAPI) that stains bacterial cells. The confocal images showed consistent co-localization of FITC signal with strains O26:H11, O145:Hnm, O111:Hnm, and O157:H7 cells (**Figure 7**). Since the heavily glycosylated SC has been reported to interact with some bacterial strains, we were concerned that the observed co-localization could be a product of non-specific glycan-mediated binding, and not binding of the VHH to intimin. To address this, we compared binding of VHH10-sIgA with binding of VHH10-Fc expressed alone to EHEC O157:H7. Confocal images showed co-localization of VHH10-Fc with EHEC O157:H7 cells in the absence of SC and JC, suggesting that binding was likely VHH-mediated (**Figure 8**). As a negative control, EHEC cells were also treated with PBS containing 0.1% Tween-20 (PBS-T) instead of antibodies and similarly stained but did not show fluorescence under FITC-related imaging conditions (480 nm excitation and 520–540 nm detection) (**Figure 8**).

### Plant-Produced VHH10-sIgA Reduces Adherence of Three EHEC Serotypes to Epithelial Cells

Since intimin mediates the intimate attachment of EHEC to epithelial cells, we investigated if the binding of VHH10-sIgA to EHEC could neutralize the ability of bacteria to adhere to epithelial cells. HEp-2 cells were incubated with a culture of one of seven EHEC strains (O26:H11, O45:H2, O103:H2, O145:Hnm, O121:H19, O111:Hnm, and O157:H7) in the presence or absence of VHH10-sIgA, washed to remove any non-adherent bacteria, and then visualized by immunofluorescence microscopy. HEp-2 cells were visualized by fluorescent actin staining using rhodamine phalloidin (red) and EHEC cells using a donkey anti-rabbit Alexa 350-conjugated secondary antibody (shown in white). Compared to the respective positive controls of HEp-2 cells and EHEC only, the addition of VHH10-sIgA seemed to abrogate the adhesion of EHEC strains O26:H11, O111:Hnm, and O157:H7 to HEp-2 cells, while it seemed to somewhat reduce adhesion of EHEC strain O145:Hnm to HEp-2 cells (**Figure 9A**). To quantify the neutralization capacity of VHH10-sIgA, we adapted the adhesion assay for fluorometry and measured the relative fluorescence of HEp-2 cells incubated with a culture of each of the seven EHEC strains with and without VHH10-sIgA. The addition of VHH10-sIgA afforded complete protection, that is, it reduced the relative fluorescence caused by adherent bacteria for strains O26:H11, O111:Hnm, and O157:H7 to background levels, and somewhat reduced the relative fluorescence caused by adherent bacteria for strain O145:Hnm, although this effect was not statistically significant (p = 0.09 in a t-test; **Figure 9B**).
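The comparison described in the Figure 9 legend is a one-tailed, unpaired, equal-variance (homoscedastic) t-test with three biological replicates per group. The sketch below shows how such a statistic is formed; the fluorescence readings are invented placeholders, not data from this study, and the decision uses the tabulated one-tailed critical value for 4 degrees of freedom rather than an exact p-value.

```python
from math import sqrt
from statistics import mean, variance

# One-tailed critical value of Student's t for alpha = 0.05 and df = 4 (3 + 3 - 2).
T_CRIT_ONE_TAILED_DF4 = 2.132


def pooled_t_statistic(control, treated):
    """Student's t statistic for two independent samples with pooled variance."""
    n1, n2 = len(control), len(treated)
    pooled_var = ((n1 - 1) * variance(control) + (n2 - 1) * variance(treated)) / (n1 + n2 - 2)
    standard_error = sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean(control) - mean(treated)) / standard_error


# Hypothetical relative fluorescence units, three biological replicates each.
ehec_alone = [1000.0, 1100.0, 900.0]
ehec_plus_antibody = [300.0, 350.0, 250.0]

t_stat = pooled_t_statistic(ehec_alone, ehec_plus_antibody)
# One-tailed test: is adherence lower when the antibody is added?
significant = t_stat > T_CRIT_ONE_TAILED_DF4
print(f"t = {t_stat:.2f}, significant at p < 0.05: {significant}")
```

With only three replicates per group, a moderate reduction such as that seen for O145:Hnm can easily fall short of the critical value even when the means differ visibly.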

We performed a multiple sequence alignment and derived a neighbor-joining tree of Int277 across all seven strains and found that EHEC strains O157, O111, O145, and O26 grouped together based on sequence similarity, while O45, O103, and O121 were more disparate in sequence (**Supplementary Figure S2**). With the exception of O145, this is in accord with VHH10-sIgA being able to bind and neutralize O26, O111, and O157 but not bind and neutralize O45, O103, and O121.

#### DISCUSSION

# Critical Factors Involved in Chimeric VHH-sIgA Production: High Accumulation Levels and Optimal Stoichiometric Ratio of the Subunits

Chimeric VHH-sIgA is a complex molecule composed of six subunits that require assembly into a functional unit. To achieve this, three key goals should be met: high accumulation levels of individual subunits, optimal molar ratio of the subunits, and proper assembly of the subunits into the multimeric complex.

To ensure high accumulation levels, proper folding, and post-translational modifications of recombinant proteins, we targeted the proteins of interest to the secretory pathway and retrieved them to the ER with an ER-retrieval tetrapeptide signal (KDEL) (Saberianfar et al., 2015; Xu and Ng, 2015; Saberianfar and Menassa, 2017).

Several strategies have been suggested to reach the optimal ratio for assembly of the sIgA subunits such as using promoters with varying strengths in multi-cassette vectors, or even in vitro re-association (or reconstitution) of the sIgA subunits (Longet et al., 2014). In practice however, empirical determination is necessary (Virdi et al., 2016). To ensure the desired 4:1:1 stoichiometric ratio of VHH-Fc:SC:JC that we hypothesized would be optimal for assembly and accumulation of this chimeric sIgA, we used different amounts of Agrobacterium for coinfiltrations of N. benthamiana leaves, and achieved relatively high accumulation levels up to 0.34 mg/g FW (3.24% of total soluble protein, TSP).

#### Time and Subunit Assembly as Limiting Factors for sIgA Production

When each of the sIgA subunits was transiently expressed alone, the VHH-Fc and SC subunits accumulated to relatively high levels, whereas the JC accumulated to much lower levels (**Figures 1E–G**). Higher accumulation of the JC was detected after co-expression with the VHH-Fc and SC subunits (**Figures 2A,B**), and both recombinant protein accumulation levels and sIgA assembly increased over time (**Figures 2**, **4**). Collectively, this suggests that upon co-expression, the chimeric sIgA subunits assemble in vivo over time into a complex that is more stable to degradation than the nascent unassembled chains, particularly with regard to the JC. Indeed, the incorporation of the JC into the dimeric IgA complex has previously been reported to be a key limiting factor for sIgA production in planta (Westerhof et al., 2015). Our findings support this, and furthermore suggest that stabilizing the JC may be a key target for further optimization of sIgA production.

#### Plant-Produced VHH-sIgA Binds E. coli O26:H11, O145:Hnm, O111:Hnm, and O157:H7

We showed by SPR that both the plant-produced VHH9-Fc chain and the assembled VHH9-sIgA complex have the same binding affinity as monomeric VHH9 produced in E. coli, suggesting that binding is modularly mediated and retained via the VHH following Fc fusion and assembly with the SC and JC. Because VHH10 was shown to have superior binding affinity by SPR, it was chosen to be advanced toward pathogen binding and neutralization assays. We observed consistent co-localization of VHH10-sIgA with strains O26:H11, O145:Hnm, O111:Hnm, and O157:H7 cells by immunofluorescence confocal microscopy. Coverage was observed across the entirety of the cells, unlike previous reports of partial binding for rat- and chicken-produced antibodies against EHEC (Cook et al., 2007), suggesting that intimin is abundantly embedded across the entire cell surface membrane of these four strains and is accessible to VHH10-sIgA.

#### Plant-Produced VHH10-sIgA Neutralizes the Ability of EHEC Strains O26:H11, O111:Hnm, and O157:H7 to Adhere to Epithelial Cells

All EHEC strains use a highly conserved type III secretion system to enable colonization of intestinal epithelial cells. Intimate adherence mediated by intimin docking to its translocated cognate receptor is a necessary prerequisite for invasion and virulence (Dziva et al., 2004; Buttner, 2012). Given that our results indicate that VHH10-sIgA prevents intimate adherence for strains O26:H11, O111:Hnm, and O157:H7, it is tempting to speculate that this protective effect will also be observed when used in animal trials. Although VHH10-sIgA was able to bind O145:Hnm, fluorometry and confocal images of the adhesion assay suggested a compromised ability to neutralize. It is possible that VHH10-sIgA can partially bind O145:Hnm but not sufficiently to prevent intimate adherence to epithelial cells. Regarding the differential capacity of VHH10-sIgA to bind and neutralize across strains, we speculate that this may be due to sequence variability across the C-terminal 277 residues of the intimin protein. Although the transmembrane and intracellular residues are strongly conserved in native intimin, the extracellular Int277 region is highly variable and is likely shaped by selection pressures of the host immune system. Despite O145 being similar in sequence to O157, weaker binding and neutralization of O145 may be due to sequence variability at a local epitope to which VHH10 binds rather than sequence conservation of Int277 as a whole.

The finding that VHH10-sIgA offers multivalent protection against EHEC O26:H11, O111:Hnm as well as O157:H7 is notable because the vast majority of previously developed therapeutics against EHEC have focused on O157 only, despite the clinical relevance of the "Big Six" strains. The current incidences across the United States of O26, O111, and O157 are 206, 125, and 807 per 100,000 individuals, respectively (CDC, 2017). Although non-O157 strains are individually less prevalent, the collective contribution of non-O157 strains to GI illness has recently been of growing concern, particularly since surveillance data indicated a 41% increase in the average annual incidence of infection of non-O157 strains over the last 5 years across the United States. The majority of diagnostic, intervention, and awareness strategies have historically been O157-specific (Gill and Gill, 2010; CDC, 2017). O26 and O111 currently account for 25.5 and 15.5% of non-O157 EHEC infections, respectively (CDC, 2017).

In conclusion, we have designed and produced a chimeric antibody with secretory IgA functionality against three EHEC strains and demonstrated its production and assembly in N. benthamiana. Testing this antibody to verify that high mannose glycans as opposed to complex glycans usually found on SC interact with the mucus lining of the GI tract will be important toward ensuring in vivo functionality. Further work testing the efficacy of VHH10-sIgA in live animals will hopefully confirm its ability to prevent EHEC colonization and shedding as well as its utility for fast acting prevention and intervention. Because of its multivalency, VHH10-Fc may also be useful if developed as a diagnostic reagent for detecting O26, O111, O145, and O157 in food, the environment, colonized animals, and in infected individuals. Currently, there are no EHEC diagnostics available on the market that can detect both O157 and non-O157 strains despite their clinical relevance. We are optimistic that either of these directions for development will be of value in minimizing EHEC contamination of food and the environment.

#### MATERIALS AND METHODS

## Production of Recombinant EHEC O157:H7 Intimin

A DNA sequence encoding the C-terminal 277 residues of E. coli O157:H7 strain EDL933 intimin (**Supplementary Figure S1A**) was fused C-terminally to MBP and cloned into pMAL-p5X. E. coli BL21 (DE3) cells were transformed with the construct and grown overnight under IPTG induction. The next day, cells were harvested, lysed by sonication, centrifuged at 20,000 × g for 20 min, and the MBP-Int277 fusion protein was purified using amylose affinity chromatography.

cells with a donkey anti-rabbit secondary antibody (white) as well as the actin cytoskeleton of HEp-2 cells using rhodamine phalloidin (red). Shown are merged images of the red and white channels for either HEp-2 cells incubated with EHEC alone (left panel) or with EHEC and VHH10-sIgA (right panel). (B) VHH10-sIgA reduces fluorescence of O26:H11, O111:Hnm, and O157:H7 to background levels. Shown is the relative fluorescence of EHEC strains that have been immunolabeled, are adherent on HEp-2 cells, and either incubated on HEp-2 cells alone or in combination with VHH10-sIgA. As a negative control, HEp-2 cells were incubated with PBS instead of a bacterial strain or antibody. <sup>∗</sup> indicates a significant reduction of the amount of immunolabeled adherent bacteria as determined by a one-tailed unpaired homoscedastic t-test between an EHEC strain alone versus the same EHEC strain with VHH10-sIgA added (p < 0.05, N = 3 biological replicates). Error bars indicate standard errors of the means.

# Isolation of VHHs


Camelid VHHs were generated against recombinant intimin as previously described (Henry et al., 2015, 2016). Experiments involving animals were conducted using protocols approved by the National Research Council Canada Animal Care Committee and in accordance with the guidelines set out in the OMAFRA Animals for Research Act, R.S.O. 1990, c. A.22. Briefly, a male llama (Lama glama) was immunized subcutaneously with 180 µg of MBP-Int277 in a total volume of 1 ml Tris-buffered saline (50 mM Tris, 150 mM NaCl, 10% glycerol, pH 8.0) emulsified in an equal volume of complete Freund's adjuvant (Cedarlane, Burlington, ON, Canada) (day 1). The animal was boosted with the same dose of MBP-Int277 emulsified in incomplete Freund's adjuvant (Cedarlane) on days 21, 28, 35, and 42. Serum polyclonal antibody responses against MBP-Int277 were monitored using indirect ELISA and detected using HRP-conjugated goat anti-llama IgG antibody (Cedarlane, Cat. No. A160-100P). Total RNA was extracted from peripheral blood lymphocytes collected at days 35 and 49 and reverse transcribed, and expressed VHH genes were then amplified via two rounds of nested PCR and cloned into the pMED1 phagemid vector. The final size of the phage-displayed VHH library was 5 × 10<sup>6</sup> independent transformants, with an insertion rate of >95%. Library diversity was verified by DNA sequencing.

VHH-displaying phage were rescued from library-containing E. coli TG1 cells by coinfection with M13KO7 helper phage and purified by polyethylene glycol precipitation. Library phage were panned for three rounds against 20 µg of MBP-Int277 immobilized in wells of microtiter plates. Bound phage was eluted with 0.1 M triethylamine for 10 min, neutralized with 1 M Tris-HCl, pH 7.4, and amplified in exponentially growing E. coli TG1 cells for subsequent panning rounds. After the final round of panning, binding of individual phage clones was assessed by monoclonal phage ELISA and detected using HRP-conjugated rabbit anti-M13 antibody (GE Healthcare, Cat. No. 27-9421-01, Piscataway, NJ, United States).

#### Cloning and Transient Expression in N. benthamiana

The bovine Fc, JC, and SC sequences were obtained from the NCBI public database (ANN46383, NP\_786967, and NP\_776568, respectively). The VHHx-Fc, JC, and SC genes were synthesized by Bio Basic Inc. (Markham, ON, Canada), cloned into pEAQ-DEST-1 plant expression vectors (Sainsbury et al., 2009), and transformed into A. tumefaciens (EHA105). N. benthamiana plants were grown in a growth chamber at 22 °C with a 16 h photoperiod at a light density of 110 µmol m<sup>−2</sup> s<sup>−1</sup> for 7 weeks, or in a greenhouse with natural light for 5 weeks before infiltration. Plants were fertilized with water soluble N:P:K (20:8:20) at 0.25 g/l (Plant Products, Brampton, ON, Canada). Agrobacterium cultures were prepared as previously described (Saberianfar et al., 2015). Transient expressions were performed either by injection (Miletic et al., 2015) or by vacuum infiltration for small-scale or large-scale transformations, respectively. Prior to vacuum infiltration, Agrobacterium transformed with expression vectors encoding either VHH3-Fc, VHH9-Fc, VHH10-Fc, SC, JC, or p19 were sub-cultured from starter cultures and grown separately in Luria-Bertani (LB) broth at 28 °C overnight. Each of the cultures carrying a VHHx-Fc construct was then combined with cultures carrying SC, JC, and p19. Trays of N. benthamiana plants were inverted and submerged into each of these co-cultures and placed into a vacuum chamber. To enable infiltration into the leaves, a pump was used to lower the pressure of the chamber to 85 kPa for 2 min, after which the vacuum was immediately released. Plants were transferred back to the growth chamber until sampling.
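Combining the separately grown cultures into one infiltration mixture amounts to a dilution calculation. The sketch below is a minimal illustration that assumes each culture has been resuspended to a common stock OD of 2.0 and that the target ODs are those quoted in the Figure 2 legend (0.57, 0.14, 0.14, and 0.14); the stock OD, the final volume, and the helper name `mixing_volumes` are hypothetical and not taken from the protocol.

```python
# Target final ODs (A600) in the infiltration mixture, from the Figure 2 legend.
TARGET_OD = {"VHH-Fc": 0.57, "SC": 0.14, "JC": 0.14, "p19": 0.14}

# Hypothetical OD of each resuspended culture before mixing (assumption).
STOCK_OD = {"VHH-Fc": 2.0, "SC": 2.0, "JC": 2.0, "p19": 2.0}


def mixing_volumes(target_od, stock_od, final_volume_ml):
    """Volume of each culture, plus buffer, needed to reach the target ODs."""
    # C1 * V1 = C2 * V2 applied to each strain independently.
    volumes = {name: od * final_volume_ml / stock_od[name] for name, od in target_od.items()}
    buffer_ml = final_volume_ml - sum(volumes.values())
    if buffer_ml < 0:
        raise ValueError("stock cultures are too dilute for the requested ODs")
    volumes["buffer"] = buffer_ml
    return volumes


volumes = mixing_volumes(TARGET_OD, STOCK_OD, final_volume_ml=100.0)
for name, ml in volumes.items():
    print(f"{name}: {ml:.1f} ml")
```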

# Tissue Sampling and Protein Extraction

Depending on the experiment, leaf tissue was collected 4–12 dpi. Four leaf discs were collected from each biological replicate. Protein extraction and total soluble protein quantification were performed as previously described (Conley et al., 2009).

#### Recombinant Protein Quantification

Quantification of VHH-Fc, JC, and SC was performed by western blot analysis. Samples were run under reducing or non-reducing conditions. Ten micrograms of TSP was resolved using NuPAGE™ 3–8% tris-acetate protein gels (Thermo Fisher Scientific, Waltham, MA, United States) and transferred to PVDF membranes. The recombinant proteins were detected with one of the following primary antibodies: mouse anti-c-Myc monoclonal antibody (GenScript, Cat. No. A00864), mouse anti-HA monoclonal antibody (Millipore Sigma, Cat. No. H3663), or mouse anti-FLAG monoclonal antibody (Millipore Sigma, Cat. No. F3165), followed by HRP-conjugated goat anti-mouse IgG secondary antibody (Bio-Rad, Cat. No. 170-6516). Detection was performed using Enhanced Chemiluminescent detection solution (Bio-Rad Laboratories Inc., Hercules, CA, United States) and a MicroChemi 4.2 imaging system with GelCapture acquisition software (DNA Bio-Imaging Systems Ltd., Jerusalem, Israel). Recombinant proteins were quantified by image densitometry using TotalLab TL100 software (Nonlinear Dynamics, Durham, NC, United States), against known amounts of a standard protein loaded on every gel.
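Quantification against a standard loaded on the same gel can be sketched as a linear calibration followed by unit conversion. The example below is an illustration only: the band intensities are invented, the linear fit is an assumed calibration model (the densitometry software may use a different one), and the TSP content of about 10.5 mg per g fresh weight is an assumption back-calculated from the statement that 0.34 mg/g FW corresponds to 3.24% of TSP.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical standard curve: ng of tagged standard versus band intensity.
standard_ng = [10.0, 20.0, 40.0, 80.0]
standard_signal = [1500.0, 2500.0, 4500.0, 8500.0]
slope, intercept = fit_line(standard_ng, standard_signal)

# Hypothetical sample band from a lane loaded with 10 µg (10,000 ng) of TSP.
sample_signal = 3500.0
LANE_LOAD_NG = 10_000.0
sample_ng = (sample_signal - intercept) / slope
percent_tsp = 100.0 * sample_ng / LANE_LOAD_NG

# Assumed TSP content of leaf tissue in mg per g fresh weight (0.34 / 0.0324).
TSP_MG_PER_G_FW = 10.5
mg_per_g_fw = percent_tsp / 100.0 * TSP_MG_PER_G_FW
print(f"{sample_ng:.1f} ng, {percent_tsp:.2f}% TSP, {mg_per_g_fw:.4f} mg/g FW")
```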

#### ELISA

ELISAs using plant-produced VHH-sIgAs were conducted essentially as described previously (Henry et al., 2015, 2016). Wells of Nunc MaxiSorp® microtiter plates (Thermo-Fisher, Waltham, MA, United States) were coated overnight at 4 °C with 100 ng of MBP-Int277 in 35 µl of PBS, pH 7.4. The next day, wells were blocked with 200 µl of PBS containing 2% (w/v) skim milk for 1 h at 37 °C. Purified VHH-sIgAs were serially diluted in PBS containing 1% (w/v) bovine serum albumin (BSA) and 0.1% (v/v) Tween-20 and added to wells after rinsing 3× with PBS. After incubating for 2 h at room temperature, wells were washed 5× with PBS-T and 2× with PBS. Secondary and/or tertiary antibodies [monoclonal mouse anti-FLAG® M2 antibody (Sigma–Aldrich, Cat. No. F3165, St. Louis, MO, United States), HRP-conjugated polyclonal donkey anti-mouse IgG (Jackson ImmunoResearch, Cat. No. 715-035-150, West Grove, PA, United States), or HRP-conjugated polyclonal sheep anti-bovine IgA (Abcam, Cat. No. ab12755, Cambridge, United Kingdom)] were diluted 1:5000 in PBS containing 1% BSA and 0.1% Tween-20 and added sequentially to wells, washing 5× with PBS-T and 2× with PBS after each incubation. Wells were developed with 35 µl of tetramethylbenzidine substrate, stopped after 5 min with 35 µl of 1 M H<sub>2</sub>SO<sub>4</sub>, and absorbance at 450 nm was measured using a Multiskan™ FC photometer (Thermo-Fisher).
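EC50 values from serial-dilution ELISAs of this kind are commonly described with a four-parameter logistic (4PL) curve. The sketch below is an assumption about how such a value could be estimated, since the fitting procedure is not stated here; the concentrations and curve parameters are invented, and the estimator is a simple log-linear interpolation rather than a full nonlinear fit.

```python
from math import log10


def four_pl(conc, bottom, top, ec50, hill):
    """Four-parameter logistic response for a rising dose-response curve."""
    return bottom + (top - bottom) / (1.0 + (ec50 / conc) ** hill)


def interpolate_ec50(concs, responses):
    """Estimate EC50 by log-linear interpolation at the half-maximal response.

    Assumes concentrations are sorted ascending and the response rises with them.
    """
    half = (min(responses) + max(responses)) / 2.0
    points = list(zip(concs, responses))
    for (c_lo, r_lo), (c_hi, r_hi) in zip(points, points[1:]):
        if r_lo <= half <= r_hi:
            frac = (half - r_lo) / (r_hi - r_lo)
            return 10 ** (log10(c_lo) + frac * (log10(c_hi) - log10(c_lo)))
    raise ValueError("half-maximal response not bracketed by the data")


# Hypothetical ten-fold dilution series (nM) and simulated A450 readings.
concs_nm = [0.1, 1.0, 10.0, 100.0, 1000.0, 10000.0]
a450 = [four_pl(c, bottom=0.05, top=2.0, ec50=5.0, hill=1.0) for c in concs_nm]

ec50_estimate = interpolate_ec50(concs_nm, a450)
print(f"Estimated EC50: {ec50_estimate:.1f} nM")
```

Interpolation between ten-fold dilutions recovers the simulated EC50 only approximately; a finer dilution series or a nonlinear least-squares fit would narrow the estimate.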

#### Recombinant Protein Purification

Plant extracts were prepared under native conditions as described above. Purification was performed using peptide M/Agarose (Invivogen, San Diego, CA, United States, Cat. No. gelpdm-5) and anti-DYKDDDDK G1 affinity resin (GenScript, Piscataway, NJ, United States, Cat. No. L00432) according to the manufacturers' protocols.

#### Surface Plasmon Resonance (SPR)

Prior to SPR analyses, VHH monomers and MBP-Int277 were purified by size exclusion chromatography using a Superdex™ 75 10/300 GL column (GE Healthcare, Mississauga, Canada) connected to an ÄKTA FPLC protein purification system (GE Healthcare) into HBS-EP+ buffer [10 mM HEPES buffer, pH 7.4, containing 150 mM NaCl, 3 mM EDTA, and 0.05% (v/v) surfactant P20]. Approximately 700–1600 response units (RUs) of MBP-Int277 were immobilized in 10 mM acetate buffer, pH 4.5, on CM5 sensor chips using an amine coupling kit (GE Healthcare). Multi-cycle kinetic analyses were carried out on a Biacore T200 instrument (GE Healthcare) at 25 °C by injecting VHHs at concentrations ranging from 0.3 to 400 nM, at a flow rate of 30–50 µl/min and with a contact time of 300 s, and then allowing the VHHs to dissociate for 600 s. Data were analyzed using BIAevaluation software version 4.1 (GE Healthcare) and fitted to a 1:1 binding model. The MBP-Int277 surface was regenerated between injections using glycine buffer, pH 1.5.

For SPR analyses of plant-produced VHH-IgAs, approximately 2200 RUs of VHH-IgA or 100 RUs of matched VHH monomer were immobilized in 10 mM acetate buffer, pH 3.5, on CM5 Series S sensor chips using an amine coupling kit. Single-cycle kinetic analyses were carried out on a Biacore T200 instrument at 25 °C by injecting MBP-Int277 in HBS-EP+ buffer at concentrations ranging from 0.3 to 5 nM, at a flow rate of 30 µl/min and with a contact time of 300–600 s, and then allowing the VHHs to dissociate for 600 s. Data were analyzed using BIAevaluation software version 4.1 (GE Healthcare) and fitted to a 1:1 binding model. The antibody surface was regenerated between injections using glycine buffer, pH 1.5.
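The 1:1 binding model used for fitting relates the sensorgram to an association rate constant (k<sub>a</sub>), a dissociation rate constant (k<sub>d</sub>), and their ratio K<sub>D</sub> = k<sub>d</sub>/k<sub>a</sub>. The sketch below simulates the idealized response for one injection using the contact and dissociation times given above; the rate constants and R<sub>max</sub> are hypothetical values chosen for illustration, not the kinetics measured in this study, and the code is not a substitute for the instrument's fitting software.

```python
from math import exp


def equilibrium_constant(ka, kd):
    """Equilibrium dissociation constant KD (M) from ka (1/(M*s)) and kd (1/s)."""
    return kd / ka


def association_response(t, conc, ka, kd, rmax):
    """Response (RU) at time t (s) during injection of analyte at conc (M)."""
    k_obs = ka * conc + kd
    r_eq = rmax * ka * conc / k_obs  # plateau response at this concentration
    return r_eq * (1.0 - exp(-k_obs * t))


def dissociation_response(t, r0, kd):
    """Response (RU) at time t (s) after the end of injection, starting from r0."""
    return r0 * exp(-kd * t)


# Hypothetical kinetic parameters (assumptions for illustration only).
KA = 1.0e5     # 1/(M*s)
KD_RATE = 1.0e-4  # 1/s
RMAX = 100.0   # RU

kd_molar = equilibrium_constant(KA, KD_RATE)
r_end = association_response(300.0, 5.0e-9, KA, KD_RATE, RMAX)   # 300 s contact, 5 nM
r_after = dissociation_response(600.0, r_end, KD_RATE)           # 600 s dissociation
print(f"KD = {kd_molar:.1e} M, R(300 s) = {r_end:.2f} RU, R after 600 s = {r_after:.2f} RU")
```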

#### Enterohemorrhagic E. coli Binding Assays

Enterohemorrhagic Escherichia coli strains O26:H11, O45:H2, O103:H2, O145:Hnm, O121:H19, O111:Hnm, and O157:H7 were obtained from Dr. Michael Mulvey at the Public Health Agency of Canada, National Microbiology Laboratory, E. coli Unit, Enteric Diseases Program, Winnipeg, MB, Canada. EHEC strains were individually grown overnight in 5 ml of LB broth (Miller Formulation, Difco, Thermo Fisher Scientific, Ottawa, ON, Canada) at 37°C. The next day, 100 µl of the overnight culture was inoculated in 3 ml of LB broth and grown to an OD<sub>600</sub> of 0.7–0.9. Cells were harvested from 1 ml of the culture by centrifugation at 13,000 × g for 5 min and rinsed three times in PBS for 5 min each time. The bacterial pellet was then resuspended in 1 ml of 2.5% paraformaldehyde (PFA) and incubated at 37°C for 10 min with gentle agitation (350 rpm). The excess PFA was removed by centrifugation at 13,000 × g for 5 min. The pellet was then resuspended in 200 µl of PBS-T and incubated overnight at 4°C. The next day, 20 µl aliquots of the cell suspension were prepared in separate tubes, centrifuged at 13,000 × g for 5 min, and resuspended in 20 µl of the primary plant-produced antibody treatments (100 ng/µl) in PBS-T, as well as PBS-T with no antibody as control, and incubated at 37°C for 90 min with gentle agitation (350 rpm). The primary antibodies were removed by centrifugation at 13,000 × g for 5 min, followed by three washes in PBS for 5 min each time. The cells were then resuspended in 20 µl aliquots of secondary antibody, rabbit anti-bovine IgG, IgM, IgA-FITC (1:40 dilution, Thermo Fisher Scientific, Cat. No. SA1-36043), and incubated at 37°C for 1 h. The cells were washed and rinsed three times in PBS as described previously, and one final time in dH2O. To stain the bacteria, the cells were resuspended for 2 min in 20 µl aliquots of DAPI (10 mg/ml solution diluted 1:1000 in dH2O, Thermo Fisher Scientific, Cat. No. D1306), centrifuged at 13,000 × g for 5 min, and resuspended in 20 µl of dH2O. The cells were then transferred onto poly-L-lysine coated coverslips (Millipore Sigma, Cat. No. S1815) contained in a 24-well plate and centrifuged at 450 × g for 10 min. Coverslips were then dried and mounted onto glass slides with Aqua-Poly/Mount (Polyscience Inc., Warrington, PA, United States, Cat. No. 18606).
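As a quick check of the working amounts implied by this protocol, the arithmetic can be written out as below. The helper names are hypothetical; the input numbers (10 mg/ml DAPI stock diluted 1:1000, and 20 µl of antibody at 100 ng/µl) are taken from the text.

```python
def diluted_concentration(stock, dilution_factor):
    """Concentration after a 1:dilution_factor dilution (same units as stock)."""
    return stock / dilution_factor

def mass_in_volume(conc_ng_per_ul, volume_ul):
    """Mass (ng) contained in volume_ul at conc_ng_per_ul."""
    return conc_ng_per_ul * volume_ul

# DAPI stock: 10 mg/ml = 10,000 µg/ml, diluted 1:1000 in dH2O.
dapi_working_ug_per_ml = diluted_concentration(10_000, 1000)

# Primary antibody: 20 µl at 100 ng/µl per cell aliquot.
antibody_ng_per_sample = mass_in_volume(100, 20)

print(dapi_working_ug_per_ml, "µg/ml DAPI;", antibody_ng_per_sample / 1000, "µg antibody per sample")
```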

#### HEp-2 Adherence Inhibition Assay

HEp-2 cells were grown in eight-well chamber slides in Dulbecco's Modified Eagle Medium (DMEM, Life Technologies, Thermo Fisher Scientific, Toronto, ON, Canada) supplemented with 10% fetal bovine serum at 37°C in 5% CO<sub>2</sub> to ∼80% confluency. EHEC strains O26:H11, O45:H2, O103:H2, O145:Hnm, O121:H19, O111:Hnm, and O157:H7 were individually grown overnight in 5 ml of LB broth at 37°C, then subcultured into DMEM at a 1:50 dilution and incubated at 37°C in 5% CO<sub>2</sub> for 2 h. This subculture was further diluted at 1:10 in DMEM with and without 100 ng/ml of VHH10-Fc/SC/JC and then incubated with the HEp-2 cells at 37°C in 5% CO<sub>2</sub> for 3 h. The cultures were then washed with PBS to remove non-adherent bacteria and fixed using 2.5% PFA (Sigma) in PBS. Cells were then washed in PBS four times and blocked overnight in PBS containing 10% BSA and 0.1% Triton X-100. Cells were then hybridized with Alexa 647 phalloidin (Thermo Fisher Scientific, Cat. No. A22287) used to visualize actin in the HEp-2 cells and donkey anti-rabbit Alexa 350 (Thermo Fisher Scientific, Cat. No. A10039) used to visualize EHEC cells. Cells were then washed in PBS and mounted using Aqua-Poly/Mount (Polyscience Inc., Warrington, PA, United States, Cat. No. 18606). To quantify adherence inhibition by relative fluorescence, the assay was adapted by growing the HEp-2 cells in 96-well black fluorometry plates that had been coated with poly D-lysine. Relative fluorescence was measured using a Synergy2 plate reader (Biotek) using the Gen5 v1.10 software (Biotek). Relative fluorescence of the donkey anti-rabbit IgG Alexa 350 antibody (Thermo Fisher Scientific, Cat. No. A10039) used to visualize EHEC cells was measured in each well at 37°C, with 5 s intermediate shaking, excitation at 360 nm, and emission at 460 nm.

#### Confocal and Fluorescence Microscopy

To visualize binding of the VHH-sIgA to E. coli cells, FITC and DAPI sequential imaging was performed with an Olympus LSM FV 1200 or a Leica TCS SP2 CLSM. Images were acquired with a 100× oil objective lens. FITC was imaged by excitation with a 480 nm laser and detection at 520–540 nm. DAPI was imaged by excitation at 350 nm and detection at 455–465 nm. To visualize adherence to HEp-2 cells, a Leica TCS SP2 confocal microscope was used. Images were acquired with a 64× water objective lens. Alexa 647 phalloidin was imaged by excitation at 650 nm and detection at 660–680 nm. The donkey anti-rabbit Alexa 350 antibody was visualized by excitation at 350 nm and detection at 455–465 nm.

#### DATA AVAILABILITY

All datasets generated for this study are included in the manuscript and/or the **Supplementary Files**.

#### AUTHOR CONTRIBUTIONS

RM conceived the study. RS and RM designed the research. RS, KH, and AC-F performed the experiments. RS, AC-F, KH, and RM wrote the manuscript. AS assisted with the binding and adhesion experiments. ET provided feedback on experimental design and result interpretations. RS, KH, AC-F, AS, ET, and RM edited the manuscript.

#### FUNDING

This research was supported by Agriculture and Agri-Food Canada A-base project 1258 to RM.

#### ACKNOWLEDGMENTS

We thank Henk van Faassen and Greg Hussack at NRC for help with SPR experiments and Dr. Alison O'Brien from the Uniformed Services University of the Health Sciences for the goat anti-intimin polyclonal antibody. We also thank Hong Zhu and Angelo Kaldis at Agriculture and Agri-Food Canada for providing technical support, and Alex Molnar for assistance with preparation of figures.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2019.00270/full#supplementary-material

FIGURE S1 | Validation of recombinant maltose-binding protein (MBP)-EHEC O157:H7 intimin fusion protein. (A) Amino acid sequence of MBP-Int277 fusion protein. MBP sequence is underlined and TEV protease cleavage site is shown in blue. The fusion protein had a molecular mass of 73,781 Da and a theoretical pI of 6.06. (B) SDS-PAGE (4–20% gradient) stained with Coomassie Brilliant Blue. Lane 1, BSA; Lane 2, MBP-Int277. (C) Western blot of MBP-Int277 using either anti-6×His antibody (Lane 3) or anti-MBP antibody (Lane 4). (D) Size exclusion profile of MBP-Int277 on a Superdex™ 75 10/300 GL column showing monodisperse behavior. (E) Binding of polyclonal goat anti-intimin antibody to MBP-Int277 in ELISA and detected with HRP-conjugated donkey anti-goat IgG.

FIGURE S2 | Sequences for Int277 are similar across EHEC strains O157, O111, O26, and O145. (A) Multiple sequence alignment using Clustal Omega default settings of Int277 protein sequence for the seven tested EHEC strains. The alignment has been shaded to show identical residues in black and similar residues in gray. (B) Phylogenetic tree using a neighbor joining method to cluster the aligned Int277 sequences based on similarity. The cladogram shown has not been corrected for evolutionary distance and is merely meant to be representative of how the strains cluster based on similarity.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Saberianfar, Chin-Fatt, Scott, Henry, Topp and Menassa. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Recombinant Production of MFHR1, A Novel Synthetic Multitarget Complement Inhibitor, in Moss Bioreactors

*Oguz Top1,2†‡ , Juliana Parsons1‡ , Lennard L. Bohlender1 , Stefan Michelfelder3 , Phillipp Kopp1 , Christian Busch-Steenberg1 , Sebastian N. W. Hoernstein1 , Peter F. Zipfel4 , Karsten Häffner3 , Ralf Reski1,2,5† and Eva L. Decker1 \**

*Edited by: Anneli Marjut Ritala, VTT Technical Research Centre of Finland Ltd, Finland*

#### *Reviewed by:*

*Mareike Schallenberg-Rüdinger, Universität Bonn, Germany Muriel Bardor, Université de Rouen, France*

#### *\*Correspondence:*

*Eva L. Decker eva.decker@biologie.uni-freiburg.de orcid.org/0000-0002-9151-1361*

*† orcid.org/0000-0003-2820-6505 † orcid.org/0000-0002-5496-6711* 

*‡These authors have contributed equally to this work*

#### *Specialty section:*

*This article was submitted to Plant Biotechnology, a section of the journal Frontiers in Plant Science*

*Received: 28 November 2018 Accepted: 19 February 2019 Published: 20 March 2019*

#### *Citation:*

*Top O, Parsons J, Bohlender LL, Michelfelder S, Kopp P, Busch-Steenberg C, Hoernstein SNW, Zipfel PF, Häffner K, Reski R and Decker EL (2019) Recombinant Production of MFHR1, A Novel Synthetic Multitarget Complement Inhibitor, in Moss Bioreactors. Front. Plant Sci. 10:260. doi: 10.3389/fpls.2019.00260*

*1Department of Plant Biotechnology, Faculty of Biology, University of Freiburg, Freiburg, Germany, 2Spemann Graduate School of Biology and Medicine (SGBM), University of Freiburg, Freiburg, Germany, 3Faculty of Medicine, Department of General Pediatrics, Adolescent Medicine and Neonatology, Medical Center - University Freiburg, University of Freiburg, Freiburg, Germany, 4Leibniz Institute for Natural Product Research and Infection Biology, Friedrich Schiller University, Jena, Germany, 5Signalling Research Centres BIOSS and CIBSS, University of Freiburg, Freiburg, Germany*

The human complement system is an important part of the immune system responsible for lysis and elimination of invading microorganisms and apoptotic body cells. Improper activation of the system due to deficiency, mutations, or autoantibodies of complement regulators, mainly factor H (FH) and FH-related proteins (FHRs), causes severe kidney and eye diseases. However, there is no recombinant FH therapeutic available on the market. The first successful recombinant production of FH was accomplished with the moss bioreactor, *Physcomitrella patens*. Recently, a synthetic regulator, MFHR1, was designed to generate a multitarget complement inhibitor that combines the activities of FH and the FH-related protein 1 (FHR1). The potential of MFHR1 was demonstrated in a proof-of-concept study with transiently transfected insect cells. Here, we present the stable production of recombinant glyco-engineered MFHR1 in the moss bioreactor. The key features of this system are precise genome engineering *via* homologous recombination, Good Manufacturing Practice-compliant production in photobioreactors, high batch-to-batch reproducibility, and product stability. Several potential biopharmaceuticals are being produced in this system. In some cases, these are even biobetters, i.e., the recombinant proteins produced in moss have a superior quality compared to their counterparts from mammalian systems, for example moss-made aGal, which successfully passed phase I clinical trials. *Via* mass spectrometry-based analysis of moss-produced MFHR1, we now prove the correct synthesis and modification of this glycoprotein with predominantly complex-type N-glycan attachment. Moss-produced MFHR1 exhibits cofactor and decay acceleration activities comparable to FH, and its mechanism of action on multiple levels within the alternative pathway of complement activation led to a strong inhibitory activity on the whole alternative pathway, which was higher than with the physiological regulator FH.

Keywords: *Physcomitrella patens*, moss bioreactor, factor H, plant-made recombinant pharmaceuticals, synthetic complement inhibitor, alternative pathway of complement activation, aHUS, C3 glomerulopathy

#### INTRODUCTION

Biopharmaceutical production is a steadily growing field within the pharmaceutical market (International Federation of Pharmaceutical Manufacturers & Associations IFPMA, 2017). Plant-based systems can offer advantages in sectors neglected by the mainstream bacterial or mammalian systems and their current business models (Stoger et al., 2014). Impressive recent examples of fully developed plant production processes include Protalix's carrot-cell-derived Taliglucerase alfa from a stable production process, as well as the transiently expressed ZMapp antibody cocktail to fight Ebola infections (Zimran et al., 2011; Lomonossoff and D'Aoust, 2016), the announcement of the clinical phase 3 start for the seasonal quadrivalent influenza vaccine produced in *Nicotiana benthamiana*<sup>1</sup>, and the successful completion of clinical phase 1 for recombinant alpha-galactosidase A (moss-aGal) produced in *Physcomitrella patens*<sup>2</sup>.

The development of pharmaceuticals treating diseases, which are associated with malfunction of the human complement system, may be an attractive field for plant-based production. As part of the innate immune system, the complement system is a crucial defense line against invading microorganisms (Thurman and Holers, 2006). Furthermore, it is essential for tissue homeostasis by discriminating damaged or apoptotic host cells from healthy tissues and promoting their elimination (Janeway and Medzhitov, 2002; Zipfel and Skerka, 2009). The activity of more than 50 plasma proteins functioning in a cascade of enzymatic reactions, either in plasma (fluid-phase) or on cell surfaces, has to be tightly controlled by regulatory proteins. The malfunction of these complement regulators can lead to over-activation of the system with the consequence of severe diseases, especially of the kidney and eyes (Józsi and Zipfel, 2008).

The complement system can be activated by three different mechanisms, the classical, lectin, and alternative pathway. They merge at the level of C3 activation by generating variants of the so-called C3 convertase, an enzymatic complex able to cleave C3 molecules into the active forms C3b, an opsonin, and C3a, an anaphylatoxin (**Figure 1A**; Thurman and Holers, 2006). The alternative pathway of complement activation (AP) constitutively hydrolyzes C3 at low levels, generating C3(H2O), and subsequently C3(H2O)Bb, the fluid-phase C3 convertase. In the so-called amplification loop, the C3 convertase produces more C3b molecules, and additional C3 convertase (C3bBb) is formed (Zipfel and Skerka, 2009). This pathway can also be initiated by the presence of bacterial lipopolysaccharides (LPS) (Pangburn et al., 1980). The last step in the proximal part of the alternative pathway comprises the addition of another C3b molecule to the C3 convertase, thus forming the C5 convertase (C3bBbC3b). The terminal pathway, common to all three routes of complement activation, starts with the C5 convertase-catalyzed cleavage of C5 to C5a and C5b and involves the non-enzymatic assembly of a complex of C5b with the plasma proteins C6, C7, C8, and C9. This complex, the membrane attack complex (MAC) or terminal complement complex (TCC), creates a pore on cell membranes leading to the lysis of target cells (**Figure 1A**; Morgan et al., 2016).

As C3b binds to infectious microbes as well as adjacent host cells alike and marks them for lysis, host tissues have to be protected from damage with the help of complement regulatory proteins (De Córdoba and De Jorge, 2008). Factor H (FH) is the main regulator of the activation and amplification of the alternative pathway cascade. It is a single-chain plasma glycoprotein consisting of 20 globular short consensus repeat (SCR) domains (Zipfel et al., 1999) and is active in the fluid phase as well as on host cell surfaces. FH inhibits C3-convertase (C3bBb) formation by competing with FB for binding to C3b and promotes the irreversible dissociation of preformed C3bBb (decay acceleration activity) (Hourcade et al., 1999). It also acts as a cofactor for the factor I (FI)-dependent inactivation of C3b in the fluid phase, the so-called cofactor activity of FH. While these activities reside in the N-terminal SCRs 1–4 of FH (Kühn et al., 1995), the C-terminal SCRs 19–20 interact with cell surfaces *via* binding to polyanions, such as glycosaminoglycans (GAGs), thus protecting host cells from complement attack.

Factor H forms a small family with five related proteins (FHR1–5), also composed of SCR domains which share a high degree of sequence identity (Skerka et al., 2013). FHR1 regulates the terminal complement pathway by binding to C5, preventing its activation and inhibiting TCC assembly later in the cascade (Heinen et al., 2009). In addition, homo- or heterodimers of FHR1, FHR2, and FHR5 can compete with FH for binding to polyanions resulting in a decrease of FH levels with the consequence of local complement activation on host cell surfaces (Fritsche et al., 2010). Although the exact role of FHRs on complement regulation is not yet fully clarified, it is proposed that expression levels and ratios of the different FH-family members are necessary for fine-tuning of complement regulation (Józsi and Zipfel, 2008; Goicoechea de Jorge et al., 2013; Skerka et al., 2013).

Mutations in FH, mainly in the carboxy-terminus of the protein can lead to an ineffective local regulation of the complement system on host cells causing damage of tissues, especially on endothelia, and lead to microangiopathic hemolytic anemia and acute renal failure known as atypical hemolytic uremic syndrome (aHUS) (Józsi et al., 2005). Autoantibodies against FH or FH deficiency or mutations may cause an over-activation of the complement cascade and uncontrolled cleavage of C3, followed by a depletion of plasma C3 and accumulation of C3-cleavage products on the glomerular basement membrane of the kidney. These depositions are typical in C3 glomerulopathies (C3G) and lead to renal failure (Pickering et al., 2002; Noris and Remuzzi, 2015). Age-related macular degeneration (AMD), the major cause of irreversible loss of central vision, especially in the elderly population, is also linked to genetic variants of complement components, among others FH (Fritsche et al., 2016; Geerlings et al., 2017).

Treatment options for complement-associated renal diseases are limited. FH-substitution *via* plasmapheresis was shown to restore normal complement activity in aHUS and C3G patients (Cataland and Wu, 2014). The use of Eculizumab, a monoclonal antibody inhibiting C5 activation and one of the most expensive pharmaceuticals worldwide, has significantly improved the clinical treatment of aHUS and PNH patients (Wong and Kavanagh, 2015). However, Eculizumab could not prevent the activation of C5 sufficiently for every patient (Harder et al., 2017). Moreover, Eculizumab is not effective in many patients suffering from C3G because it does not act on C3 level, thus does not prevent the accumulation of C3 cleavage products (Bomback et al., 2012). For these patients, the use of the physiological regulator FH will be beneficial, as it already acts on the level of C3 activation and inhibits over-activation of the system locally on host cells. In addition, the side effects of systemic inhibition treatment, e.g., a higher risk of infections (Fridkis-Hareli et al., 2011), will be avoided.

<sup>1</sup> http://medicago.com

<sup>2</sup> https://www.greenovation.com

namely FHR1<sub>1-2</sub>, FH<sub>1-4</sub> and FH<sub>19-20</sub>, without artificial linkers and with an 8x histidine tag at the C-terminus (His).

Recombinant FH was already successfully produced in *P. patens*. Moss-produced FH (mossFH) showed full *in vitro* complement regulatory activity, and it efficiently blocked AP activation and hemolysis induced by sera from aHUS patients (Büttner-Mainik et al., 2011; Michelfelder et al., 2017). Moreover, in a preclinical study, it decreased the pathological deposition of C3 cleavage products and increased the levels of plasma C3 in a murine C3G model (Michelfelder et al., 2017). Recombinant proteins modulating the complement system on multiple levels of the activation cascade might be the key to the development of new therapeutics to treat complement-related disorders. This strategy aims at the generation of smaller proteins with higher activity, implying that lower amounts of protein would be necessary to achieve the desired therapeutic effect. This is desirable, not only from the production point of view but also because of higher convenience for the patient. Recently, a novel synthetic multitarget regulator, MFHR1, was designed to combine the terminal pathway-regulating and dimerization domains of FHR1 with the functionally relevant C3-regulating and surface-binding domains of FH (**Figure 1B**; Michelfelder et al., 2018). After demonstrating the potential of MFHR1 as a biopharmaceutical in a proof-of-concept study using protein transiently expressed in insect cells, we now aimed to establish a stable production process for MFHR1 in the moss (*P. patens*) bioreactor. The moss *P. patens* is an important model species in basic research as well as a biotechnology production platform with outstanding features including a fully sequenced genome, efficient homologous recombination-based genome-engineering, a dominant haploid gametophytic stage in its life cycle, and Good Manufacturing Practice (GMP)-compliant production in moss photobioreactors (Rensing et al., 2008; Decker and Reski, 2012; Decker et al., 2014; Reski et al., 2015; Lang et al., 2018; Wiedemann et al., 2018). At the end of 2017, a milestone was reached when Greenovation Biotech GmbH successfully completed the phase I clinical study for the first moss-produced drug candidate, moss-aGal, against Fabry disease (Reski et al., 2018). Moreover, mossFH is currently being tested in preclinical trials, where it successfully reduced C3 deposits in an FH-deficient mouse model (Häffner et al., 2017; Michelfelder et al., 2017).

In this work, we report the successful stable production of the novel synthetic multitarget complement inhibitor MFHR1, a potentially promising pharmaceutical product, in the GMP-compliant production platform *P. patens*. Moss-produced MFHR1 was fully characterized by mass spectrometry. It retains the regulatory activity from both originating proteins, FH and FHR1, and displays a higher inhibitory activity *in vitro* on the whole alternative pathway than FH, thus being a promising future biopharmaceutical product to cure complement-associated diseases.

## RESULTS

#### Transgenic Plant Generation

For the production of MFHR1, the ∆*xt/ft* moss parental line was used, a double knockout for the α1,3 fucosyltransferase and the β1,2 xylosyltransferase genes, which has been generated by gene targeting *via* homologous recombination and has been used previously for the production of FH (Michelfelder et al., 2017). This plant generates N-glycans without α1,3-attached fucose or β1,2-attached xylose (Koprivova et al., 2004; Huether et al., 2005; Michelfelder et al., 2017). MFHR1 expression was driven by the PpActin5 promoter (Weise et al., 2006) and targeting to the secretory pathway for its proper posttranslational modification was achieved by using the aspartic proteinase signal peptide of PpAP1 (Schaaf et al., 2004, 2005).

Plants surviving the selection procedure were directly screened by PCR for the presence of the transgene in the moss genome (**Supplementary Figure 1**) and *via* a sandwich ELISA for the production of MFHR1, from extracts of plant material grown on solid medium. Plants with a positive signal were transferred to liquid culture and screened again for productivity *via* ELISA (**Supplementary Table 1**). The two best-producing clones, P1 and P5, were selected for photobioreactor cultivation. Line P5 showed a slower biomass increase (**Supplementary Figure 2**). Therefore, according to its wild-type-like growth behavior and the overall level of MFHR1 production, P1 was chosen for further experiments.

#### Establishment of a Production and Purification Process for mossMFHR1

In order to accumulate biomass for the purification of MFHR1, the process in the photobioreactor was executed as a batch without harvesting material for 6 days (**Figure 2A**). From this time point onward, 2 L of suspension were harvested every day and replaced by fresh medium. The amount of biomass harvested in this 9-day process attained 190 g fresh weight (FW), and the growth index was 53.4 (GI = (biomass<sub>final</sub> − biomass<sub>initial</sub>)/biomass<sub>initial</sub>). Under these conditions, the concentration of MFHR1 in the plant material reached 100 μg MFHR1/g FW and was constant until day 8, when it started to decrease (**Figure 2B**). Previous experiments have shown that the concentration of the protein of interest decreased further with the time of culture (**Supplementary Figure 2**). In the culture conditions used, the addition of 5 μM naphthaleneacetic acid (NAA), a synthetic auxin, increased intracellular MFHR1 concentrations until the whole plant material was harvested 1 day later. All in all, more than 17 mg MFHR1 accumulated in the 5 L bioreactor within 9 days.
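The growth index and yield figures above can be related with a short calculation. The sketch below rests on two assumptions that are not stated in the text: that the 190 g FW enters the growth index formula as the final biomass, and that the titer of 100 μg/g FW applies to all harvested material. Under the second assumption the result is an upper estimate, which is consistent with the reported "more than 17 mg" given that the titer declined after day 8. Function names are hypothetical.

```python
def growth_index(biomass_initial, biomass_final):
    """GI = (biomass_final - biomass_initial) / biomass_initial."""
    if biomass_initial <= 0:
        raise ValueError("initial biomass must be positive")
    return (biomass_final - biomass_initial) / biomass_initial

def initial_biomass(biomass_final, gi):
    """Invert the GI formula to recover the starting biomass."""
    return biomass_final / (gi + 1.0)

def product_mg(biomass_g_fw, titer_ug_per_g_fw):
    """Total product (mg) for a given biomass and a constant titer."""
    return biomass_g_fw * titer_ug_per_g_fw / 1000.0

# Values from the text: 190 g FW harvested, GI = 53.4, 100 µg MFHR1 per g FW.
start_g = initial_biomass(190.0, 53.4)   # implied inoculum, assuming 190 g is the final biomass
upper_mg = product_mg(190.0, 100.0)      # upper estimate at constant titer

print(f"implied starting biomass: {start_g:.2f} g FW")
print(f"upper estimate of MFHR1: {upper_mg:.1f} mg")
```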

His-tagged MFHR1 was extracted from 6- to 9-day-old plant material and purified *via* Ni-NTA chromatography. As measured *via* ELISA, the protein of interest eluted at an imidazole concentration above 250 mM, between fractions 6 and 16 (**Supplementary Figure 3**). Additionally, Coomassie staining and Western blotting confirmed the signal at the expected size of 58 kDa (**Figures 3A,B**). Moreover, fractions 10–16 displayed suitable purity of the protein of interest (**Figure 3A**). A small amount of degraded protein was observed. The elution fractions 10–16 were collected, dialyzed against DPBS, concentrated *via* membrane ultrafiltration, and used for activity tests. Concentrated elution fractions 10–16 from homogenates of the parental plant *Δxt/ft* were used as negative control for any activity of moss-endogenous proteins. In both cases, fractions up to fraction 9 were discarded due to host-cell protein contamination as assessed by SDS-PAGE and Coomassie staining (**Figure 3C**). As MFHR1 is a novel fusion protein, a protein standard for quantification *via* ELISA had to be produced. After purification *via* Ni-NTA and ultrafiltration, the concentration of the protein of interest was assessed by band densitometry on Coomassie-stained gels, using BSA as a standard for protein amount (**Figure 3C**, E10–16). The concentration of mossMFHR1 used for activity assays relies on ELISA-quantifications using this standard.

#### Structural Analyses Prove Correct Synthesis and Complex-Type N-Glycosylation of mossMFHR1

Purified mossMFHR1 was identified with a sequence coverage of 91% by mass-spectrometric analysis (**Figure 4**). We can assume the correct cleavage of the signal peptide, as this peptide was not detected in the MS analysis, and the N-terminus of the mature protein could be confirmed. The carboxy-terminal His-tag remains undetected due to the low mass/charge ratio of this peptide, but nevertheless it is present, as MFHR1 was purified *via* Ni-NTA affinity chromatography. Furthermore, we analyzed the N-glycosylation of mossMFHR1. As previously reported for both human-derived FH and mossFH (Fenaille et al., 2007; Michelfelder et al., 2017), the putative glycosylation site NGS, originating from SCR4 of FH and located in SCR6 of MFHR1, was found to be deamidated to DGS, and therefore not glycosylated. In addition, the glycosylation site NIS located in SCR2 and derived from FHR1 was occupied with glycans in only 35% of the peptides, while 65% were not modified (**Figure 4B**, **Supplementary Figure 4**). This is in agreement with the situation of FHR1 derived from human plasma where two different isoforms, FHR1α and FHR1β, 37 and 42 kDa, with one and two attached carbohydrate chains respectively, occur (Skerka and Zipfel, 2008). Approximately 86% of the glycosylated MFHR1 displayed the complex type N-glycan GnGn. Structures exhibiting terminal mannoses were below 1%. Peptides with N-glycans bearing β1,2-linked xylose and α1,3-linked fucose were not detected. The lack of β1,2-linked xylose and α1,3-linked fucose residues had been shown before for the parental line *Δxt/ft* (Michelfelder et al., 2017). N-glycans decorated with Lewis A structures, the trisaccharide Fucα1–4(Galβ1–3)GlcNAc (Parsons et al., 2012), were detected in up to 14% of the product and on only one antenna of the sugar tree (GnAF). To sum up, moss-produced MFHR1 was complete and complex-type N-glycosylated.
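The percentages in this section refer to two different totals: site occupancy is relative to all MFHR1 peptides, whereas the glycoform shares are relative to the glycosylated fraction only. A minimal sketch of how the two combine is given below; it uses the rounded values from the text (35% occupancy, about 86% GnGn, up to 14% GnAF), omits the terminal-mannose structures reported as below 1%, and is therefore only approximate. The function name is hypothetical.

```python
def glycoform_fractions(occupancy, composition):
    """Express each glycoform as a fraction of total protein.

    occupancy: fraction of molecules glycosylated at the site (0-1).
    composition: mapping glycoform -> fraction among glycosylated molecules.
    """
    fractions = {name: occupancy * share for name, share in composition.items()}
    fractions["unglycosylated"] = 1.0 - occupancy
    return fractions

# Rounded values from the text; terminal-mannose structures (< 1%) are omitted.
asn108 = glycoform_fractions(0.35, {"GnGn": 0.86, "GnAF": 0.14})

for name, fraction in asn108.items():
    print(f"{name}: {100 * fraction:.1f}% of total MFHR1")
```

With these inputs, roughly 30% of all MFHR1 molecules would carry GnGn and about 5% GnAF at Asn108, with the remaining 65% unmodified at this site.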

#### MossMFHR1 Displays Cofactor Activity as Early Regulatory Function in Complement Activation

The first activity of FH within the regulation of the alternative complement pathway is its role as cofactor for FI-mediated cleavage of C3b in the fluid phase (cofactor activity of FH; Zipfel et al., 1999). To assess whether this important function is retained in the synthetic protein MFHR1, we compared mossMFHR1 to FH for their cofactor activity in a fluid-phase assay. We used mossFH because mossFH and plasmaFH exhibited comparable cofactor activities before (Michelfelder et al., 2017). Moreover, we preferred mossFH for its homogeneity. In contrast to the recombinant mossFH, plasma-purified FH derives from several donors, thus being a mixture of FH variants with polymorphisms in some positions. The cleavage of C3b is indicated by a decrease of the C3b α'-chain and occurrence of the cleavage products α'68-, α'46-, and α'43. Cofactor activity was analyzed by SDS-PAGE and Coomassie staining (**Figure 5A**), and the intensity of α'-chain bands was quantified by densitometry (**Figure 5B**). In a dose-dependent manner, mossMFHR1 and mossFH contributed to the FI-mediated degradation of the C3b α'-chain into α'68-, α'46-, and α'43 kDa fragments while the β-chain remained unchanged. Hence, MFHR1 showed comparable cofactor activity to FH.

FIGURE 4 | Sequence and domain structure of MFHR1 and mass spectrometric sequence and N-glycosylation analyses. (A) Mature MFHR1 sequence (black) fused to the AP1-signal peptide (grey). The amino acid positions relating to the mature protein are given. Negative numbers refer to the signal peptide. The aminoterminal portion of MFHR1 is composed of the first two SCRs of FHR1 (green), followed by the catalytically active SCRs 1-4 from FH (yellow). The C-terminus includes the surface-binding SCRs 19-20 from FH (blue) followed by a His-tag. Peptides identified by mass spectrometry are shown in bold and the putative N-glycosylation site NIS (Asn108) and the deamidated site DGS (Asn323 ➔ Asp323), are underlined, corresponding mass spectrometric detected tryptic peptides are shown in italics. (B) Elution profiles (extracted ion chromatogram, EIC) of the tryptic peptide (102LQNNENNISCVER113) which flanks the glycosylation site NIS (Asn108) with and without N-glycosylation. Peak identity was confirmed by m/z-value and charge state on MS1- and by reporter ions on MS2-level (see Supplementary Figure 4 for further information). Peak quantification revealed an N-glycan occupancy at Asn108 of 35%, which in total consisted of 86% GnGn, nearly 14% GnAF structures and < 1% of N-glycans with terminal mannoses, all of them lacking plant specific core xylose and fucose. For the deamidated site (Asn323 ➔ Asp323) no N-glycosylation was detectable. Gn: N-acetylglucosamine, A: galactose, F: Fucose – referring to the terminal sugar residues.

#### MossMFHR1 Accelerates the Decay of the C3 Convertase and Inhibits Further Alternative Complement Pathway

The so-called decay acceleration activity of mossMFHR1 was compared with plasma-derived FH (plasmaFH) and mossFH *in vitro*. C3b together with FB generates a complex that is cleaved by FD to generate the active C3 convertase C3bBb, which promotes the alternative pathway, intensifying the reaction *via* the amplification loop (**Figure 1A**). On the contrary, FH reduces the level of active C3 convertase C3bBb by removing the Bb portion, thus inactivating the AP (Harris et al., 2005). As expected, mossMFHR1 accelerated the decay of the C3 convertase C3bBb and even performed significantly better than plasmaFH and mossFH at higher concentrations of 10 and 25 nM (*p* < 0.05) (**Figure 6**).
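The Figure 6 legend describes how the ELISA readout is scaled: the OD450 of preformed C3bBb without regulator is set to 100%, and the OD450 of C3b + FB without FD is set to 0%. A minimal sketch of that two-point normalization follows; the function name and the OD values are hypothetical and serve only to show the calculation.

```python
def percent_convertase(od, od_zero, od_full):
    """Scale an OD450 reading between the 0% and 100% controls.

    od_zero: OD450 of C3b + FB without FD or regulator (defined as 0%).
    od_full: OD450 of preformed C3bBb without regulator (defined as 100%).
    """
    if od_full == od_zero:
        raise ValueError("0% and 100% controls must differ")
    return 100.0 * (od - od_zero) / (od_full - od_zero)

# Hypothetical readings, for illustration only.
OD_ZERO, OD_FULL = 0.10, 1.30
for od in (1.30, 0.70, 0.10):
    print(f"OD450 {od:.2f} -> {percent_convertase(od, OD_ZERO, OD_FULL):.0f}% C3bBb remaining")
```

A lower percentage at a given regulator concentration corresponds to stronger decay acceleration.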

#### MossMFHR1 Inhibits the AP Activation in Human Blood

To check the additional regulatory capacity of MFHR1 which derives from FHR1 domains, we measured the ability of mossMFHR1 to inhibit the formation of C5b-9, i.e., the terminal complement complex (TCC), after activation of the cascade with bacterial lipopolysaccharide (LPS) in human blood serum. In this assay, the regulatory activities on all AP levels are evaluated together. Increasing amounts of mossMFHR1 regulated LPS-induced AP activation and inhibited TCC formation efficiently and in most concentrations not significantly different from the therapeutic antibody Eculizumab (**Figure 7**); only at 32 nM did the blocking antibody perform better than mossMFHR1 (*p* < 0.05). MossMFHR1 regulated the complement cascade and the formation of the TCC much more efficiently than plasmaFH. The complete inhibition of the AP was achieved by 56 nM MFHR1; approximately 22 times less MFHR1 was needed when compared to plasmaFH.
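To put the 56 nM figure into mass terms, a simple unit conversion can be used. The sketch assumes the expected size of 58 kDa reported in the purification section as the molecular mass of MFHR1 (ignoring the partial glycosylation) and assumes that the 22-fold difference refers to molar concentrations; both are assumptions made for illustration, and the function names are hypothetical.

```python
def nM_to_ug_per_ml(conc_nM, mass_kDa):
    """Convert a molar concentration (nM) to a mass concentration (µg/ml).

    1 nM of a 1 kDa molecule equals 1 µg/L, i.e. 0.001 µg/ml.
    """
    return conc_nM * mass_kDa / 1000.0

def scaled_concentration(conc_nM, fold):
    """Concentration that is `fold` times higher than conc_nM."""
    return conc_nM * fold

mfhr1_ug_per_ml = nM_to_ug_per_ml(56.0, 58.0)            # assumes 58 kDa for MFHR1
implied_plasma_fh_nM = scaled_concentration(56.0, 22.0)  # assumes a molar 22-fold ratio

print(f"56 nM MFHR1 is about {mfhr1_ug_per_ml:.2f} µg/ml")
print(f"implied plasmaFH concentration: about {implied_plasma_fh_nM:.0f} nM")
```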

#### DISCUSSION

The complement system is a tightly regulated cascade that efficiently clears infectious agents and modified body cells and protects host tissues. Dysregulation of this delicately balanced cascade leads to infection and severe diseases such as atypical hemolytic uremic syndrome (aHUS), paroxysmal nocturnal hemoglobinuria (PNH), C3 glomerulopathy (C3G), age-related macular degeneration, and microangiopathic hemolytic anemia (Józsi et al., 2005). Complement factor H is a potent regulator of the alternative pathway and FH-replacement therapy could restore normal complement activity in the sera of aHUS and C3G patients (Michelfelder et al., 2017). Due to its high molecular mass and biochemical complexity, recombinant production of FH is far from trivial, thus no recombinant product is under clinical evaluation (Yang et al., 2018). Recently, mossFH showed full *in vitro* complement regulatory activity and decreased deposition of C3 cleavage products in preclinical studies in a murine model of C3G, indicating improved kidney function (Michelfelder et al., 2017). The therapy with Eculizumab, which binds C5 and inhibits its activation, improved disease progression, survival, and quality of life in patients with aHUS and PNH but showed partial response only in some patients suffering from C3G, most likely due to the uncontrolled over-activation of the cascade in steps upstream of Eculizumab's point of action (Bomback et al., 2012). Thus, recombinant proteins modulating the complement system on multiple levels might be key to the development of new therapeutics to treat complement-related disorders. MFHR1, a novel synthetic multitarget complement inhibitor designed to combine the terminal pathway regulatory activity of FHR1 with the regulatory domains for C3 level control and binding affinity to host cell surfaces of FH, was synthesized in a proof-of-concept study with transiently transfected insect cells and shown to be a promising multilevel complement regulator (Michelfelder et al., 2018). This study demonstrated that MFHR1 can achieve the same complement regulatory activity as plasmaFH at much lower concentrations. Because FH is a very abundant serum protein, with concentrations of approximately 500 μg/ml, the amount of protein needed per patient is important for the feasibility of the treatment.

Baculovirus-infected insect cells are widely used for heterologous protein expression for two reasons: the availability of two strong promoters, which drive target protein expression to high levels (Scholz and Suppmann, 2017), and lepidopteran cell lines that can perform post-translational modifications and grow to high cell densities (Berger et al., 2004; Van Oers et al., 2015). On the other hand, there are several drawbacks, including demanding and expensive culture conditions (Gecchele et al., 2015), batch-to-batch inconsistency due to the lot-to-lot variability of media (Chan and Reid, 2016), the need for large volumes of virus to scale up protein production, a glycosylation pattern (paucimannosidic N-glycans) that differs from that of humans (Shi and Jarvis, 2007), potential cell lysis, and degradation of the protein of interest caused by viral infection (Broadway, 2012). Therefore, after demonstrating the bioactivity of insect-cell expressed MFHR1 protein, we now aimed to establish a stable production process for MFHR1 in the moss bioreactor. This system proved its validity as a biopharmaceutical production host by successfully completing the phase I clinical trial for moss-aGal (Reski et al., 2018). Moreover, it was the first system to succeed in the stable production of active recombinant human FH (Büttner-Mainik et al., 2011; Michelfelder et al., 2017).

FIGURE 6 | MossMFHR1 displays decay acceleration activity on the C3 convertase C3bBb. Like mossFH and plasmaFH, mossMFHR1 accelerated the decay of Bb from the C3 convertase C3bBb complex in a dose-dependent manner. The OD at 450 nm (OD450) of preformed C3bBb (C3b + FB + FD) without any regulator was set to 100%. The OD450 of C3bB (only C3b + FB, without any FD or regulator) was set to 0%. Purified extract of the parental moss strain *Δxt/ft* was included as a control, in equal amounts to the volume of MFHR1 used for the highest concentration. Data represent mean values ± SD from three replicates and were analyzed with two-way ANOVA followed by Bonferroni test. Some error bars are shorter than the symbol. Analyses were done with GraphPad Prism software version 8.0 for Windows.

FIGURE 7 | MossMFHR1 inhibits the AP activity after induction of the alternative pathway in human serum. The C5b-9 complex formation was measured by ELISA after LPS-induced AP activation. Data represent mean values, bars show the range of duplicates, and data were analyzed with two-way ANOVA followed by Bonferroni test. The serum without regulators was set to 100% and heat-inactivated serum to 0%. Activity in wells not coated with LPS (-LPS) was used as a reference. Purified extract of the parental moss strain *Δxt/ft* was included as a negative control, in equal amounts to the volume of MFHR1 used. Analyses were done with GraphPad Prism software version 8.0 for Windows.

After the characterization of transgenic moss lines in terms of growth performance and MFHR1 production levels in bioreactors, the line P1 was found to be most promising, with production levels of 100 μg MFHR1/g FW and a 54-fold increase in biomass within 9 days of cultivation in a 5-L bioreactor.

The endogenous actin 5 promoter, together with the 5' UTR including its intron, is a valuable tool for the production of pharmaceutically interesting proteins in *P. patens* (Baur et al., 2005a,b; Weise et al., 2006, 2007; Büttner-Mainik et al., 2011; Michelfelder et al., 2017). For proteins expressed under the control of this promoter, a temporary increase in the production of the protein of interest in *P. patens* could be achieved by the addition of auxin (data not shown). This effect was also observed here, as the addition of the synthetic auxin NAA increased MFHR1 levels. Auxin also stimulated the ACT7-GUS reporter in *Arabidopsis thaliana* (Kandasamy et al., 2001). To attain an increase in recombinant protein yield, however, auxin concentration and duration of exposure had to be adapted to each protein of interest and to the culture conditions.

According to mass spectrometry, mossMFHR1 is completely and correctly synthesized with a moss-derived signal peptide (Schaaf et al., 2004) and properly processed, as demonstrated before for moss-produced recombinant human VEGF and FH (Gitzinger et al., 2009; Büttner-Mainik et al., 2011; Michelfelder et al., 2017). In addition, complex-type N-glycosylation of mossMFHR1 was confirmed by MS-based glycopeptide analysis. The NGS site originating from FH SCR4 was not glycosylated, but deamidated to DGS, as observed before for both human plasma-purified and moss-produced FH (Fenaille et al., 2007; Michelfelder et al., 2017). The other MFHR1 N-glycan site, located in FHR1-derived SCR2, was occupied in approximately 35% of the molecules, predominantly with GnGn N-glycans. A small portion harbored monoantennary Lewis A structures. The moss α1,4 fucosyltransferase and β1,3 galactosyltransferase responsible for this modification have already been identified and could be eliminated by knocking out the responsible genes, as was already achieved for asialo-EPO production in moss (Parsons et al., 2012). N-glycans with terminal mannoses accounted for less than 1% and, as expected from the glyco-engineered *Δxt/ft* parental line, putatively immunogenic β1,2-linked xylose and α1,3-linked fucose never appeared. The fact that the majority of mossMFHR1 molecules were not glycosylated is in full agreement with the situation described for human FHR1 (Skerka and Zipfel, 2008). Considering the different aspects of glycosylation, the mossMFHR1 presented here provides optimal conditions for a safe, non-immunogenic pharmaceutical.

Having proven its structural integrity, the different proposed functions of mossMFHR1 were subsequently tested. Because MFHR1 is designed to act on multiple levels by combining the relevant functional domains of FHR1 and FH, we tested both the function of the FH domains, which control the early stages of AP activation (proximal part) in the fluid phase, and that of the FHR1 domains, which block TCC formation on target structures in the terminal part. FH inhibits the activation of the complement cascade on two levels: as a cofactor for FI-mediated cleavage of C3b and by accelerating the decay of the crucial enzyme of complement activation, the C3 convertase. Both functions were executed by mossMFHR1 in a similar or even more efficient way compared to the full-length mossFH (Michelfelder et al., 2017) or human plasma-purified FH. MossMFHR1 accelerated the decay of the C3 convertase like plasmaFH and mossFH at low concentrations and performed significantly better than both FH versions at higher concentrations. MFHR1 activity for inhibiting the terminal pathway was measured as a decrease of C5b-9 (TCC) formation and compared to the activity of the C5-inhibiting antibody Eculizumab as well as plasmaFH. MossMFHR1 inhibited TCC formation in a similar manner to Eculizumab and 22 times more efficiently than plasmaFH. These results are in accordance with those obtained with the insect cell-derived MFHR1 (Michelfelder et al., 2018) and a similar fusion protein published recently (Yang et al., 2018). The improved activity compared to native FH might be explained not only by the combined activity at different levels of the alternative pathway but also by the formation of MFHR1 homodimers deriving from the FHR1 dimerization domain included in MFHR1 (Michelfelder et al., 2018; Yang et al., 2018).

These encouraging results, the structural integrity and increased activity combined with a smaller molecule size, strongly recommend the initiation of large-scale cultivation for proving the mossMFHR1's *in vivo* therapeutic activity. In summary, mossMFHR1 showed its high potential to become a new and indispensable drug for patients with complement-associated disorders.

#### MATERIALS AND METHODS

#### Construct Generation

For efficient production of MFHR1, the vector pAct5-MFHR1 was used, where the expression of the MFHR1 coding sequence is driven by the 5′ region, including the 5′ intron, of the PpActin5 gene (Weise et al., 2006). The CaMV 35S terminator preceded by an 8x His-tag and a SalI restriction site was amplified from the pRT\_VEGF121 plasmid (Koprivova et al., 2004). In addition, this plasmid includes an hpt cassette for selection of transformed plants with hygromycin (**Supplementary Figure 5**).

The targeting of the mossMFHR1 to the secretory pathway for proper posttranslational modifications was driven by the aspartic proteinase signal peptide from *P. patens*, PpAP1 (Schaaf et al., 2004, 2005). Its coding sequence was amplified from the plasmid pFH (Büttner-Mainik et al., 2011) with the primers XhoI\_MFHR1\_F (5′-TCTCGAGATGGGGGCATCGAGG-3′) and AP\_MFHR1\_R (5′-ATCACAAAATGTTGCTTCTGCCTCAGCTAAGGC-3′). The coding sequence for the MFHR1 was amplified from pFastbac-MFHR1 (Michelfelder et al., 2018) with the primers AP\_MFHR1\_F (5′-GAGGCAGAAGCAACATTTTGTGATTTTCCAAAAATAAACC-3′) and SalI\_MFHR1\_R (5′-TGTCGACTCTTTTTGCACAAGTTGGATACTC-3′). The signal peptide and the MFHR1 coding sequence were assembled *via* two-template PCR with the primers XhoI\_MFHR1\_F and SalI\_MFHR1\_R and cloned into the expression vector *via* XhoI and SalI restriction sites.

All amplifications were performed by Phusion DNA polymerase (Thermo Fisher Scientific, Waltham, MA, USA)-based PCR.
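The two-template assembly works because the internal primers AP\_MFHR1\_R and AP\_MFHR1\_F share a complementary stretch, so the two PCR products can anneal to each other before the outer primers amplify the fusion. As a minimal sketch (the function names are illustrative and not part of the published protocol), the overlap between the two internal primers listed above can be checked as follows:

```python
# Illustrative check of the overlap shared by the two internal primers.
# Sequences are copied from the text; helper names are hypothetical.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (5'->3')."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def overlap_length(upstream_reverse: str, downstream_forward: str) -> int:
    """Length of the longest stretch where the 3' end of the upstream product
    (top strand, i.e. reverse complement of its reverse primer) matches the
    5' end of the downstream forward primer."""
    top_strand_end = reverse_complement(upstream_reverse)
    limit = min(len(top_strand_end), len(downstream_forward))
    for n in range(limit, 0, -1):
        if top_strand_end.endswith(downstream_forward[:n]):
            return n
    return 0


AP_MFHR1_R = "ATCACAAAATGTTGCTTCTGCCTCAGCTAAGGC"
AP_MFHR1_F = "GAGGCAGAAGCAACATTTTGTGATTTTCCAAAAATAAACC"

OVERLAP = overlap_length(AP_MFHR1_R, AP_MFHR1_F)
print(f"Shared overlap: {OVERLAP} nt")
```

For the two primers given in the text, this reports a 24-nt shared region, which spans the junction between the signal peptide and the MFHR1 coding sequence.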

#### Plant Material and Cell Culture

*Physcomitrella patens* (Hedw.) Bruch & Schimp. was cultivated as described previously (Frank et al., 2005). The MFHR1-producing moss lines were obtained by stable transformation of the Δ*xt/ft* moss line, in which the α1,3 fucosyltransferase and the β1,2 xylosyltransferase genes have been disrupted *via* homologous recombination. Transfection was performed with 40 μg of linearized MFHR1 construct per reaction as described before (Decker et al., 2015). Subsequently, selection of stable transformants on solidified Knop medium containing 25 mg/L hygromycin was performed as described previously (Decker et al., 2015).

Plants surviving the selection with hygromycin were screened for the presence of the transgene in the moss genome by PCR, as described before (Parsons et al., 2012), using the primers MFHR1\_fwd (5′-GAAGGATGGTCACCAACACC-3′) and MFHR1\_rev (5′-CATTGGTCCATCCATCTGTG-3′). The production of the protein of interest was tested *via* ELISA. For this purpose, approximately 10 mg of plant material was picked from colonies growing on Knop solid medium and transferred to 2-ml tubes containing one tungsten carbide bead (Qiagen, Hilden, Germany) and one glass bead (Roth, Karlsruhe, Germany), each 3 mm in diameter. After the addition of 120 μl extraction buffer (408 mM NaCl, 60 mM Na2HPO4, 10.56 mM KH2PO4, 60 mM EDTA, pH 7.4, and 1% protease inhibitor [P9599, Sigma-Aldrich]), plant material was homogenized for 1 min in a mixer mill (MM 400, Retsch, Haan, Germany) at 30 Hz and subsequently sonicated for 10 min in an ultrasonic bath (Bandelin Sonorex RK52). Extracts were analyzed *via* ELISA as described before (Büttner-Mainik et al., 2011), but using the anti-FH antibody GAU 018-03-02 (Thermo Fisher Scientific) as capture antibody at 1:2,000 in coating buffer, and using FH as standard protein (Calbiochem, San Diego, CA, USA) for the screening of MFHR1-producing plants. Under these conditions, the ELISA has a linear response from 1 to 64 ng/ml. For the quantification of MFHR1 produced by plants grown in liquid media, the protein extraction was performed as described above, but the plant material was first vacuum-filtrated, frozen, and disrupted while frozen before the addition of the buffer.

For production of MFHR1, the P1 line was cultivated at pH 4.5 in a 5 L photobioreactor as described previously (Hohe and Reski, 2005), with continuous light at 350 μE intensity. Daily from day 6 to day 8, 2 L of suspension was harvested and replaced by fresh medium. Plant material was vacuum-filtrated, shock-frozen in liquid N2, and stored at −80°C until further processing. At day 8, after harvesting, 250 μl 100 mM naphthaleneacetic acid (NAA, Sigma) was added to the bioreactor to reach a final concentration of 5 μM NAA. At day 9, the whole culture was harvested.
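The stated NAA dose can be verified with a simple dilution calculation. The sketch below assumes that the culture had been refilled to the full 5 L working volume before NAA addition (an assumption; the text only gives the bioreactor size), and the function name is illustrative:

```python
# Dilution check for the NAA addition; assumes a 5 L culture volume.

def final_concentration_um(stock_mm: float, added_ul: float, culture_l: float) -> float:
    """Final concentration in µM after adding `added_ul` µl of a `stock_mm` mM
    stock to `culture_l` litres (the added volume itself is neglected)."""
    amount_umol = stock_mm * added_ul / 1000.0  # mM x µl = nmol; /1000 gives µmol
    return amount_umol / culture_l              # µmol per litre = µM


NAA_FINAL_UM = final_concentration_um(stock_mm=100.0, added_ul=250.0, culture_l=5.0)
print(f"Final NAA concentration: {NAA_FINAL_UM:.1f} µM")
```

Under this assumption, 250 μl of 100 mM stock corresponds to 25 μmol NAA, which in 5 L gives the 5 μM stated above.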

#### Purification of Recombinant MFHR1

Frozen plant material was resuspended in binding buffer (0.75 M NaCl, 75 mM Na2HPO4, 10 mM imidazole, 1% protease inhibitor, pH 8.0) at a plant material:buffer ratio of 1:4, and homogenized for 10 min with an Ultra-Turrax at 10,000 rpm on ice. After centrifugation, the supernatant was filtered through 1 μm polyethersulfone (PES) filters (Whatman, GE Healthcare UK Limited, Buckinghamshire, UK) and subsequently through 0.22 μm PES filters (Roth).

Chromatography was performed using an ÄKTA system (GE Healthcare, Uppsala, Sweden) at a flow rate of 1 ml/min. The filtered cellular extract was applied to a HisTrap FF column (GE Healthcare) with a column volume (CV) of 1 ml. After washing with 30 CV binding buffer supplemented with 3% buffer B (500 mM NaCl; 500 mM imidazole; 100 mM Na2HPO4; pH 8.0), elution was performed in a gradient of 3–100% of buffer B in 9 CV and recovered in 0.5 ml fractions. Fractions containing the protein of interest with a negligible amount of contaminant proteins were pooled (fractions 10–16), dialyzed against DPBS (Gibco® by Life Technologies, Darmstadt, Germany) in two Slide-A-Lyzer® MINI Dialysis Devices, 20 K MWCO, and concentrated using Vivaspin membrane filter devices (Sartorius AG, Goettingen, Germany) with a 10 kDa MWCO. Subsequently, the product was aliquoted, shock-frozen in liquid N2, and stored at −80°C until further analysis.
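Because the wash and gradient steps are specified as percentages of buffer B, it can be helpful to convert them into actual imidazole concentrations. The following sketch assumes ideal linear mixing of binding buffer (10 mM imidazole) and buffer B (500 mM imidazole); the helper name is hypothetical:

```python
# Imidazole concentration at a given percentage of buffer B,
# assuming ideal linear mixing of the two buffers described in the text.

IMIDAZOLE_BINDING_MM = 10.0   # binding buffer
IMIDAZOLE_BUFFER_B_MM = 500.0  # buffer B


def imidazole_mm(percent_b: float) -> float:
    """Imidazole concentration (mM) of a binding buffer / buffer B mixture."""
    if not 0.0 <= percent_b <= 100.0:
        raise ValueError("percent_b must be between 0 and 100")
    fraction_b = percent_b / 100.0
    return IMIDAZOLE_BINDING_MM * (1.0 - fraction_b) + IMIDAZOLE_BUFFER_B_MM * fraction_b


WASH_MM = imidazole_mm(3.0)    # wash step at 3% buffer B
END_MM = imidazole_mm(100.0)   # end of the elution gradient
print(f"Wash: {WASH_MM:.1f} mM imidazole; gradient end: {END_MM:.0f} mM imidazole")
```

With these assumptions, the 3% buffer B wash corresponds to roughly 25 mM imidazole, and the gradient then rises linearly to 500 mM over 9 CV.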

#### Protein Quantification

The concentration of mossMFHR1 used for activity assays was measured *via* ELISA, using the same antibodies and protocol as described above. As standard for the ELISA, a batch of purified mossMFHR1 was used whose concentration had been determined *via* band densitometry (Quantity One, Bio-Rad, Munich, Germany) after SDS–PAGE (Ready Gel Tris-HCl, Bio-Rad) and Coomassie staining. The sandwich ELISA using this MFHR1 standard protein has a linear response between 1 and 64 ng/ml. Electrophoretic separation of proteins was carried out in 10% SDS–polyacrylamide gels (Ready Gel Tris-HCl; Bio-Rad) at 120 V. Subsequently, gels were stained with PageBlue™ Protein Staining Solution (Thermo Fisher Scientific). For Western blot analysis, the SDS–PAGE gel was blotted onto polyvinylidene fluoride (PVDF) membranes (Immobilon-P; Millipore, Bedford, MA, USA) in a Trans-Blot SD Semi-Dry Electrophoretic Cell (Bio-Rad) for 1 h at 1 mA/cm² of membrane. Immunoblotting was performed using mouse anti-His antibodies (MAB050, R&D Systems, Minneapolis, MN, USA) as primary antibody and sheep anti-mouse antibodies coupled to peroxidase (NA931, Amersham ECL™, GE Healthcare) as secondary antibody, at 1:500 and 1:10,000 dilution, respectively, followed by chemiluminescence development (ECL™ Advance Western Blotting Detection Kit, GE Healthcare) following the manufacturer's instructions.

#### Glycopeptide Analysis

Glycopeptide analysis was performed on samples purified *via* Ni-NTA. These were mixed 1:1 with 2x sample loading buffer (Bio-Rad) with 50 mM DTT and incubated for 5 min at 95°C; after cooling down to room temperature, the samples were S-alkylated. Electrophoretic separation and Coomassie staining were performed as described above, and the bands corresponding to mossMFHR1 (monitored by a parallel Western blot) were cut out and digested with trypsin overnight. Tryptic peptides were extracted first with 100% acetonitrile (ACN) and then with 5% formic acid. The glycopeptide enrichment procedure was modified after Kolarich et al. (2012) using HILIC HyperSep™ Tips (Thermo Fisher Scientific). Gel extracts were dried in a vacuum concentrator and dissolved in 100 μl HILIC binding buffer (85% acetonitrile, 15 mM ammonium acetate, pH 3.5). Each HILIC tip was equilibrated with 20 μl binding buffer. About 100 μl sample was loaded by pipetting up and down 40 times. The flow-through was collected and dried in a vacuum concentrator. The loaded HILIC tip was washed with 20 μl binding buffer by pipetting up and down 20 times. Glycopeptides were eluted in 20 μl 15 mM ammonium acetate (pH 3.5) and dried by vacuum concentration. The dried flow-through fraction was dissolved in 100 μl 0.1% formic acid and desalted using C18 StageTips (Thermo Fisher Scientific). C18 tips were equilibrated successively with 100 μl 0.1% formic acid, then with 100 μl 80% ACN with 0.1% formic acid, and finally again with 100 μl 0.1% formic acid. The dissolved flow-through was loaded, washed with 100 μl 0.1% formic acid, and eluted in 100 μl ACN with 0.1% formic acid. The elution fraction was dried in a vacuum concentrator. Both the HILIC and the C18 eluates were dissolved in 0.1% formic acid and measured using the UltiMate 3000 RSLCnano system (Dionex LC Packings/Thermo Fisher Scientific, Dreieich, Germany) coupled online to a QExactive Plus instrument (Thermo Fisher Scientific, Bremen, Germany).

For the UHPLC system, a C18 precolumn (Ø 0.3 mm, length 5 mm; PepMap, Thermo Fisher Scientific) and an Acclaim® PepMap analytical column (ID 75 μm, length 500 mm, particle size 2 μm, pore size 100 Å; Dionex LC Packings/Thermo Fisher Scientific) were used. Washing and pre-concentration of samples took place on the C18 precolumn with 0.1% formic acid (solvent A) for 5 min before the peptides entered the analytical column. With a flow rate of 250 nl/min, peptide separation was performed applying a 45-min gradient of 4–40% solvent B (0.1% formic acid/86% acetonitrile) in 30 min and 40–95% solvent B in 5 min. After each gradient, the analytical column was washed with 95% solvent B for 5 min and re-equilibrated for 15 min with 4% solvent B. MS/MS analyses were performed on multiply charged peptide ions. To automatically switch between MS (max. of 1 × 10 ions) and MS/MS, the instrument was operated in the data-dependent mode. After each MS scan, a maximum of 12 precursors were selected for MS/MS scans using HCD with a normalized collision energy of 35%. The mass range for MS was m/z = 375–1,700 and the resolution was set to 70,000. MS parameters were as follows: spray voltage 1.5 kV and ion transfer tube temperature 200°C. Raw data analysis was performed using Mascot Distiller V2.5.1.0 (Matrix Science, USA), and the peak lists were searched with Mascot V2.6.0 against an in-house database containing all *P. patens* V1.6 protein models (Zimmer et al., 2013) as well as the MFHR1 sequence.

For database searching, the following parameters were used: peptide mass tolerance: 5 ppm; MS/MS mass tolerance: 0.02 Da; enzyme: trypsin with a maximum of two missed cleavages; variable modifications: Gln → pyroGlu (N-term. Q) −17.026549 Da, oxidation (M), carbamidomethyl (C) +57.021464 Da, and hydroxyproline (P) +15.994915 Da. Quantitation of peptides was done with Xcalibur Qual Browser V2.2.44 (Thermo Fisher Scientific) employing manual peak area integration of extracted ion chromatograms. For all glycopeptides identified, the presence of the GlcNAc-specific reporter ions (Halim et al., 2014) was inspected in the corresponding MS/MS spectra to confirm the presence of glycan structures. The following m/z values were expected: LQNNENNISCVER [M + 2H]<sup>2+</sup>: 795.3704, LQNNENNISCVER\_GnGn [M + 3H]<sup>3+</sup>: 963.4081, LQNNENNISCVER\_GnAF [M + 3H]<sup>3+</sup>: 1066.1116.
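The expected m/z values listed above can be reproduced from standard monoisotopic masses. The sketch below is illustrative only and rests on two assumptions that are not spelled out in the text: the cysteine is carbamidomethylated, and the glycan compositions are HexNAc4Hex3 for GnGn and HexNAc4Hex4Fuc1 for GnAF (one antenna carrying a Lewis A epitope). All names in the code are hypothetical.

```python
# Reproduce the expected glycopeptide m/z values from monoisotopic masses.
# Assumed: carbamidomethylated Cys; GnGn = HexNAc4Hex3; GnAF = HexNAc4Hex4Fuc1.

RESIDUE_MASS = {  # monoisotopic residue masses in Da
    "L": 113.08406, "I": 113.08406, "Q": 128.05858, "N": 114.04293,
    "E": 129.04259, "S": 87.03203, "C": 103.00919, "V": 99.06841,
    "R": 156.10111,
}
CARBAMIDOMETHYL = 57.021464
WATER = 18.010565
PROTON = 1.007276
HEXNAC, HEX, FUC = 203.079373, 162.052824, 146.057909

GLYCAN_MASS = {
    "GnGn": 4 * HEXNAC + 3 * HEX,
    "GnAF": 4 * HEXNAC + 4 * HEX + FUC,
}


def peptide_mass(sequence: str) -> float:
    """Neutral monoisotopic mass of a peptide with carbamidomethylated Cys."""
    mass = WATER
    for residue in sequence:
        mass += RESIDUE_MASS[residue]
        if residue == "C":
            mass += CARBAMIDOMETHYL
    return mass


def mz(neutral_mass: float, charge: int) -> float:
    """m/z of the [M + zH]z+ ion."""
    return (neutral_mass + charge * PROTON) / charge


def ppm_window(mz_value: float, ppm: float) -> float:
    """Absolute half-width (in m/z units) of a ppm tolerance window."""
    return mz_value * ppm / 1e6


PEPTIDE = "LQNNENNISCVER"
MZ_BARE = mz(peptide_mass(PEPTIDE), 2)
MZ_GNGN = mz(peptide_mass(PEPTIDE) + GLYCAN_MASS["GnGn"], 3)
MZ_GNAF = mz(peptide_mass(PEPTIDE) + GLYCAN_MASS["GnAF"], 3)

for label, value in [("bare", MZ_BARE), ("GnGn", MZ_GNGN), ("GnAF", MZ_GNAF)]:
    print(f"{label}: m/z {value:.4f} (5 ppm window: +/- {ppm_window(value, 5):.4f})")
```

Under these assumptions the calculated values agree with the three listed m/z values to well within the 5 ppm precursor tolerance used for the database search.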

#### Cofactor Activity

The ability of mossMFHR1 and plasmaFH to act as cofactors for the proteolytic degradation of C3b was compared in a fluid phase cofactor assay modified from Michelfelder et al. (2017). Briefly, in a 20 μl reaction, increasing doses of mossMFHR1 or plasmaFH, and the corresponding maximal volume of purified extract from the parental plant Δ*xt/ft* as control, were incubated with 2 μg C3b and 500 ng complement factor I (FI) (CompTech, Texas, USA) in PBS at 37°C for 30 min. The reaction was stopped by the addition of sample loading buffer (Bio-Rad) under reducing conditions (50 mM DTT, NuPAGE™, Thermo Fisher Scientific). The FH- and FI-catalyzed proteolytic cleavage of C3b into iC3b was analyzed by visualizing the α-chain cleavage fragments α'68 and α'43 by SDS–PAGE in 10% SDS–polyacrylamide gels and Coomassie staining. The remaining intact α-chain was quantified by band densitometry (Quantity One, Bio-Rad).

#### Decay Acceleration Activity

The ability of mossMFHR1 to displace the fragment Bb from the preformed C3 convertase complex C3bBb was measured by ELISA as described previously (Michelfelder et al., 2017). Purified C3b (CompTech) was immobilized on MaxiSorp plates, and 400 ng Factor B and 25 ng Factor D (CompTech) in phosphate buffer (containing 2 mM NiCl2, 25 mM NaCl, and 4% BSA) were added to the wells and incubated for 2 h at 37°C to generate the C3bBb complex. Various doses of MFHR1, mossFH, plasmaFH, or Δ*xt/ft* extract were added and incubated for 30 min at 37°C. Afterward, intact C3bBb complexes were detected with an FB-specific antibody (1:2,000; Merck, Darmstadt, Germany), followed by HRP-conjugated rabbit anti-goat antibody (1:5,000; Dako, Hamburg, Germany). The OD at 450 nm (OD450) was obtained after incubation with TMB substrate. The OD450 of preformed C3bBb (C3b + FB + FD) without any regulator was set to 100%. The OD450 of C3bB (only C3b + FB, without any FD or regulator) was set to 0%.
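The two reference wells define a linear scale on which every sample reading is expressed; the same two-point normalization applies to the serum-based AP assay, where heat-inactivated serum and serum without regulators serve as the 0% and 100% references. A minimal sketch of this normalization follows; the function name and the example OD450 readings are hypothetical and not taken from the study:

```python
# Two-point normalization of ELISA readings to the 0% and 100% reference wells.
# The example OD450 values below are invented for illustration only.

def percent_of_control(od_sample: float, od_zero: float, od_full: float) -> float:
    """Express a reading as a percentage between the 0% and 100% references."""
    if od_full == od_zero:
        raise ValueError("0% and 100% references must differ")
    return 100.0 * (od_sample - od_zero) / (od_full - od_zero)


OD_ZERO = 0.10   # hypothetical reading of the 0% reference (no FD, no regulator)
OD_FULL = 1.10   # hypothetical reading of the 100% reference (no regulator)
REMAINING = percent_of_control(0.60, OD_ZERO, OD_FULL)
print(f"Remaining C3bBb: {REMAINING:.1f}% of the unregulated control")
```

Lower percentages indicate stronger decay acceleration, since less intact C3bBb remains on the plate.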

#### Determination of AP-Activity in Human Serum

The ability of mossMFHR1 to inhibit the AP activity in normal human serum (NHS) was determined by measuring the inhibition of terminal complement complex formation *via* ELISA as previously described (Michelfelder et al., 2018). In this ELISA-based method, the amount of C5b-9 formed is measured during incubation of serum in wells pre-coated with lipopolysaccharide (LPS) under AP-specific conditions. The amount of bound C5b-9, detected *via* specific antibodies, is directly proportional to the activity of the AP. Extracts from the parental plant Δ*xt/ft* were used as negative controls. The OD450 obtained for samples with heat-inactivated serum instead of NHS was set to 0% and that of samples without any addition of regulators was set to 100% of C5b-9 formation.

#### Statistical Analysis

Analyses were done with the GraphPad Prism software version 8.0 for Windows (GraphPad software, San Diego, California, USA; www.graphpad.com).

#### DATA AVAILABILITY

All datasets generated for this study are included in the manuscript and/or the **Supplementary Files**.

#### AUTHOR CONTRIBUTIONS

OT and JP purified the protein and performed the activity tests. JP carried out the moss cultures in the bioreactor. LB and SH performed the mass spectrometric analysis. PK cloned the expression vector and transformed *P. patens* with it. CB-S screened the putative producing lines. SM cloned the coding sequence and set up the activity test protocols. OT, JP, KH, PZ, RR and ED designed the study and wrote the manuscript.

#### FUNDING

This work was supported by the Excellence Initiative of the German Federal and State Governments (GSC-4 to OT, EXC 294 to RR); and a grant from the German Federal Ministry of Education and Research (BMBF 0313852C to RR). RR and KH received research funding from Greenovation Biotech GmbH.

#### ACKNOWLEDGMENTS

We thank Dagmar Krischke, Astrid Wäldin, Christina Jaeger, Natalia Ruiz Molina, and Melanie Heck for their support of this work, Bettina Warscheid for the possibility to use the mass spectrometer, Anne Katrin Prowse for language editing, and Greenovation for the Δ*xt/ft* moss line.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2019.00260/full#supplementary-material

#### REFERENCES

oxonium ion fragmentation profiles in LC–MS/MS of glycopeptides. *J. Proteome Res.* 13, 6024–6032. doi: 10.1021/pr500898r

of plant gene structures and functions. *BMC Genomics* 14:498. doi: 10.1186/1471-2164-14-498


**Conflict of Interest Statement:** The authors are inventors of patents and patent applications related to the production of recombinant proteins in *P. patens*. RR is an inventor of the moss bioreactor and a founder of Greenovation Biotech. He currently serves as advisory board member of this company.

*Copyright © 2019 Top, Parsons, Bohlender, Michelfelder, Kopp, Busch-Steenberg, Hoernstein, Zipfel, Häffner, Reski and Decker. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Colicins and Salmocins – New Classes of Plant-Made Non-antibiotic Food Antibacterials

Simone Hahn-Löbmann<sup>1</sup>, Anett Stephan<sup>1</sup>, Steve Schulz<sup>1</sup>, Tobias Schneider<sup>1</sup>, Anton Shaverskyi<sup>1</sup>†, Daniel Tusé<sup>2</sup>, Anatoli Giritch<sup>1</sup>\* and Yuri Gleba<sup>1</sup>

<sup>1</sup> NOMAD Bioscience GmbH, Halle, Germany, <sup>2</sup> DT/Consulting Group, Sacramento, CA, United States

#### Edited by:

Suvi Tuulikki Häkkinen, VTT Technical Research Centre of Finland Ltd., Finland

#### Reviewed by:

Stephen John Streatfield, Center for Molecular Biotechnology (FHG), United States

Markus Sack, RWTH Aachen University, Germany

#### \*Correspondence:

Anatoli Giritch giritch@nomadbioscience.com

#### †Present address:

Anton Shaverskyi, Institute of Clinical Chemistry, Inflammation Research Group, Hannover Medical School, Hanover, Germany

#### Specialty section:

This article was submitted to Plant Biotechnology, a section of the journal Frontiers in Plant Science

Received: 15 January 2019 Accepted: 22 March 2019 Published: 09 April 2019

#### Citation:

Hahn-Löbmann S, Stephan A, Schulz S, Schneider T, Shaverskyi A, Tusé D, Giritch A and Gleba Y (2019) Colicins and Salmocins – New Classes of Plant-Made Non-antibiotic Food Antibacterials. Front. Plant Sci. 10:437. doi: 10.3389/fpls.2019.00437

Recently, several plant-made recombinant proteins received favorable regulatory review as food antibacterials in the United States through the Generally Recognized As Safe (GRAS) regulatory procedure, and applications for others are pending. These food antimicrobials, along with approved biopharmaceuticals and vaccines, represent new classes of products manufactured in green plants as production hosts. We present results of new research and development and summarize regulatory, economic and business aspects of the antibacterial proteins colicins and salmocins as new food processing aids.

Keywords: antimicrobials, bacteriocin, colicin, salmocin, plant-based expression system, GRAS, food safety

#### INTRODUCTION

Since the early and mid-twentieth century, antibiotics have been amongst the most impactful pharmaceuticals for maintaining public health. However, the broad and indiscriminate use of these medicines has caused the evolution of multi-drug resistant (MDR) bacteria that are increasingly insensitive to multiple antibiotic classes, including so-called antibiotics of last resort, such as carbapenems, colistin, and third- and fourth-generation cephalosporins. The threat of MDR pathogens is fully recognized by the World Health Organisation (WHO) and governments worldwide, but coherent actions for integrated management of MDR pathogens are still lacking (Tacconelli et al., 2018). Most antibiotics on the market today are generic drugs and are inexpensive, and pharmaceutical companies have few incentives to develop new classes of antimicrobials. The pipeline of antimicrobials currently in clinical trials consists predominantly of modifications of previously discovered classes and does not offer novel modes of action; thus, it is fully expected that the pathogens will evolve to become resistant. The pathogens most difficult to control are Gram-negative bacteria such as Campylobacter, Pseudomonas, Escherichia, and Salmonella, as those pathogens have developed resistance to most or all existing antibiotic classes. Novel non-antibiotic antibacterials are one approach to solving the MDR problem and are thus urgently needed.

Many major health threats, including the Gram-negative bacteria Salmonella enterica, Escherichia coli, and Campylobacter jejuni, and the Gram-positive bacteria Listeria monocytogenes and Clostridium perfringens, are food-borne pathogens. Food-related bacterial outbreaks are occurring with increasing frequency and severity. The problem is exacerbated by the globalization of food manufacturing processes whereby the food is produced and transported from different continents and mixed or blended before use, thus amplifying potential pathogen spread. For example, a hamburger bought at a fast-food restaurant normally contains meat from over 100 different animals, meaning that meat from one infected animal may infect hundreds of customers<sup>1</sup>. Driven by customer demands for so-called 'organic' food production practices, many farmers and companies try to reduce the environmental impact of their operations by avoiding the use of chemicals and antibiotics in the process of animal rearing, plant production and food preparation, and instead use traditional methods of husbandry and agriculture, such as use of animal dung as a fertilizer, or keeping animals and plants in close proximity. These practices may introduce additional risk of bacterial contamination not only to domestic animals but also to vegetables grown in nearby fields that may become exposed to contaminated irrigation water and run-off<sup>2</sup>. Bacteria such as Escherichia, Salmonella, and Listeria are very promiscuous and can survive, and even multiply, in plants, despite the fact that their normal hosts are animals. It is symptomatic that during the last decade, more food-related outbreaks have resulted from the consumption of infected plants or plant sprouts than from the animals that are the main reservoir of the pathogenic bacteria<sup>3</sup>.

Several research teams have searched for antibiotic alternatives, and in particular, attempted development of non-antibiotic antibacterial proteins derived from bacteria (E. coli colicins and colicin-like molecules) and bacteriophages (endolysins or "lysins") for control of bacterial pathogens. Escherichia colicins and colicin-like molecules derived from other Gram-negative bacteria are surprisingly easily and well expressed in plants, are fully functional and are up to 10<sup>6</sup> times more potent than antibiotics on a molar basis (Schneider et al., 2018). Due to their nature, however, these molecules are narrowly specific, and cocktails of these proteins are typically needed for good control of all pathovars of a bacterial species. Antibacterial proteins are being developed as new medicines, as antibacterials for food, or both. Nomad Bioscience GmbH (Halle, Germany) and its subsidiary Nomads UAB (Vilnius, Lithuania) are at the forefront of these research efforts, with an early emphasis on the food antimicrobials market. In particular, using the GRAS (Generally Recognized As Safe) regulatory process in the United States, Nomad has already obtained favorable regulatory review and marketing allowance from the United States Food and Drug Administration (FDA) for its Escherichia-derived antibacterial proteins, colicins, for use in food. The company has also submitted GRAS notices to FDA for its salmocins, colicin-like proteins derived from Salmonella. Similarly, Nomads, Lithuania, has used the GRAS process to confirm marketing allowance of its Clostridium phage lysins.

We summarize herein results of new research and development for two classes of antibacterial proteins, colicins and salmocins, that are being developed by Nomad for the food industry as food processing aids. Our discussion includes perspective on key commercialization aspects of these product candidates, including industrial manufacturing in green plants, the quality attributes of these proteins including antibacterial activity in vitro and on food matrices, the pathway for regulatory marketing allowance of these products, and the breadth of potential market applications. Current challenges to the commercial adoption of these products are also discussed.

<sup>2</sup>https://www.cdc.gov/ecoli/2018/o157h7-11-18/index.html

#### MATERIALS AND METHODS

#### Bacterial Strains and Growth Conditions

Escherichia coli DH10B and STEC as well as S. enterica ssp. enterica cells were cultivated at 37°C in LB medium [lysogeny broth (Bertani, 1951)]. L. monocytogenes cells were cultivated in BHI (Brain heart infusion broth, #X916 purchased from Carl Roth GmbH, Karlsruhe, Germany) medium at 37°C and Agrobacterium tumefaciens ICF320 (Bendandi et al., 2010) cells were cultivated at 28°C in LBS medium [modified LB medium containing 1% soya peptone (Duchefa, Haarlem, Netherlands)].

#### Plasmid Constructs

Constructs used were described in Schulz et al. (2015) or Schneider et al. (2018).

#### Plant Material and Transient or Transgenic Bacteriocin Expression

Nicotiana benthamiana WT was grown and transfected with Agrobacterium for transient expression as described in Schulz et al. (2015). The generation of bacteriocin-transgenic N. benthamiana was published in Schulz et al. (2015) and Schneider et al. (2018). Methods for EtOH-induction of transgenic plants were described in Werner et al. (2011).

#### Protein Analysis

Plant leaf material was ground in liquid nitrogen, and total soluble protein (TSP) extracts were prepared with 5 volumes of buffer (50 mM HEPES pH 7.0, 10 mM K acetate, 5 mM Mg acetate, 10% (v/v) glycerol, 0.05% (v/v) Tween-20, 300 mM NaCl). Unless stated otherwise, the protein concentration of TSP extracts was determined by Bradford assay using Bio-Rad Protein Assay (Bio-Rad Laboratories, GmbH, Munich, Germany) with BSA (Sigma-Aldrich, Co., St. Louis, MO, United States) as a standard. The antimicrobial protein concentration in TSP extracts was determined semi-quantitatively by comparing different amounts of TSP extract with known amounts of BSA on Coomassie-stained SDS/PAGE gels. Protocols for protein purification and purity analysis are described in Stephan et al. (2017) and Schneider et al. (2018).

#### Bacteriocin Antimicrobial Activity Determinations

Semi-quantitative and quantitative determinations of antimicrobial bacteriocin activity, by a spot-on-lawn soft agar overlay assay or by enumeration of viable counts via dilution plating from liquid cultures, were done as described in Schulz et al. (2015).
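The arithmetic behind viable-count enumeration by dilution plating can be sketched as follows. This is a minimal illustration of the standard back-calculation, not the protocol of Schulz et al. (2015); the function name, the colony count, the dilution and the plated volume are all hypothetical values chosen for the example.

```python
def cfu_per_ml(colonies: int, dilution_factor: float, plated_volume_ml: float) -> float:
    """Back-calculate viable counts (cfu/mL) of the undiluted culture.

    colonies          -- colonies counted on one plate
    dilution_factor   -- total dilution as a fraction (1e-4 = four 10-fold steps)
    plated_volume_ml  -- volume of the diluted sample spread on the plate
    """
    if colonies < 0 or dilution_factor <= 0 or plated_volume_ml <= 0:
        raise ValueError("counts must be >= 0; dilution and volume must be > 0")
    return colonies / (dilution_factor * plated_volume_ml)


# Hypothetical plate: 42 colonies from 0.1 mL of a 10^-4 dilution.
example = cfu_per_ml(colonies=42, dilution_factor=1e-4, plated_volume_ml=0.1)
print(f"{example:.2e} cfu/mL")
```

In practice counts from replicate plates and from more than one countable dilution would be averaged before reporting.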

<sup>1</sup>www.washingtonpost.com/news/wonk/wp/2015/08/05/there-are-a-lot-more-cows-in-a-single-hamburger-than-you-realize/?noredirect=on&utm\_term=.defc863946a4

<sup>3</sup>www.cdc.gov/foodsafety/outbreaks/index.html

#### Reduction of Bacterial Populations on Different Food Matrices

Protocols for E. coli contamination of beef trims (prior to or without grinding) and lamb loin, with subsequent colicin treatment, were similar to those in Schulz et al. (2015), whereas protocols for contamination of chicken meat, egg, tuna, and beef meat with S. enterica and subsequent salmocin treatment were essentially as described in Schneider et al. (2018).

#### RESULTS

#### Colicin Biology

Colicins are antimicrobial proteins produced by certain strains of E. coli for control of other strains of the same or related species. Colicin genes are carried on colicinogenic plasmids and are part of colicin operons, which also include genes for immunity proteins and lysis proteins. Immunity proteins protect colicin-producing cells against the cytotoxic activity of accumulated colicin; the immunity protein gene is expressed constitutively. The lysis protein is expressed by read-through of the colicin gene stop codon; once it has accumulated to a critical level, the lysis protein destroys the colicin-producing cell, releasing colicin into the environment (Cascales et al., 2007; Kleanthous, 2010; Kim et al., 2014).

Mechanisms of colicin antimicrobial action are summarized in **Figure 1**. To enter target cells, colicin proteins first bind to outer membrane cell surface receptors (FhuA, OmpF, BtuB, etc.); the translocation across the cell membrane is operated by innate cell translocation machinery (either Tol or Ton transport systems) that is recruited by the colicin translocation domain. Colicins exert three types of cytotoxic activities. Colicins with nuclease (DNase and RNase) activities (e.g., colicins E2-E9) enzymatically degrade DNA or RNA of the target cell (**Figure 1A**). Pore-forming colicins or porins (e.g., colicin Ia, Ib, K, and U) impair the integrity of cell membranes resulting in cell death due to cell membrane depolarization (**Figure 1B**). Inhibitors of cell wall biosynthesis exert their bacteriolytic effect via enzymatic degradation of undecaprenyl phosphate-linked peptidoglycan (murein) precursors (**Figure 1C**). In E. coli, this last group is represented only by colicin M. All colicins have a three-domain structure with the N-terminal translocation domain responsible for the transport of the protein across the cell membrane and periplasmic space; the central receptor-binding domain responsible for binding to the outer membrane cell surface receptor; and the C-terminal cytotoxic domain responsible for exerting the killing effect on the target. There is one exception to this convention, namely, the mechanism of translocation of colicin N is not yet clear (Jakes, 2014).

Most pathogenic species of Gram-negative bacteria employ bacteriocins evolutionarily similar to colicins; those are referred to as colicin-like molecules and are given names usually derived from the name of the genus. Apart from Escherichia colicins and Pseudomonas pyocins, other colicin-like proteins are much less studied (Cascales et al., 2007; Riley, 2009), and the molecular structure and design of some bacteriocins, for example, Pseudomonas pyocins, are more diverse (Barreteau et al., 2009; Ghequire and De Mot, 2014; Paškevičius et al., 2017).

There is a growing number of publications dealing with chimaeric bacteriocins engineered to contain domains derived from different proteins and naturally occurring bacteriocins (e.g., colicin Ia; Qiu et al., 2003; Gupta et al., 2013; Behrens et al., 2017). Naturally occurring bacteriocins are synthesized in bacteria that are commensal in the human intestinal tract. As such, they are benign and have not been associated with adverse effects. This feature of natural bacteriocins allows for their treatment as GRAS substances during regulatory review. Hybrid or chimaeric molecules are not discussed here because their record of safety is not yet "generally recognized" and as such they are unlikely to initially qualify for review via the GRAS process.

Bacteriocins are proteins and are thus fundamentally different from commonly used small-molecule antibiotics; those differences include much higher molecular size, higher molar activity, limited bioavailability, narrow specificity, and different mechanisms of action. Consequently, they cannot be used as simple replacements for antibiotics, and initial indications may be limited to their topical use against known pathogenic species. At the same time, being novel antibacterials, medicinal bacteriocins (e.g., Brown et al., 2012) could command much higher prices in those new indications once they are proven safe and effective (**Table 1**).

#### Plant-Made Colicins

We selected sequences of 23 (almost all) colicins available in public databases (**Figure 2B**) and expressed them in N. benthamiana plants using the magnICON® system (Gleba et al., 2005, 2014; Klimyuk et al., 2014). Expression of colicins E2, E3, E6, E7, D, N, K, 5, U, B, Ia, and M was described in our previous publication (Schulz et al., 2015). **Figure 2A** shows the SDS-PAGE analysis of expression for all 23 colicins we tested, including colicins described before. The expression level varied between 0.49 ± 0.18 mg/g FW (6.3 ± 1.9% TSP) for ColN and 5.00 ± 1.55 mg/g FW (45.6 ± 7.3% TSP) for ColK with a majority of colicin proteins expressed at levels between 1 and 3 mg/g FW. ColE1, which was found to accumulate at the lowest yield (approximately 1% of TSP), was excluded from further studies, although it demonstrated antimicrobial activity against some E. coli strains (data not shown).

We also successfully expressed some colicins in Spinacia oleracea (spinach) plants. The expression levels in spinach, however, were approximately 10 times lower than in N. benthamiana (Schulz et al., 2015).

Antimicrobial activities of plant-made colicins against Shiga toxin-producing E. coli strains comprising the "Big 7" STEC USDA-FSIS panel<sup>4</sup> were studied using a spot-on-lawn soft agar overlay assay as described in Schulz et al. (2015). **Figure 3** summarizes results of these studies for all 23 colicins. We found that colicin activity and host range segregated into several groups. Some colicins showed relatively narrow specificity (e.g., active against only 1–2 strains); some demonstrated a moderately broader activity spectrum (e.g., active against 3–4 strains); and very few colicins (i.e., only ColM, ColIa, and ColIb) exhibited a broad activity spectrum. Based on our data, colicin cocktails composed of several colicins with complementary activity spectra (e.g., 2-component or 4-component blends such as ColM + ColIb + ColU + ColK) should be capable of controlling most pathogenic EHEC strains.

<sup>4</sup>https://www.govinfo.gov/content/pkg/FR-2011-09-20/pdf/2011-24043.pdf
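The idea of assembling a cocktail from colicins with complementary activity spectra can be illustrated as a set-cover problem. The sketch below uses a simple greedy heuristic, which is our illustrative choice and not the selection procedure used in this study. The activity table is entirely hypothetical (placeholder names `col_A` to `col_D`) and does not reproduce the data of Figure 3; only the seven "Big 7" serogroup labels are real.

```python
from typing import Dict, List, Set, Tuple

def greedy_cocktail(activity: Dict[str, Set[str]],
                    targets: Set[str]) -> Tuple[List[str], Set[str]]:
    """Greedy set cover: repeatedly add the bacteriocin that is active
    against the largest number of still-uncovered target strains.
    Returns the chosen components and any strains left uncovered."""
    uncovered = set(targets)
    cocktail: List[str] = []
    while uncovered:
        # sorted() makes tie-breaking deterministic (alphabetical order).
        best = max(sorted(activity), key=lambda c: len(activity[c] & uncovered))
        gained = activity[best] & uncovered
        if not gained:          # nothing left that helps; stop
            break
        cocktail.append(best)
        uncovered -= gained
    return cocktail, uncovered


# Hypothetical activity spectra, for illustration only.
big7 = {"O26", "O45", "O103", "O111", "O121", "O145", "O157"}
activity = {
    "col_A": {"O26", "O45", "O103", "O111"},
    "col_B": {"O121", "O145"},
    "col_C": {"O157", "O26"},
    "col_D": {"O45"},
}
components, missed = greedy_cocktail(activity, big7)
print(components, missed)
```

A greedy choice is not guaranteed to give the smallest possible blend, and a real selection would also weigh potency, expression yield and stability of each component.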

We developed two types of downstream purification processes to isolate colicins from plant biomass (Schulz et al., 2015). Extraction of the biomass followed by ultra/diafiltration and concentration results in COLICIN CONCENTRATE with typically 40–50% product purity. This approach is intended for use when edible plant species are used as expression hosts, because the components of the biomass are food and hence recognized as safe. If N. benthamiana is used as the production host, the downstream process includes a chromatography step resulting in COLICIN ISOLATE with higher purity.

A simple purification protocol comprising extraction, ion exchange chromatography and dialysis resulted in 71% (ColK) – 97% (ColM) protein purity (Stephan et al., 2017). In the case of protein purity below 95%, detected protein impurities were found to be colicin degradation products. **Supplementary Figure S1** summarizes purification data for three non-consecutive batches of ColM with an average 97.65% protein purity and 67.71% recovery. This protocol also provides for the efficient elimination of plant alkaloids down to safe levels: 22–171 ng/mg protein for nicotine and between undetectable levels and 44 ng/mg for anabasine (Stephan et al., 2017).

TABLE 1 | Bacteriocins versus antibiotics: major biological differences and market potential of bacteriocins.

Purified colicin proteins were used for stability studies. We compared antimicrobial activities of ColM, ColU, ColIb, and ColK upon storage as solutions and as lyophilized powders at 4°C and room temperature for up to 309 days (ColK), 447 days (ColM and ColIb), and 552 days (ColU) (**Supplementary Figure S2**). All four lyophilized colicins retained their antimicrobial activities during the entire storage period at both 4°C and room temperature. Colicins M, Ib, and U were also highly stable in solution when stored at 4°C (**Supplementary Figure S2**). These three colicins were less stable in solution at room temperature, retaining activity under such conditions for 2 weeks (ColIb), 3 weeks (ColU), and 8 weeks (ColM). ColK solution was the least stable, with activity declining significantly after 1 week of storage at either room temperature or 4°C (**Supplementary Figure S2B**). These data suggest that colicin preparations should preferably be stored in dry form and reconstituted with water shortly before use. Colicin solutions should ideally be refrigerated and used within a few weeks of preparation, depending on the colicin cocktail composition.


FIGURE 2 | Plant expression of colicins. Transient expression in N. benthamiana upon syringe infiltration with 1:100 dilutions of agrobacterial cultures carrying TMV or TMV and PVX vectors. Recombinant proteins were analyzed in TSP (total soluble protein) extracts of leaf tissue prepared with 5 vol. 50 mM HEPES pH 7.0, 10 mM K acetate, 5 mM Mg acetate, 10% (v/v) glycerol, 0.05% (v/v) Tween 20®, 300 mM NaCl. (A) Coomassie-stained SDS protein gels loaded with TSP extracts prepared from plant material expressing bacteriocins or from (WT) non-transfected leaf tissue; loading corresponds to 1.5 mg FW plant material. Asterisks mark recombinant proteins. (B) The yield is given in mg recombinant colicin/g fresh weight of plant leaf biomass and as a percentage of TSP and is represented as an average value and standard deviation (AV, SD) of several experiments. N, number of independent experiments. Transient expression and yield determination were done as described in Schulz et al. (2015). Plant material expressing bacteriocins was harvested at timepoints in days post inoculation (dpi) as indicated in (B).

Cocktails of plant-made colicins have been tested for control of EHEC contamination on various food matrices, including pork filet, beef steak, beef meat cubes (before grinding), and lamb loin filet. Previously, we reported the reduction of E. coli O157:H7 contamination on fresh pork meat by treatment with a ColM + ColE7 mix (Schulz et al., 2015). **Figure 4** shows decontamination of beef steak (A), beef meat cubes (before grinding) (B), and lamb loin filet (C). In these studies, meat matrices were contaminated with a mixture of USDA "Big 7" STEC plus O104:H4 serotypes (8 strains in total). Treatment with a colicin cocktail (M + E7 + Ia + 5 + K + U) provided a 1–3 log reduction of the bacterial population.
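For readers less used to log-scale reporting, the relation between a log<sub>10</sub> reduction and the corresponding percent reduction in cfu/g can be sketched as below. This is generic arithmetic rather than the analysis pipeline used for Figure 4; the function names and the example counts are hypothetical.

```python
import math

def log10_reduction(carrier_cfu_per_g: float, treated_cfu_per_g: float) -> float:
    """log10 reduction of treated versus carrier-treated samples."""
    return math.log10(carrier_cfu_per_g / treated_cfu_per_g)

def percent_reduction(log_red: float) -> float:
    """Percent of the population removed for a given log10 reduction."""
    return 100.0 * (1.0 - 10.0 ** (-log_red))


# A 1-3 log reduction spans 90% to 99.9% of the recovered population.
for logs in (1, 2, 3):
    print(logs, f"{percent_reduction(logs):.1f}%")

# Hypothetical counts: 1e5 cfu/g after carrier, 1e3 cfu/g after treatment.
print(log10_reduction(1e5, 1e3))
```

The same conversion applies to the salmocin results reported later in this section.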

We also demonstrated that colicins are able to control multi-drug resistant (MDR) E. coli. **Figure 5** compares antimicrobial activities of colicins and antibiotics against MDR E. coli strain ATCC® BAA-2326™ of serotype O104:H4. This strain is positive for virulence genes aggR and stx2 and negative for virulence genes stx1 and eae. Genome sequencing revealed the presence of acquired antibiotic resistance genes, including β-lactamase of TEM-1 type, β-lactamase of CTX-M-15 type, a multidrug-resistance gene cluster (dfA7, sul1, sul2, strA, strB, tetA, mercury resistance) and the tellurite resistance gene cluster (Rohde et al., 2011). ATCC® BAA-2326™ is resistant to ampicillin, piperacillin, cefazolin, cefotaxime, ceftazidime, cefepime, and trimethoprim/sulfamethoxazole. The strain is sensitive to cefoxitin, ertapenem, imipenem, amikacin, gentamicin, tobramycin, ciprofloxacin, levofloxacin, tigecycline, and nitrofurantoin<sup>5</sup>.

<sup>5</sup>https://www.lgcstandards-atcc.org/support/faqs/6e2c1/ATCC%20BAA-2326%20antibiotic%20resistant.aspx?device=modal

FIGURE 4 | Reduction of Shiga toxin-producing E. coli on fresh meat matrices. Fresh raw meat trims as beef steak (A), beef meat cubes (before grinding) (B) and lamb loin filet (C) as shown in images were contaminated with a bacterial strain mix in equal cell number proportions of Big 7 and O104:H4 serotypes (nalidixic acid resistant derivatives of strains CDC 03-3014, CDC 00-3039, CDC 06-3008, CDC 2010C-3114, CDC 02-3211, CDC 99-3311, ATCC® 35150™, and ATCC® BAA-2326™) either by dipping of steaks into bacterial solution (A,C) or by intermixing bacterial solution with beef cubes (B). Subsequently, contaminated meat was treated with TSP extracts containing colicins by spraying at an application rate of 3 + 1 + 1 + 1 + 1 + 1 mg/kg colicins M + E7 + Ia + 5 + K + U. Beef cubes were used to prepare ground beef upon colicin treatment (B). Graphs show bacterial populations recovered from meat on SMAC medium supplemented with 25 µg/ml nalidixic acid upon storage for various periods of time at 10 or 15°C upon colicin treatment (dark gray bars, initial contamination level; white bars, carrier treatment; light gray bars, bacteriocin treatment). Error bars indicate standard deviation of biological replicates, N = 4. Data annotations above bars correspond to mean log<sub>10</sub> (cfu/g) reduction carrier vs. colicin treatment (upper line), mean percent (cfu/g) reduction carrier vs. colicin treatment (middle line) and statistical analysis by unpaired parametric t-test with GraphPad Prism v. 6.01 comparing the treatments at one timepoint with significance levels indicated by asterisks [∗p < 0.05 (probability of error less than 5%); ∗∗p < 0.01 (probability of error less than 1%); ∗∗∗p < 0.001 (probability of error less than 0.1%); ∗∗∗∗p < 0.0001 (probability of error less than 0.01%)].

FIGURE 5 | […] cultures of E. coli strain ATCC® BAA-2326™<sup>∗</sup> of serotype O104:H4 were supplemented with different doses of individual colicins (A), colicin blends (B), antibiotics of different classes (A,B) or carrier [buffer solution, (A,B)]. The graphs show bacterial cell numbers upon co-incubation with antimicrobial test solutions [(A,B) white bars, carrier; green bars, carbenicillin 50 mg/L; blue bars, streptomycin 50 mg/L; gray bars, tetracycline 50 mg/L; black bars, kanamycin 50 mg/L; red bars, (A) colM 0.5 mg/L or (B) colM 5 mg/L; yellow bars, (A) colE7 0.5 mg/L or (B) colM + colE7 4.5 + 0.5 mg/L; orange bars, (A) colK 0.5 mg/L or (B) colM + colE7 + colE2 + colE6 3.5 + 0.5 + 0.5 + 0.5 mg/L; pink bars, (A) colIa 0.5 mg/L or (B) colM + colE7 + colE2 + colE6 + colK + col5 2.5 + 0.5 + 0.5 + 0.5 + 0.5 + 0.5 mg/L; lilac bars, (B) colM + colE7 + colE2 + colE6 + colK + col5 + colIa 2.0 + 0.5 + 0.5 + 0.5 + 0.5 + 0.5 + 0.5 mg/L] for different timepoints at 37°C. Bacterial cell numbers were quantified by dilution plating (average of N = 3 samples, error bars correspond to SD) at timepoints 0, 2, 4, or 24 h of incubation. Experiments were performed using colicin-containing TSP extracts.

We compared individual colicins (**Figure 5A**) and colicin blends (**Figure 5B**) to four antibiotics representing three structural classes and two modes of action: carbenicillin (β-lactams, inhibitor of cell wall biosynthesis), streptomycin and kanamycin (aminoglycosides, inhibitors of protein biosynthesis), and tetracycline (tetracyclines, inhibitors of protein biosynthesis). We evaluated colM, colE7, colK, and colIa individually as well as in several blends. As expected, carbenicillin, streptomycin and tetracycline did not influence growth of the E. coli strain tested, whereas kanamycin eradicated bacterial cells. Individual colicins M and E7 significantly decreased the bacterial population during the first 4 h of cultivation (**Figure 5A**). Colicin blends were much more efficient than individual colicins; colicin mixes M + E7, M + E7 + E2 + E6 and M + E7 + E2 + E6 + K + 5 completely eradicated bacterial cells (**Figure 5B**). The colicin effect was shown to be dose dependent; for example, colicin M used alone provided much more stringent bacterial control at 5 mg/L than at 0.5 mg/L.
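Because colicins are proteins and the antibiotics are small molecules, equal mass concentrations correspond to very different molar concentrations. The sketch below converts the mg/L doses used in Figure 5 into µM. The molecular weights are approximate values that we supply for illustration only (about 29.5 kDa for colicin M and about 484.5 g/mol for kanamycin A free base); they are not taken from this study, and commercial kanamycin salts have a higher formula weight.

```python
def micromolar(mg_per_l: float, mw_g_per_mol: float) -> float:
    """Convert a mass concentration (mg/L) to a molar concentration (µM).
    mg/L divided by g/mol gives mmol/L; multiply by 1000 for µmol/L."""
    if mw_g_per_mol <= 0:
        raise ValueError("molecular weight must be positive")
    return mg_per_l / mw_g_per_mol * 1000.0


# Approximate molecular weights: assumptions, not values from the source.
MW_COLICIN_M = 29_500.0   # g/mol
MW_KANAMYCIN = 484.5      # g/mol, free base

colm_um = micromolar(5.0, MW_COLICIN_M)    # colM dose in Figure 5B
kan_um = micromolar(50.0, MW_KANAMYCIN)    # kanamycin dose in Figure 5

print(f"colM 5 mg/L       = {colm_um:.3f} µM")
print(f"kanamycin 50 mg/L = {kan_um:.1f} µM")
print(f"molar excess of kanamycin: about {kan_um / colm_um:.0f}-fold")
```

Under these assumptions the colicin is applied at a molar concentration several hundred times lower than the antibiotic, which is the sense in which activity "on a molar basis" differs from activity per unit mass.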

#### Plant-Made Salmocins

In contrast to the well-studied E. coli colicins, prior to our report (Schneider et al., 2018) colicin-like bacteriocins from Salmonella were scarcely studied (Patankar and Joshi, 1985). Based on homology to colicins, we identified in GenBank® five Salmonella sequences coding for bacteriocins that we termed salmocins (Salmonella colicins): SalE1a and SalE1b with pore-forming activity and SalE2, SalE3, and SalE7 with nuclease activity. We successfully expressed all these proteins in N. benthamiana plants using the magnICON® system at levels of 1.0–1.7 mg/g FW (Schneider et al., 2018).

Surprisingly, screening for antimicrobial activity against S. enterica ssp. enterica revealed unusually broad specificity and high activity for salmocins SalE1a and SalE1b (Schneider et al., 2018). These bacteriocins were active against all 109 test strains representing 105 pathogenic serovars with specific activities between 2 and 8 logs of AU/µg protein. Nuclease salmocins SalE2, SalE3, and SalE7 had narrower specificity (Schneider et al., 2018).

We also evaluated salmocins as antibacterials for Salmonella on several food matrices, including skinless chicken meat, skin-on chicken meat, beef steak, tuna filet and raw whole eggs. Food products were spiked with a mixture of S. enterica ssp. enterica strains representing seven pathogenic serovars (Enteritidis, Typhimurium, Newport, Javiana, Heidelberg, Infantis and Muenchen) in the case of chicken, or two (Enteritidis, Typhimurium) in the case of the other food matrices. Efficient decontamination of skinless chicken meat with individual salmocin SalE1a and the salmocin blend SalE1a + SalE1b + SalE2 + SalE7 was described in Schneider et al. (2018). **Figure 6** shows a significant (1–2 log) reduction of Salmonella contamination on fresh skin-on chicken breast filet by individual salmocin E1b used at several concentrations: 5.0, 1.0, 0.5, and 0.1 mg/kg meat. **Figure 7** shows reduction of Salmonella contamination on whole egg (A), beef trims (B) and tuna filet trims (C) by SalE1b. SalE1b at a concentration of 0.5 mg/kg food reduced the bacterial load by 3–8 logs in whole egg, 1.8–3 logs in beef trims, and 3.8–5 logs in tuna filet.

We searched for the lowest industrially practical application rates for salmocins to control Salmonella. In vitro, pore-forming SalE1a and SalE1b were highly active against the mix of two Salmonella strains of representative serotypes Enteritidis and Typhimurium at low concentrations of 0.1 and 0.01 mg/L (**Figure 8**). Interestingly, low temperature (10°C, **Figure 8B**) did not have a significant impact on these salmocins' bactericidal effect compared to their activity at 37°C (**Figure 8A**).

We also compared four types of antimicrobial proteins of different origin (colicins and salmocins from Gram-negative bacteria, Listeria phage endolysins from pathogens of Gram-positive species, and nisin from Gram-positive species) for their activity against Gram-negative E. coli and S. enterica, and Gram-positive L. monocytogenes (**Supplementary Figure S3**). Nisin, a food-approved bacteriocin which is widely used commercially, is a low molecular weight peptide originating from the Gram-positive bacterium Lactococcus lactis. Plant extracts containing the corresponding proteins were tested against mixes of bacterial strains listed in **Supplementary Figure S3B**. We found E. coli to be sensitive to colicins only (**Supplementary Figure S3A**). Salmonella was most sensitive to salmocins and, to a lesser extent, to colicins. Interestingly, Salmonella also showed slight sensitivity to the mix of Listeria phage endolysins (**Supplementary Figure S3A**). Listeria was completely insensitive to both colicins and salmocins; it showed only slight sensitivity to endolysins but high sensitivity to nisin (**Supplementary Figure S3A**). Our data indicate a clear distinction in specificities between antimicrobial proteins derived from Gram-positive and Gram-negative bacterial species without significant cross-activity between these two classes of microorganisms.

FIGURE 6 | Reduction of S. enterica ssp. enterica contamination on fresh skin-on chicken breast filet by salmocins. The graph shows bacterial populations recovered from meat shown in the picture upon storage for various periods of time at 10°C upon salmocin treatment (gray bar, initial contamination level; green bars, carrier treatment; red bars, salmocin treatment SalE1b in concentration of 5 mg/kg meat; orange bars, SalE1b in concentration of 1 mg/kg meat; yellow bars, SalE1b in concentration of 0.5 mg/kg meat; white bars, SalE1b in concentration of 0.1 mg/kg meat) of contaminated meat by spray-application. Error bars indicate standard deviation of biological replicates, N = 4. Statistically significant reductions (p < 0.005) in bacterial contamination were found by assessment of viable bacterial counts obtained from salmocin-treated in relation to carrier-treated meat samples by analysis by unpaired parametric t-test with GraphPad Prism v. 6.01 at all timepoints showing efficacy of salmocin treatment. Experiments were performed analogously to Schneider et al. (2018) (Figure 4) on meat contaminated with nalidixic acid resistant mutants of Salmonella strains of seven serovars mixed in equal cell number proportions [Enteritidis (ATCC® 13076™<sup>∗</sup>), Typhimurium (ATCC® 14028™<sup>∗</sup>), Newport (ATCC® 6962™<sup>∗</sup>), Javiana (ATCC® 10721™<sup>∗</sup>), Heidelberg (ATCC® 8326™<sup>∗</sup>), Infantis (ATCC® BAA-1675™<sup>∗</sup>), Muenchen (ATCC® 8388™<sup>∗</sup>)] and using semi-purified salmocin SalE1b protein. The purity of salmocin E1b was determined by capillary gel electrophoresis as described in Stephan et al. (2017) and found to be about 55% of total purified protein.

#### Production of Colicins and Salmocins in Ethanol-Inducible Transgenic Plant Hosts

The large-scale manufacture of antimicrobial proteins for food use will require processing large amounts of plant biomass at low production cost. We believe that ethanol-inducible protein expression using a transgenic plant host is more amenable to cost-efficient scale-up than a transient expression approach.
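A rough feel for the biomass involved can be obtained from a back-of-the-envelope calculation. The sketch below is illustrative only: the function is hypothetical, and the inputs are assumptions loosely based on figures quoted earlier (a total application rate of 8 mg colicin per kg meat as in Figure 4, expression of about 2 mg/g FW as a midpoint of the 1–3 mg/g FW range, and a recovery of about 68% as reported for ColM purification). Applying one recovery value to all colicins is a simplification.

```python
def biomass_kg_per_tonne_food(application_mg_per_kg: float,
                              expression_mg_per_g_fw: float,
                              recovery_fraction: float) -> float:
    """Fresh plant biomass (kg) needed to treat 1 tonne (1000 kg) of food."""
    if not 0 < recovery_fraction <= 1:
        raise ValueError("recovery_fraction must be in (0, 1]")
    if application_mg_per_kg < 0 or expression_mg_per_g_fw <= 0:
        raise ValueError("rates must be positive")
    protein_needed_mg = application_mg_per_kg * 1000.0
    # mg/g FW equals g/kg FW; convert to mg recoverable per kg of biomass.
    recoverable_mg_per_kg = expression_mg_per_g_fw * 1000.0 * recovery_fraction
    return protein_needed_mg / recoverable_mg_per_kg


# Assumed inputs (see the lead-in); results are order-of-magnitude only.
kg_needed = biomass_kg_per_tonne_food(application_mg_per_kg=8.0,
                                      expression_mg_per_g_fw=2.0,
                                      recovery_fraction=0.68)
print(f"{kg_needed:.1f} kg fresh biomass per tonne of meat")
```

Scaling the per-tonne figure to the volumes handled by commercial meat processors is what drives the need for a low-cost production host.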

FIGURE 7 | Reduction of S. enterica ssp. enterica contamination on different food matrices by salmocins. Fresh raw food matrices as whole egg (A), beef trims (B) and tuna filet trims (C) as shown in images were contaminated with a bacterial strain mix in equal cell number proportions [nalidixic acid resistant mutants of Salmonella strains of two serovars: Enteritidis (ATCC® 13076™<sup>∗</sup>) and Typhimurium (ATCC® 14028™<sup>∗</sup>)]. The graphs show bacterial populations recovered from foods shown in the pictures above upon storage for various periods of time at 10°C upon salmocin treatment (dark gray bars, initial contamination level; white bars, carrier treatment; light gray bars, salmocin treatment with SalE1b in concentration of 0.5 mg/kg food) of contaminated food by intermixing. Error bars indicate standard deviation of biological replicates, N = 4. Statistically significant reductions (p < 0.005) in bacterial contamination were found by assessment of viable bacterial counts obtained from salmocin-treated in relation to carrier-treated food samples by analysis by unpaired parametric t-test with GraphPad Prism v. 6.01 at all timepoints showing efficacy of salmocin treatment. Experiments were performed using semi-purified salmocin SalE1b protein.

We already reported on the development of ethanol-inducible transgenic N. benthamiana lines for the expression of colicin M (Schulz et al., 2015) and salmocin E1b (Schneider et al., 2018). Currently, we are developing transgenic N. benthamiana lines for production of other colicins and salmocins. Alternative approaches for large-scale protein expression in plants, such as agroinfiltration or agrospray, require a fermentation facility to generate inoculum, plus containerization, plant transport and vacuum infiltration equipment in infiltration-based processes (Chen et al., 2013; Gleba et al., 2014; Tusé et al., 2014).

Such process requirements introduce complexity and ultimately drive up manufacturing capital and operating costs. Although a higher cost of goods sold (COGS) might be tolerated for pharmaceutical or other high-value products produced through transient expression, it is undesirable in cost-constrained applications such as food safety.

Here we analyzed the performance of a colM-producing N. benthamiana line (T4 generation plants homozygous for a single-copy T-DNA insertion) depending on season; we also compared transient and transgenic colicin M expression (**Figure 9**). In our semi-controlled glass-facade greenhouse conditions, seasonal differences in plant biomass yield depend mostly on light intensity, and the lowest amount of plant biomass was found in the winter season due to slower plant growth, as is usually observed for all plant species grown in the greenhouse (**Figure 9A**). The lower biomass yield in seasons with slower plant growth can be compensated by prolonged incubation of plants before treatment and harvest, as seen in the comparison of summer and autumn plants (**Figure 9A**). There was no prominent difference between transgenic and transient production hosts (**Figure 9A**). Expression of recombinant proteins was more evenly distributed between leaves and stems in transgenic plants than in vacuum-infiltrated plants, which showed predominantly leaf expression (**Figure 9B**). Despite experiment-to-experiment variability, transgenic and transient expression hosts provided comparable levels of recombinant protein accumulation (**Figure 9B**). Phenotypically, transgenic plants were indistinguishable from wild type plants (**Figure 9C**).

#### Colicins/Salmocins: Regulatory Marketing Allowance as Food Antimicrobials

Development and regulatory approval of any product to be added to food or used as medicine is a complex, lengthy and usually costly process. Regulatory approvals for food additives vary significantly from country to country. We discuss here regulatory approval pathways for food antibacterials in the United States because this country represents by far the largest potential market for these products and because its regulatory review process can be relatively simple and fast (and relatively inexpensive), compared to the regulations in most other countries. In the United States, any substance to be intentionally added to food is a food additive and must be subjected to premarket review and approval by the FDA under the Federal Food, Drug and Cosmetic Act (FFDCA; the "Act"), unless the substance meets a listed exemption in the Act, or is generally recognized, among qualified experts, as having been adequately shown to be safe under the conditions of its intended use<sup>6</sup>.

GRAS (Generally Recognized As Safe) is an FDA designation that a chemical or substance added to food is considered safe by experts, and so is exempted from the conventional premarket approval process by FDA. The developer of the new substance (the Notifier) conducts an analysis of safety and utility of its product using scientific procedures including corroboration from publicly available information, and determines and documents that the substance is GRAS as specified by the FDA's Final Rule for GRAS Notices<sup>7</sup>. The Notifier voluntarily submits its GRAS conclusion to FDA for review and comment. The FDA can either reject the Notifier's conclusion of safety, cease to evaluate the submission upon request by the Notifier, or, ideally, issue a "No Questions" letter to the Notifier. The latter confirms that the FDA agrees with the Notifier's conclusion that the substance is GRAS and equates to marketing allowance by FDA for the substance. The FDA may conduct the GRAS review on its own for certain types of food treatments, or solicit input from the United States Department of Agriculture (USDA) if the substance is to be applied to USDA-regulated products such as meat and egg products. The GRAS pathway can be used for substances added to human food or animal food, as well as for animal feed ingredients.

<sup>6</sup>www.fda.gov/Food/IngredientsPackagingLabeling/GRAS/

In addition to the FDA, an alternative body capable of conducting GRAS reviews is the Flavor and Extract Manufacturers Association (FEMA), which is the national association of the United States flavor industry. FEMA works with legislators and regulators to ensure that the needs of members and consumers are addressed and can provide GRAS guidance, although its function is restricted to flavor substances.

The GRAS designations fall into several categories, the most relevant of which for food antibacterials is "Food Processing Aid." Substances added to food are classified as Food Processing Aids if they provide a rapid yet temporary effect, degrade and become part of the food matrix and thus have no functional effect on the food. If FDA accepts such a designation based on the evidence provided, the designation allows the manufacturer to avoid listing the substance on the treated food's product label; thus, there is no labeling requirement for the substance. Food additives or food ingredients, on the other hand, are typically persistent, are essential to or can modify the food's functionality, and need to be listed on the product label.

Facilitated regulatory pathways similar to GRAS exist also in a few other countries, for example Canada, Mexico, Australia, and New Zealand. In yet other territories, including countries of the European Union and Japan, approval of a new food additive involves a process similar to the USA's pre-market review of a new non-GRAS substance, requiring extensive toxicity/safety studies.

Nomad Bioscience is the first company to successfully obtain FDA concurrence for GRAS designation of its plant-made bacteriocins, such as colicins, as food antimicrobials. In its first GRAS notice (GRN 593<sup>8</sup> ), the following arguments were used to support safety and suitability of colicins made in food species hosts as food antimicrobials.

Safety:

– Colicins are naturally occurring antibacterial proteins produced endogenously by commensal enteric bacteria in the human gut;

<sup>7</sup>www.federalregister.gov/documents/2016/08/17/2016-19164/substances-generally-recognized-as-safe

<sup>8</sup>https://www.accessdata.fda.gov/scripts/fdcc/index.cfm?set=GRASNotices&id=593

FIGURE 9 | Transgenic EtOH-inducible hosts for colicin expression. Independent experiments performed in different seasons, summer, autumn, and winter, comparing transient and transgenic colicin M expression are shown. Methods of plant cultivation, transient expression of colicin M using N. benthamiana WT plants and vacuum infiltration of 1:100 dilutions of agrobacterial cultures or ethanol-induction of transgenic plants using 4% (v/v) EtOH solutions were described in Werner et al. (2011) or Schulz et al. (2015), respectively. The TMV-based constructs for transient colM expression (pNMD10221) and for EtOH-inducible colM expression (pNMD18381) were described in Schulz et al. (2015), Supplementary Figures S1, S3, respectively. Transgenic plants for EtOH-inducible colM expression used were T4 generation plants homozygous for single copy T-DNA insertion with characterized T-DNA ends and genomic insertion point of N. benthamiana plant line Nb18381T0#29 initiated as described in Schulz et al. (2015) (Supplementary Figure S2). (A) shows the yield of plant biomass in g fresh weight (FW) as average and standard deviation of 6 or 3 plants (also true for B) and sample description giving plant age at timepoint of treatment (vacuum infiltration or EtOH-induction) in days post sowing (dps), harvesting timepoint for biomass and recombinant protein analysis in days post treatment (dpt) and antimicrobial activity of colicin-containing TSP extracts in AU/mg FW plant biomass. (B) inspection of plant TSP extracts prepared with 2 vol. 20 mM citrate, 20 mM Na2HPO4, 30 mM NaCl, pH 4.0 using Angel Juicer 7500 and feeding buffer using peristaltic pump iPump2S by SDS-PAGE and Coomassie-staining; recombinant colM is marked with asterisks, loading corresponds to 2.5 mg FW (summer and winter) or 3.75 mg FW (autumn). (C) Plant phenotypes before treatment or at harvest.



#### Suitability:


In 2015, based on Nomad's submission, FDA accepted colicins as first-in-class GRAS antimicrobials for controlling pathogenic E. coli in fruits and vegetables (GRN 593), and in 2017, FDA and USDA accepted colicins as antimicrobials for controlling E. coli in meat products (GRN 676<sup>9</sup> ). In both cases, the Agencies agreed with the "Food Processing Aid" definition and USDA has added colicins to its FSIS Directive 7120, which is a list of safe and suitable ingredients allowed for use in the production of meat, poultry and egg products (FSIS Directive 7120.1, Revision 42). Subsequently, Nomad also filed GRAS notices for Salmonella salmocins (GRN number pending) and C. perfringens bacteriophage endolysins (GRN 802<sup>10</sup>). Independently, Nomad also submitted a GRAS notice for the use of N. benthamiana (non-edible) production host for the manufacture of colicins (GRN 775<sup>11</sup>), which led to a "No Questions" letter from FDA. A list of allowed GRAS notices, notices currently under review by FDA, and notices in preparation is provided in **Supplementary Table S1**.

Regulatory experience to date suggests that additional plant-made colicins, phage-derived endolysins, other bacteriocins, defensins, etc., could also gain rapid marketing allowance. In particular, in 2018 Nomad received a 'No Questions' letter from FDA for a product candidate in another category of food treatments, the natural sweeteners/taste modifiers thaumatins (GRN 738<sup>12</sup>). As long as the GRAS notice describes a natural or nature-identical substance, GRAS designation is a relatively simple, fast and inexpensive way of obtaining regulatory review and product marketing allowance, as evidenced by the success of Nomad's GRAS submissions to date. Based on this experience, we see a potential GRAS allowance 'space' for multiple classes of natural proteins such as:


Natural products that are used to treat food but that have no functional effect on food (food processing aids) are obviously the easiest cases to take through the GRAS process. For colicins, the inherent safety of these proteins was supported in part by the fact that colicins and colicin-like bacteriocins are very sensitive to proteases, and any traces of these proteins remaining in the treated food would be rapidly degraded in the stomach and duodenum; Nomad provided FDA with extensive data on gastroduodenal degradation of colicins in its dossier. Future uses of bacteriocins as food treatments to control bacteria in the human or animal gastrointestinal tract would likely require additional data on bacteriocin safety, bioavailability, and their functionality in the intestinal lumen.

#### Potential Markets for Bacteriocins as Food Antibacterials

Bacteriocins such as colicins and salmocins are promising alternatives to antibiotics for many markets, perhaps most importantly for food and medicinal uses. At present, antibiotics still constitute our main therapeutic toolbox for controlling pathogens; the situation is, however, rapidly changing, with an increasing number of pathogenic bacteria becoming resistant to most antibiotics. In the health care market, bacteriocins could probably be effectively used today in specific market niches in which the pathogen species are known and a topical application (direct surface delivery to skin, surface of lungs, surface of urogenital tract, and intestinal tract) is possible. Such indications could include treatment of cystic fibrosis patients by inhalation-based delivery of pyocins to treat Pseudomonas, or treatment of uropathogenic Escherichia by catheter delivery of colicins (**Tables 1**, **2**).

<sup>9</sup>https://www.accessdata.fda.gov/scripts/fdcc/?set=GRASNotices&id=676&sort=GRN\_No&order=DESC&startrow=1&type=basic&search=676

<sup>10</sup>https://www.accessdata.fda.gov/scripts/fdcc/?set=GRASNotices&id=802

<sup>11</sup>https://www.accessdata.fda.gov/scripts/fdcc/?set=GRASNotices&id=775

<sup>12</sup>https://www.accessdata.fda.gov/scripts/fdcc/index.cfm?set=GRASNotices&id=738

TABLE 2 | Potential uses of bacteriocins in food and medicines/medical devices.


TABLE 3 | Potential United States food/feed safety and animal health markets for bacteriocins.


<sup>∗</sup>GRAS status agreement by FDA obtained; ∗∗GRAS status agreement by FDA expected Q1 2019.

Potential markets for food antibacterials are large and even more immediate because of the facile marketing channels in countries such as the United States and Canada. **Table 3** lists potential market segments that are in need of better food safety through bacterial control, and those markets are sizeable and numerous. Given the current antibacterial intervention costs accepted or acceptable by the industry (\$0.025–0.1 per kg of food product), the estimated markets for antibacterials could potentially be very attractive. Some of the segments, such as processing of fresh and ground beef or processing of poultry and pigs, are oligopolistic, with only a few companies controlling the majority of the United States market (i.e., Tyson Foods, JBS, Cargill, National Beef, and Pilgrim's Chicken). Therefore, a commercial alliance with just one such partner would provide access to 20% or more of the market segment.
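The arithmetic behind such an estimate can be sketched as follows. This is a minimal illustration, not a market model: only the cost range (\$0.025–0.1 per kg) and the 20% partner share come from the text, while the function name and the segment volume used in the usage line are hypothetical placeholders that would have to be replaced with real segment data.

```python
# Cost range per kg of treated food, taken from the text (USD).
COST_PER_KG_USD = (0.025, 0.10)


def annual_treatment_value(volume_kg, market_share=1.0, cost_range=COST_PER_KG_USD):
    """Return (low, high) annual spend in USD on antibacterial treatment.

    volume_kg    -- annual volume of the food segment in kg (caller-supplied)
    market_share -- fraction of the segment reached, e.g. 0.2 for one partner
    """
    if volume_kg < 0:
        raise ValueError("volume_kg must be non-negative")
    if not 0.0 <= market_share <= 1.0:
        raise ValueError("market_share must be between 0 and 1")
    low, high = cost_range
    treated_kg = volume_kg * market_share
    return treated_kg * low, treated_kg * high


# Hypothetical placeholder volume (1e9 kg/year), NOT a figure from the article.
low, high = annual_treatment_value(1e9, market_share=0.2)
print(f"USD {low:,.0f} to {high:,.0f} per year")
```

Keeping the volume and share as explicit arguments makes it obvious which inputs are assumptions and which come from the intervention costs quoted above.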

#### DISCUSSION

Food safety market needs are shaped by two major trends, both of which revolve around real and perceived food safety issues. The first trend is the rapid increase of multidrug-resistant forms among common bacterial pathogens present in food. The second is the desire by consumers to have a 'natural' food that is devoid of chemical additives or genetically modified ingredients ('organic food'). This second trend is actively exploited by food companies because it allows them to charge a premium for food classified as 'natural,' 'GMO-free,' 'antibiotic-free,' 'organic,' 'bio,' etc. Unfortunately, the modification of agricultural and husbandry practices and subsequent food processing so as to exclude previously accepted chemical interventions drastically increases the likelihood of food contamination by bacteria. Two examples illustrate this point. Contamination of food with pathogenic E. coli was originally dubbed a 'hamburger disease' because such contamination was initially traced to contaminated beef. The statistics from the United States Centers for Disease Control and Prevention (CDC), which tracks outbreaks in the United States, demonstrate that during 2006–2010 only one out of ten outbreaks was traced to contaminated vegetables. However, during 2011–2016, nine out of thirteen E. coli foodborne outbreaks were due to vegetables/vegetable products, including five due to organically grown vegetables and sprouts. Another sad illustration is the case of Chipotle, a United States restaurant chain that in 2013 announced its intention to provide its customers with natural, organic and GMO-free foods. Within approximately the next 18 months, there were four outbreaks due to contamination involving three different pathogens (E. coli, Salmonella, and norovirus). As a result, the company stock took a serious hit and its market value had fallen by almost 65% by the end of 2017, somewhat recovering during the first half of 2018. However, in August 2018 the company experienced its largest outbreak to date, this one due to C. perfringens. In the minds of the general public, there still appears to be no correlation between 'organic' farming/food and higher risk of bacterial contamination. Nevertheless, since bacteriocins are natural proteins identical to the ones made by our intestinal bacterial flora, we believe that bacteriocins are more likely to be accepted as novel food safety interventions by the industry, non-governmental and governmental organizations and ultimately by consumers.

There is a concern about an increase in resistance to colicins and salmocins upon their use as food antimicrobials. Bacterial resistance to bacteriocins is well known and has been described in numerous publications (e.g., Riley and Gordon, 1996; Feldgarden and Riley, 1998; Kirkup and Riley, 2004, etc.). We believe that the application of colicins/salmocins to food is unlikely to allow for selection of bacteriocin-resistant bacteria in the intestinal tract of humans, for three main reasons:


There are other serious challenges for the developers of bacteriocins as food antibacterials. The food industry has traditionally been reluctant to make significant investments to develop new products unless market or other dynamics mandate it. In particular, antibiotics have been developed by pharmaceutical companies and afterwards adopted by the food industry. Development of the most recent antibacterial products, bacteriophages, has been pioneered and conducted almost entirely by small companies and academia. Most of our discussions with large companies active in food production and processing indicated that they would consider adopting a bacteriocin product if it were available on the market already, but not if it was a product candidate in late development, even if it had already been accepted by regulatory agencies. In other words, a bacteriocin developer may be expected to shoulder the majority of the costs and risks of not only development and regulatory review of the product but also of building manufacturing capacity and marketing resources. This is in stark contrast to the pharmaceutical industry, where there is a "division of labor" in which the R&D pipeline is serviced by multiple small and medium size companies each specializing in certain steps of product development, such as discovery, preclinical studies, or Phases I–II clinical studies. The risk is compounded by the fact that in case of eventual acquisition of the whole product pipeline and accompanying infrastructure by a food industry company (the preferred exit for a small developer), the developer is unlikely to receive multiples on the investment made that are comparable to the multiples enjoyed by developers in the pharmaceutical business (i.e., depending on the development phase at trade sale, average 3.7–4.8 multiples on investment can be realized).

Additional challenges stem from the regulatory requirements imposed on the food industry. Whereas there are strict 'zero tolerance' rules concerning contamination of food with E. coli, there are, for practical reasons, no such limits on contamination with Salmonella or other food pathogens. Correspondingly, large food producers' only "incentives" are the costs of food recalls due to contamination and the damage to their product brands, and sometimes, to their share price.

Some optimism is offered by recent successes of companies developing plant-based meat analogs, such as Beyond Meat and Impossible Foods. The latter company includes recombinant (yeast) leghemoglobin in its ingredients as a flavor enhancer, with the argument that it is less environmentally impactful to produce the heme protein recombinantly by fermentation than to obtain it by natural extraction of legumes (soybeans) grown in vast acreages. This concept appeals to the overall consumer base targeted by the company, whose business model is to offer meat-replacing foods that taste like meat but do not lead to deforestation to raise grains for animal feed, thereby combating global warming. The recombinant heme has achieved GRAS status (GRN 737<sup>13</sup>), and the plant-based meats are gaining acceptance with consumers in spite of GM ingredients. There is a lesson to be learned in this example that could apply to consumer acceptance of natural proteins such as colicins, salmocins and others that are known to be safe and effective, can address a major worldwide safety issue, and can be manufactured at scale in plant-based systems that are sustainable and environmentally compatible. The future will be interesting indeed.

# AUTHOR CONTRIBUTIONS

SH-L, ASt, AG, and YG designed the research. SH-L, ASt, SS, TS, and ASh performed the research. SH-L, ASt, SS, TS, DT, AG, and YG analyzed the data. SH-L, DT, AG, and YG wrote the manuscript.

# FUNDING

This work was partially financed by Investitionsbank Sachsen-Anhalt, Magdeburg, Germany (Grants 1204/00033, 1704/00088, and 1704/00087).

# ACKNOWLEDGMENTS

We thank Dr. Mirko Buchholz (Fraunhofer Institute for Cell Therapy and Immunology, Department of Drug Design and Target Validation, Halle, Germany) for protein and alkaloid analytical services, Dr. Antje Breitenstein, Lia Bluhm, and Anja Banke (BioSolutions Halle GmbH, Halle, Germany) for their support in evaluating colicins' and salmocins' antimicrobial activity. We thank Dr. Kristi Smedley (Center for Regulatory Services, Woodbridge, VA, United States) and Prof. Chad Stahl (University of Maryland, College Park, MD, United States) for valuable advice and guidance on regulatory strategy. We also thank Dr. Aušra Ražanskienë (Nomads UAB, Vilnius, Lithuania) for fruitful discussions.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2019.00437/full#supplementary-material

<sup>13</sup>https://www.accessdata.fda.gov/scripts/fdcc/index.cfm?set=GRASNotices&id=737

#### REFERENCES



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Hahn-Löbmann, Stephan, Schulz, Schneider, Shaverskyi, Tusé, Giritch and Gleba. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Bioproduction of a Therapeutic Vaccine Against Human Papillomavirus in Tomato Hairy Root Cultures

*Silvia Massa<sup>1</sup>\*, Francesca Paolini<sup>2</sup>, Carmela Marino<sup>3</sup>, Rosella Franconi<sup>3</sup> and Aldo Venuti<sup>2</sup>\**

*<sup>1</sup>Biotechnology Laboratory, Biotechnology and Agroindustry Division, Department of Sustainability, ENEA (Italian National Agency for New Technologies, Energy and Sustainable Economic Development), Rome, Italy, <sup>2</sup>Virology Laboratory, HPV-UNIT, Department of Research, Advanced Diagnostic and Technological Innovation (RIDAIT), Translational Research Functional Departmental Area, IRCSS Regina Elena National Cancer Institute, Rome, Italy, <sup>3</sup>Biomedical Technologies Laboratory, Health Technologies Division, Department of Sustainability, ENEA, Rome, Italy*

#### *Edited by:*

*Suvi Tuulikki Häkkinen, VTT Technical Research Centre of Finland Ltd, Finland*

#### *Reviewed by:*

*Carlo De Giuli Morghen, University of Milan, Italy; Hugh S. Mason, Arizona State University, United States*

#### *\*Correspondence:*

*Silvia Massa, silvia.massa@enea.it; Aldo Venuti, aldo.venuti@ifo.gov.it*

#### *Specialty section:*

*This article was submitted to Plant Biotechnology, a section of the journal Frontiers in Plant Science*

*Received: 12 January 2019; Accepted: 26 March 2019; Published: 11 April 2019*

#### *Citation:*

*Massa S, Paolini F, Marino C, Franconi R and Venuti A (2019) Bioproduction of a Therapeutic Vaccine Against Human Papillomavirus in Tomato Hairy Root Cultures. Front. Plant Sci. 10:452. doi: 10.3389/fpls.2019.00452*

Human papillomavirus (HPV) tumor disease is a critical public health problem worldwide, especially in developing countries. The recognized pathogenic function of the E5, E6, and E7 oncoproteins offers the opportunity to devise therapeutic vaccines based on engineered recombinant proteins. The potential of plants to manufacture engineered compounds for pharmaceutical purposes, from small to complex protein molecules, allows the expression of HPV antigens and, possibly, the regulation of immune functions to develop very specific therapies as a reinforcement to available nonspecific therapies and preventive vaccination, also in developed countries. Among plant-based expression formats, hairy root cultures are a robust platform combining the benefits of eukaryotic plant-based bioreactors with those typical of cell cultures. In this work, to devise an experimental therapeutic vaccine against HPV, hairy root cultures were used to express a harmless form of the HPV type 16 E7 protein (E7\*) fused to SAPKQ, a noncytotoxic form of the saporin protein from *Saponaria officinalis*, which we had shown to improve E7-specific cell-mediated responses as a fusion E7\*-SAPKQ DNA vaccine. Hairy root clones expressing the E7\*-SAPKQ candidate vaccine were obtained upon infection of leaf explants of *Solanum lycopersicum* using a recombinant plant expression vector. Yield was approximately 35.5 μg/g of fresh weight. Mouse immunization with vaccine-containing crude extracts was performed together with immunological and biological tests to investigate immune responses and anticancer activity, respectively. Animals were primed with either the E7\*-SAPKQ DNA-based vaccine or the E7\*-SAPKQ root extract-based vaccine and boosted with the same (homologous schedule) or with the other vaccine preparation (heterologous schedule) in the context of the TC-1 experimental mouse model of HPV-associated tumor. All the formulations exhibited an immunological response associated with anticancer activity. In particular, DNA as prime and hairy root extract as boost demonstrated the highest efficacy. This work, based on the development of low-cost technologies, highlights the suitability of hairy root cultures as possible biofactories of therapeutic HPV vaccines and underlines the importance of the synergistic combination of treatment modalities for future developments in this field.

Keywords: plant molecular farming, hairy root cultures, plant-produced antigens, HPV – human papillomavirus, cancer, therapeutic vaccines, heterologous prime – boost

# INTRODUCTION

Over the past four decades, a wealth of literature has demonstrated the production of exogenous proteins in plants for health applications. Indeed, plant-based expression systems have great potential to produce different types of biologics at reasonable costs and with reduced risks of contamination by threatening pathogens. This approach is especially advantageous in the field of prevention and treatment of infections and cancer (Rybicki, 2014; Streatfield et al., 2015; Loh et al., 2017).

Transient expression of target proteins achieved by plant viruses or by agroinfiltration often allows higher protein yield than stable transformation due to the absence of chromosomal integration (Komarova et al., 2010) and also represents a means of evaluating expression before starting the generation of transgenic plant-based expression platforms. Nevertheless, expression of therapeutic proteins using *in vitro* plant systems under contained conditions represents a profitable manufacturing approach in terms of uniform cultivation conditions, product quality, and downstream purification process (Rischer et al., 2013; Santos et al., 2016; Massa et al., 2018). Together with cell suspensions, organ cultures such as hairy root cultures (HRCs) offer advantages including containment, established cultivation conditions in hormone-free media, and product homogeneity (Franconi et al., 2010; Schillberg et al., 2013). Hairy roots are particularly attractive for the industrial-scale production of secondary metabolites (Miralpeix et al., 2013), but are also considered for the expression of pharmaceutical proteins, due to better performance than plant cell suspension cultures in terms of genetic and biochemical stability and to the reduced presence or absence of toxic compounds, such as alkaloids, with respect to leaves. Among plants used for generating hairy roots, crop plants such as tomato and potato were also used. Indeed, hairy roots of many different plant species have been utilized to produce both secondary metabolites and recombinant proteins of pharmaceutical value at varying yields, such as enzymes (Woods et al., 2008), vaccines and hormones (Woffenden et al., 2008; Skarjinskaia et al., 2013), and antibodies in different formats (Wongsamuth and Doran, 1997; Lonoce et al., 2016, 2018). Production of enzymes for replacement therapy of rare diseases was also reported (Rodriguez-Hernandez et al., submitted; Naphatsamon et al., 2018). Recombinant proteins produced in engineered hairy root cultures can also be secreted into the culture medium, simplifying downstream purification processes (Guillon et al., 2006; Häkkinen et al., 2014).

Cervical cancer and cervical intraepithelial neoplasia (CIN) are known consequences of human papillomavirus (HPV) infection. Cervical cancer is the fourth most common cancer in the female population, with about 569,847 new cases per year (88% of them in developing countries) and over 311,365 deaths (GLOBOCAN 2018, https://gco.iarc.fr/today). HPV is also the agent behind the development of other tumors and of oropharyngeal carcinogenesis, which is now rising significantly, and has a causal role in 13% of all female cancers (i.e., 5% of all cancers). Expression of viral oncogenes such as E6 and E7, and, as it was more recently demonstrated, E5, leads to correlated malignant disease.

Although HPV infection is preventable through very efficient recombinant vaccines developed against variously incident oncogenic genotypes in yeast and insect cells, and despite cervical cytology and DNA testing, HPV-related preinvasive and invasive diseases remain critical public health problems. Furthermore, currently available treatments against HPV-related disease are only moderately successful, with radiotherapy, chemotherapy, and surgery being poorly effective against high-grade lesions (Vici et al., 2014; Cordeiro et al., 2018). This highlights the need for specific treatment strategies. Among the most promising are therapeutic vaccines and novel therapeutics that may target the ability of HPV to influence host immune tolerance. If available, these tools may also have milder side effects than conventional approaches such as radiotherapy and/or chemotherapy.

Immunological therapeutic approaches against HPV have been investigated in the last decades, facilitated by the availability of E5, E6, and E7 tumor-associated antigens, optimal targets for cancer immunotherapy. First examples of experimental HPV therapeutic vaccines were able to block tumor growth in animal models, with some being able to evoke specific cell-mediated immune responses (Cordeiro et al., 2018). However, poor presentation of viral antigens that are expressed at low levels and poor trafficking of effector T-cell populations to non-inflamed mucosal/skin sites were common limitations. The use of adjuvants was, indeed, demonstrated to be crucial for therapeutic efficacy (Gerard et al., 2001).

Studies have focused on enhanced E6 and E7 HPV peptide fusions in combination with bacterial toxins and/or adjuvants. Nevertheless, these approaches showed negligible correlation to good clinical outcomes and tumor regression (Skeate et al., 2016). An advantage of whole protein-based vaccines compared to peptides is that they theoretically cover all available cytotoxic T lymphocyte (CTL) and T-helper epitopes. Therefore, the use of whole HPV E6 and E7 proteins or fusion proteins as the antigenic source has been widely employed in preclinical therapeutic vaccines tested in animal models and advanced into phase II and III clinical trials (Vici et al., 2016; Barros et al., 2018; Roden and Stern, 2018).

Among protein-based formulations, production of candidate HPV therapeutic/prophylactic vaccines using plant-derived expression platforms was also proven. Different plant-based expression systems were considered, from whole plant approaches for transient expression to stably transformed green microalgae, using single HPV antigens or fusion to peptides to improve accumulation yield, to intracellular targeting strategies. In many cases, evidence of immunogenicity and efficacy in animal models was reported (reviewed in Chabeda et al., 2018). In our previous experience, crude plant extracts containing HPV16 E7 antigen were shown to provide protection against challenge with E7-expressing TC-1 cells (Franconi et al., 2002, 2006). These responses were improved when E7 was produced in plants as a fusion to a bacterial carrier (LicKM, *Clostridium thermocellum* beta-glucanase) (Massa et al., 2007; Venuti et al., 2009).

Besides protein-based formulations, genetic vaccination is also a promising immunotherapeutic tool due to the amenability of sequences to engineering (e.g., addition of sequences of immunological value), stability and ease of manufacturing, cost-effectiveness, safety, and general tolerability (Gurunathan et al., 2000). Genetic immunization is able to induce adaptive cell-mediated immune responses, including activation of CD4+ helper T cells and CD8+ cytotoxic T cells, crucial to the resolution of cancer (Stevenson et al., 2004). Indeed, the most remarkable result in the field of therapeutic HPV vaccines is the clinical activity shown in a phase IIb randomized trial performed with a DNA-based vaccine (VGX-3100) in either HPV16- or HPV18-positive CIN2/3 patients (Trimble et al., 2015). In this trial, DNA vaccine delivery included the use of intramuscular injection coupled with electroporation (i.e., administration of short electrical pulses at the site of the DNA vaccine injection to increase plasmid uptake and correlated immune response) and showed, for the first time, significant regression of CIN2 lesions.

In our search for innovative immune-stimulatory tools for the rational design of therapeutic DNA-based vaccines against HPV (Massa et al., 2008), we focused on the sequence encoding the saporin protein (SAP) from *Saponaria officinalis*, a member of the "Ribosome-Inactivating Proteins" (RIPs) family (Hartley and Lord, 2004; Stirpe, 2004; Zarovni et al., 2009). We demonstrated that the combination of a mutagenized SAP sequence (SAPKQ) with an attenuated, synthetic HPV16 E7 gene (E7GGG, hereafter indicated as E7\*), in the context of DNA-based vaccination, resulted in a modulation of E7-specific humoral and cell-mediated immune responses affecting the growth of E7-expressing tumors (Massa et al., 2011).

The heterologous DNA prime-protein boost regimen (i.e., administration of an immunogen as a DNA-based preparation in the first dose, followed by the same immunogen as a protein-based preparation in the booster dose) is emerging as a tool for envisaging new therapeutic options in HPV-associated infection and cancer (Peng et al., 2016).

In the present study, hairy root cultures derived from tomato (*Solanum lycopersicum*) were used to stably express E7\*-SAPKQ in order to devise a protein-based experimental HPV therapeutic vaccine. HRC turned out to be a better platform to express E7\*-SAPKQ than *E. coli*, due to its ability to accumulate the recombinant vaccine in the soluble fraction of root extracts. The administration of the DNA-based E7\*-SAPKQ vaccine (prime dose) followed by E7\*-SAPKQ protein-containing hairy root extract (boost dose) was considered, together with homologous prime-boost regimens. All the formulations exhibited an immunological response associated with anticancer activity. In particular, DNA as prime and hairy root extract as boost demonstrated the highest efficacy.
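The prime-boost design described here can be enumerated with a short sketch. It assumes only the two vaccine preparations named in the text (DNA and hairy root extract) and labels each prime/boost pair as homologous or heterologous; the function and variable names are illustrative, and control groups, doses, and timing are deliberately not modeled.

```python
from itertools import product

# The two E7*-SAPKQ preparations named in the text.
VACCINES = ("DNA", "hairy root extract")


def prime_boost_schedules(vaccines=VACCINES):
    """List every prime/boost pairing and classify it.

    A schedule is 'homologous' when prime and boost use the same
    preparation and 'heterologous' when they differ.
    """
    schedules = []
    for prime, boost in product(vaccines, repeat=2):
        kind = "homologous" if prime == boost else "heterologous"
        schedules.append({"prime": prime, "boost": boost, "type": kind})
    return schedules


for s in prime_boost_schedules():
    print(f"{s['prime']} prime / {s['boost']} boost: {s['type']}")
```

With two preparations this yields four schedules, two homologous and two heterologous, one of which is the DNA prime and hairy root extract boost combination highlighted above.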

This work, based on the development of low-cost technologies (i.e., DNA-based vaccination and plant-based expression systems), highlights the suitability of hairy root cultures as possible biofactories of therapeutic HPV vaccines and underlines the importance of the synergistic combination of treatment modalities for future developments in this field.

#### MATERIALS AND METHODS

#### Cells

*Agrobacterium tumefaciens* C58C1 strain was used for plant-based transient expression experiments. *Agrobacterium rhizogenes* A4 (*Rhizobium rhizogenes* ATCC 43057; American Type Culture Collection, Manassas, VA, USA) was used to generate hairy root clones. Bacteria were grown in YEB medium (5 g/l beef extract, 1 g/l yeast extract, 5 g/l peptone, 5 g/l sucrose, 2 mM MgSO4) at 28°C with shaking at 220 rpm. When necessary, kanamycin (50 μg/ml) was added to the culture medium.

E7-expressing TC-1 tumor cells were kindly gifted by T.C. Wu (Johns Hopkins Medical Institutions, Baltimore, MD) and were cultivated in RPMI (Invitrogen, Paisley, UK) containing 400 μg/ml G418 and 10% fetal calf serum.

#### Animals

Six- to eight-week-old female C57BL/6 mice were supplied by Charles River Laboratories. Animal handling and sacrifice were performed under specific pathogen-free conditions at the Animal House of the Regina Elena National Cancer Institute. All experimental procedures were approved by the Government Committee of National Minister of Health (85/2016-PR) and were carried out in accordance with EU Directive 2010/63/EU for animal experiments.

#### Plant Material, Genes, and Construction of the Plant Expression Vector

Whole plant-based transient expression of E7\*-SAPKQ was assessed in 3-week-old *Nicotiana benthamiana* plants upon agroinfiltration. Plants were grown in soil in the Bio-Safety Level-2 greenhouse available at ENEA, under LED lighting (650 LED Lumigrow lamp; spectroradiometric data: lux 3106.5; total PAR 138.83; Watts 0.0011) with daylight integration and a 16/8 h light/dark cycle until use. Nutrients (Idrofill base, K Adriatica, Italy) were added to water every 3 weeks.

Tomato (*S. lycopersicum* L.) cv. Micro-Tom leaf explants were used to generate hairy root clones upon infection with recombinant *A. rhizogenes* A4. Micro-Tom plants were grown in greenhouse under hydroponic conditions and LED lighting, as described for *N. benthamiana*.

The attenuated E7GGG gene (in this work indicated as E7\*) was obtained from the E7 gene of HPV16 (HPV16 genome NCBI Reference Sequence: K02718), as previously described (Massa et al., 2008). The gene encoding the leaf apoplastic saporin isoform (SAP, Genbank Acc. No. DQ105520) had been previously mutagenized to abolish toxicity (SAPKQ, IQMTAE176AAR179FRY > IQMTA**K**176AA**Q**179FRY) and to obtain the pVax-E7\*-SAPKQ fusion construct by fusing the E7\* sequence to the 3′ end of the SAPKQ gene (Massa et al., 2011).

In order to obtain the plant-expression constructs, genes were PCR-amplified from the abovementioned DNA construct. Cloning into the binary Ti plasmid pEAQ-HT (PBL Technologies) (Sainsbury et al., 2009) was performed with primers designed to add either an N- or C-terminal His6-tag, or no His6-tag, to the final products (**Figure 1**).

(Figure caption, truncated in the source) … from *A. tumefaciens* (NPTII). Genes were cloned between the indicated sites to gain either an N-terminal, C-terminal, or no His6-tag after expression.

Genes were inserted under the CaMV 35S promoter/Nos terminator cassette downstream of the 5′-UTR of the Cowpea Mosaic Virus (CPMV) RNA-2 harboring the U162C mutation ("hypertranslatable," HT) and upstream of the 3′-UTR of the CPMV RNA-2. In this vector, the tomato bushy stunt virus (TBSV) p19 sequence serves as a posttranscriptional silencing suppressor.

#### Agroinfiltration of Whole *N. benthamiana* Plants for Evaluation of Expression of the E7\*, SAPKQ, and E7\*-SAPKQ Proteins in Plant Cells

The resulting pEAQ-HT constructs were transferred into *A. tumefaciens* strain C58C1 by electroporation and then introduced by vacuum infiltration into *N. benthamiana* for transient expression. Briefly, a 2-ml starter culture harboring the pEAQ-HT-based constructs, started from a fresh colony, was grown overnight in YEB containing 50 mg/l kanamycin. Then, the culture was diluted 1:500 into 500 ml of YEB, 50 mg/l kanamycin, 10 mM 2-(N-morpholino) ethanesulfonic acid (MES) pH 5.6, 2 mM MgSO4 and 20 μM acetosyringone, and grown overnight at 28°C, 220 rpm to OD600nm = 1.7.

Bacteria were pelleted by centrifugation at 3000 × *g* for 15 min and resuspended to a final OD600nm = 2.4 in MMA medium (4.4 g/l Murashige and Skoog salts, 10 mM MES pH 5.6, 20 g/l sucrose) supplemented with 200 μM acetosyringone. After 1–3 h of incubation at RT, the suspension was applied to 3-week-old *N. benthamiana* plants using a vacuum infiltration cabinet (O.M.EC. Impiantistica, Grassobbio, Italy), and plants were returned to the growth module for observation. Leaves were harvested and stored at −80°C until use.
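The resuspension step above amounts to a simple conservation-of-cells calculation. The sketch below is only illustrative: it assumes that OD600 scales linearly with cell density and that the whole pellet is recovered, neither of which is stated in the protocol, and the function name is hypothetical.

```python
def resuspension_volume_ml(culture_volume_ml: float,
                           culture_od: float,
                           target_od: float) -> float:
    """Volume in which a pelleted culture must be resuspended to reach target_od.

    Assumption (not from the protocol): OD600 is proportional to cell density
    and no cells are lost during centrifugation.
    """
    if target_od <= 0:
        raise ValueError("target_od must be positive")
    return culture_volume_ml * culture_od / target_od


# Values taken from the text: 500 ml of culture at OD600 = 1.7,
# resuspended in MMA medium to OD600 = 2.4.
volume = resuspension_volume_ml(500, 1.7, 2.4)
print(f"Resuspend the pellet in roughly {volume:.0f} ml of MMA medium")
```

In practice the final OD would still be checked on a diluted aliquot, since readings above about 1 are often outside the linear range of a spectrophotometer.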

#### Assessment of E7\*, SAPKQ, and E7\*-SAPKQ Expression in *N. benthamiana* Plants

Leaf tissues of infiltrated plants were harvested 1–7 days post infiltration. Leaf biomass samples were finely ground in liquid N2 with mortar and pestle, resuspended and homogenized in extraction buffer (1:3 w/v; phosphate-buffered saline "PBS": 21 mM Na2HPO4, 2.1 mM NaH2PO4, 150 mM NaCl, pH 7.2; alternatively, GB buffer was used: 100 mM Tris-HCl pH 8.1; 10% glycerol; 400 mM saccharose; 5 mM MgCl2; 10 mM KCl; 10 mM 2-β-mercaptoethanol) containing a protease inhibitor cocktail (Complete™; Roche, Mannheim, Germany). Samples were incubated on ice for 30 min with gentle rocking and extracts were clarified by centrifugation at 11,000 × *g* for 20 min. Supernatants were transferred to a fresh tube and kept on ice until use, and total soluble protein (TSP) content was estimated by the Bradford assay (Bio-Rad Inc., Segrate, Italy). Pellets, constituting the insoluble fraction of leaf extracts, were resuspended in appropriate volumes of SDS-PAGE sample buffer (10% glycerol, 60 mM Tris-HCl pH 6.8, 0.025% bromophenol blue, 2% SDS, 3% 2-mercaptoethanol). Samples containing 15 μg of TSP and the corresponding insoluble fraction in SDS-PAGE sample buffer were denatured at 95°C for 5 min and subsequently electrophoresed on a 12% SDS-polyacrylamide gel. Known amounts of proteins purified from *E. coli* were used as reference standards. Extract from leaves infiltrated with pEAQ-HT harboring an irrelevant gene was used as negative control. SDS Molecular Weight Standard Mixture (Sigma) was used during SDS-PAGE separation. Separated protein samples were transferred to a PVDF membrane (Millipore, Bedford, MA) by electro-transfer at 100 V with a Trans-Blot apparatus (Bio-Rad).

Filters were probed with either mouse anti-E7 or rabbit anti-SAP polyclonal antibodies (Massa et al., 2011) used at a dilution of 1:2,000 and then with a 1:10,000 dilution of either a horseradish peroxidase-conjugated goat anti-mouse IgG antibody or a horseradish peroxidase-conjugated goat anti-rabbit IgG antibody (GE Healthcare) for 1 h. The immune complexes were detected by chemiluminescence with the Immobilon Western Chemiluminescent HRP Substrate (Merck Millipore). The ImageQuant™ LAS 500 (GE Healthcare) was used for chemiluminescence signal detection, and densitometric quantification of bands was performed with ImageJ software.
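As an illustration of how band intensities can be turned into antigen amounts, the sketch below fits a straight line through reference standards and interpolates a sample lane. This is one common approach and only an assumption here: the text does not state which fitting method was used, and all intensity values and helper names are hypothetical. Only the 15 μg TSP loaded per lane comes from the protocol.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def antigen_ng(band_intensity, slope, intercept):
    """Interpolate the antigen amount (ng) from a band intensity."""
    return (band_intensity - intercept) / slope


# Hypothetical standards: ng of E. coli-purified protein vs. band intensity
standard_ng = [10, 20, 40, 80]
standard_intensity = [1000, 2000, 4000, 8000]
slope, intercept = fit_line(standard_ng, standard_intensity)

# Hypothetical sample band from a lane loaded with 15 µg of TSP
TSP_LOADED_UG = 15
sample_ng = antigen_ng(2500, slope, intercept)
percent_tsp = sample_ng / (TSP_LOADED_UG * 1000) * 100
print(f"{sample_ng:.1f} ng antigen in the lane = {percent_tsp:.2f}% of TSP")
```

Interpolation is only meaningful for sample bands whose intensity falls within the range covered by the standards.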

#### Hairy Roots Generation

*S. lycopersicum* (cultivar Micro-Tom) clonal root lines were obtained from wild-type leaf explants co-cultured with *A. rhizogenes* A4 (ATCC, 43057™) harboring the constructs. Bacteria were grown in YEB medium containing 50 mg/l rifampicin and 50 mg/l kanamycin to OD600nm = 0.6, at 28°C and 220 rpm. Bacteria were, then, pelleted by centrifugation at 3000 × *g* for 15 min and resuspended at OD600nm = 1 in Murashige and Skoog medium (MS, Duchefa) with 30 g/l sucrose and 200 μM acetosyringone, pH 5.8.

Leaves from 3-week-old Micro-Tom plants were harvested, sterilized in 0.1% (v/v) sodium hypochlorite solution (NaClO) for 15 min, and aseptically cut into explants of 1 cm × 1 cm. Explants were subsequently inoculated by immersion in the recombinant *A. rhizogenes* suspension for 15 min, in a rotary shaker at the minimum speed, in the dark. Explants were dried onto sterilized tissue paper and transferred on their adaxial side, onto co-culture plates containing MS agar medium and 100 μM acetosyringone and incubated under dark conditions for 3 days. The co-cultured explants were, then, blotted and transferred to a hormone-free MS medium supplemented with 200 μg/l cefotaxime (Cef; Sandoz, Varese, Italy) at 25°C. Fresh growing hairy roots were obtained after 8–10 days.

Emerging roots were excised and transferred to new plates. *A. rhizogenes* was eradicated by transferring roots every 15 days onto MS agar plates using decreasing Cef concentrations (0.25, 0.125, and 0.05 g/l) until no antibiotic was added. Transformation and *A. rhizogenes* eradication were confirmed by PCR using specific oligonucleotides for the exogenous sequences and for *rol B*/*rol C* genes, and *vir C* specific primers, respectively. The selected, kanamycin-resistant hairy root clones were considered for subsequent growth and analysis. The growth rate of one representative clone for each transformation was measured as root fresh weight at different time points after liquid subculture in Erlenmeyer flasks over a 28-day culture period, on two biological replicates for each hairy root clone. Hairy root biomass harvested for subsequent analysis was carefully handled, pulverized in liquid nitrogen, and immediately stored at −80°C.

#### Screening of Clonal Hairy Root Lines Expressing the Antigens by Immunoblotting Analysis

The selection of antigen-expressing hairy root clones was performed by immunoblotting. Root tissues were finely ground in liquid N2 with mortar and pestle and resuspended and homogenized in phosphate-buffered saline pH 7.2 (PBS, 1:3 w/v) containing a protease inhibitor cocktail (Complete™; Roche, Mannheim, Germany). Sample preparation for 12% SDS-PAGE acrylamide gels, immunoblotting, and detection of bands were performed as described for *N. benthamiana* samples. Protein molecular mass marker Color burst™ (Sigma) was used as reference for bands upon immunoblotting.

#### Immunofluorescence Detection of Antigens in Hairy Roots

Recombinant His6-E7\*-SAPKQ expressed in Micro-Tom hairy root tissue samples mounted on glass slides was detected by immunofluorescence. Tissues were fixed with 4% paraformaldehyde in PBS for 1 h and then softened with 2% driselase in PBS for 40 min at 37°C. Thereafter, samples were washed in PBS (pH 7.2) three times for 5 min each time with gentle shaking. Samples were then permeabilized with 10% DMSO and 0.5% NP40 in PBS for 1 h at room temperature. Following removal of the permeabilization solution, 5% BSA was applied for 1 h at RT, and samples were then incubated overnight at RT in a wet box with primary rabbit anti-SAP polyclonal antibody diluted 1:300. Then, the samples were washed three times. After gentle drying, samples were incubated in the dark for 1 h with anti-rabbit polyclonal antibody conjugated with phycoerythrin (sc-3739 goat anti-rabbit IgG-PE, Santa Cruz) diluted 1:1,000, washed, and stained with DAPI solution in the dark at RT for 10 min. Subsequently, slides were washed three times in PBS. Samples were sealed with glycerol:PBS (1:1) for image collection under a Nikon Eclipse TE2000-S epifluorescence microscope equipped with a Hg 100 lamp and filter sets appropriate for DAPI and Cy3 fluorescence.

#### Vaccination Schedules in Mouse Model

Six- to eight-week-old female C57BL/6 mice were immunized according to two immunization protocols (**Figure 6**). In the first protocol (i.e., immunization protocol), mice were subjected to vaccination and, thereafter, analyzed for immune response. In the second one (i.e., therapeutic protocol), mice were subcutaneously (s.c.) injected with TC-1 tumor cells before administration of the vaccines. Immunization protocol implied a priming with the E7\*-SAPKQ DNA vaccine (50 μg/mouse, intramuscularly, i.m.) into the tibia muscle, followed by electroporation according to standardized procedures (Cordeiro et al., 2015). To perform immunizations, root tissues were finely ground in liquid N2 with mortar and pestle, resuspended and homogenized in phosphate-buffered saline pH 7.2 (PBS, 1:3 w/v), and administered to mice. Antigen doses were quantified in the extract by immunoblotting using *E. coli*-purified fusion protein as standard, as previously described (Franconi et al., 2002). Animals were boosted after 1 week either with the same DNA- or with the E7\*-SAPKQ root extract-based vaccine (1 μg/mouse, s.c. in the trunk). Alternatively, mice were primed with the E7\*-SAPKQ root extract-based vaccine and boosted with the same E7\*-SAPKQ root extract-based vaccine.

The therapeutic protocol implied that mice were injected with 5 × 10<sup>4</sup> TC-1 tumor cells in 200-μl saline solution and, 3 days post tumor challenge, primed and boosted with the same preparations at the same time intervals used in the immunization protocol. Tumor growth was monitored by visual inspection and palpation two times a week. Animals were scored as tumor bearing when tumors reached a size of approximately 1–2 mm in diameter. For ethical reasons, the experiment was ended and all animals euthanized when tumor growth reached 3 cm<sup>3</sup> in the control animals. Finally, all tumors were carefully removed from euthanized animals and tumor weight recorded.

#### IFN-Gamma Enzyme-Linked Immuno-Spot Assay

HPV16 E7-specific T-cell precursors were detected by enzyme-linked immunospot assay (ELISPOT) 1 week after the boost, according to previously reported protocols (Massa et al., 2007). Briefly, a single-cell suspension of splenocytes (1 × 10<sup>6</sup> cells per well) from each group of vaccinated mice was added to microtiter wells coated with anti-mouse IFN-γ antibody (5 μg/ml, BD Biosciences PharMingen, San Diego, CA), along with interleukin-2 (50 units/ml, Sigma-Aldrich Italia, Milan, Italy). Triplicate samples were incubated at 37°C for 48 h with the E7-specific H-2D<sup>b</sup> (10 μg/ml) cytotoxic T-lymphocyte (CTL) MHC class I epitope (amino acids 49–57, RAHYNIVTF). After peptide incubation, a biotinylated anti-mouse IFN-γ antibody (2 μg/ml) was added for 4 h at room temperature. Cell spots were detected by streptavidin-HRP incubation for 1 h at room temperature and staining with filtered 3-amino-9-ethylcarbazole substrate (BD Biosciences PharMingen, San Diego, CA), for 5 min. Spots were counted using a dissecting microscope.

#### Statistical Analysis

Comparisons between individual data points were analyzed by two-tailed Student's *t*-test using the GraphPad Prism 8 software.<sup>1</sup> Data were expressed as means ± standard deviations (SD) or ± standard error of mean (SEM). A *p* < 0.05 was considered statistically significant.
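For readers who want to reproduce this kind of summary outside GraphPad Prism, the sketch below computes the mean, SD, and SEM of a group and the pooled-variance Student's *t* statistic for two groups, using only the Python standard library. The two groups are hypothetical values chosen for illustration, not data from this study, and the function names are not part of any package.

```python
from math import sqrt
from statistics import mean, stdev


def describe(values):
    """Return (mean, sample SD, SEM) for one group of measurements."""
    m = mean(values)
    sd = stdev(values)              # sample SD, n - 1 in the denominator
    sem = sd / sqrt(len(values))    # standard error of the mean
    return m, sd, sem


def student_t(group_a, group_b):
    """Pooled-variance (Student's) t statistic and degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * stdev(group_a) ** 2
                  + (n_b - 1) * stdev(group_b) ** 2) / df
    t = (mean(group_a) - mean(group_b)) / sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t, df


# Hypothetical measurements, for illustration only
group_a = [1.0, 2.0, 3.0]
group_b = [4.0, 5.0, 6.0]

m, sd, sem = describe(group_a)
t_stat, dof = student_t(group_a, group_b)
print(f"group A: {m:.2f} ± {sd:.2f} (SD), ± {sem:.2f} (SEM)")
print(f"t = {t_stat:.3f} with {dof} degrees of freedom")
```

The two-tailed *p* value is then read from the *t* distribution with the given degrees of freedom; a statistics package such as SciPy (`scipy.stats.ttest_ind`) returns it directly.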

#### RESULTS

#### E7\*-SAPKQ Protein Expression in *N. benthamiana* Plants

To express a recombinant therapeutic HPV vaccine in plant-based systems, the E7\*-SAPKQ fusion gene was cloned into the CaMV 35S promoter/Nos terminator cassette of the pEAQ-HT plant expression vector, between appropriate restriction sites, in order to add an affinity purification His6-tag either at the N- or C-terminus of the final product synthesized by plant cells. The single E7\* and SAPKQ sequences were also cloned in the same manner in order to verify the accumulation behavior of the single components of the fusion protein E7\*-SAPKQ (**Figure 1**).

Constructs were introduced into plant cells either by agroinfiltration mediated by *A. tumefaciens* C58C1 in *N. benthamiana* plants for transient expression, or by transformation mediated by *A. rhizogenes* A4 to obtain stably expressing hairy root cultures from Micro-Tom leaf explants.

The engineered proteins were first expressed transiently in *N. benthamiana* plants. Protein extracts from leaves were analyzed by immunoblotting over a 7-day period (from day 1 to day 7 post infiltration) to assess the accumulation of the recombinant antigens. Extractions initially performed using PBS-based buffer turned out to be insufficient to satisfactorily detect expression of the recombinant proteins. A stronger extraction using GB buffer was then used, revealing that both E7\*-SAPKQ and the single antigens E7\* and SAPKQ were expressed preferentially in the N-terminal His6-tagged and non-tagged forms in extracts of infiltrated leaves (**Figure 2**).

Among all the constructs, mainly His6-E7\*-SAPKQ and E7\*-SAPKQ were produced upon agroinfiltration (**Figure 2A**), and accumulated as soluble proteins in leaf biomass (**Figure 2B**). The theoretical weight of plant-expressed His6-E7\*-SAPKQ and E7\*-SAPKQ was confirmed by the separation of the *E. coli* standard His6-E7\*-SAPKQ (Massa et al., 2011), showing a distinct, specific band with an apparent molecular weight ≤ 45 kDa that indicated the accumulation of the monomeric form of the fusion protein (**Figures 2A,B**). Additional faint bands at lower molecular weight are present as probable degradation by-products. Both His6-E7\*-SAPKQ and E7\*-SAPKQ accumulation peaked at 5 d.p.i., with maximum expression levels estimated at 3.20 and 3.90 μg/g fresh weight, corresponding to 0.15 and 0.17% TSP, respectively. E7\*-SAPKQ-His6 was calculated as 0.40 μg/g fresh weight.

E7\* is produced upon agroinfiltration, given that a specific band is recognized by the anti-E7 antibody in plant extracts especially for the non-tagged protein 5 d.p.i. (**Figure 2C**). The band separates at a higher molecular weight than the *E. coli* reference standard (recombinant His6-E7\*, about 17 kDa) (Massa et al., 2008). E7\*-His6 seemed to accumulate at low levels 5 d.p.i., and as a dimer (about 34 kDa) 7 d.p.i. The maximum concentration calculated for His6-E7\* and for the non-tagged E7\* was 9.4 μg/g fresh weight (0.41% TSP), and 14.6 μg/g fresh weight (0.64% TSP), respectively. The dimeric E7\*-His6 was estimated to be 8.2 μg/g fresh weight (0.36% TSP).

His6-SAPKQ accumulated with a peak 7 d.p.i., accounting for at least 14.52 μg/g fresh weight (0.64% TSP). Non-tagged product was found to accumulate 10.27 μg/g fresh weight (0.45% TSP), mainly 7 d.p.i. Lower expression was found for SAPKQ-His6 that mainly accumulated 3 d.p.i. (3.20 μg/g fresh weight; 0.14% TSP). The theoretical weight of plant-expressed SAPKQ in the three forms was confirmed by the separation of the *E. coli* standard His6-SAPKQ, showing a specific band of the apparent molecular weight of 30 kDa (**Figure 2D**).
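Each yield above is reported both per gram of fresh weight and as a percentage of TSP, so the two figures together imply a TSP content of the leaf extracts. The sketch below back-calculates that value from the numbers given in the text as a simple consistency check; the function name is hypothetical, and the calculation assumes both figures for a given protein refer to the same extract.

```python
# (protein form, yield in µg per g fresh weight, reported % of TSP), from the text
reported = [
    ("His6-E7*-SAPKQ", 3.20, 0.15),
    ("E7*-SAPKQ", 3.90, 0.17),
    ("His6-E7*", 9.4, 0.41),
    ("E7* (non-tagged)", 14.6, 0.64),
    ("E7*-His6 (dimer)", 8.2, 0.36),
    ("His6-SAPKQ", 14.52, 0.64),
    ("SAPKQ (non-tagged)", 10.27, 0.45),
    ("SAPKQ-His6", 3.20, 0.14),
]


def implied_tsp_mg_per_g(yield_ug_per_g, percent_tsp):
    """TSP content (mg per g fresh weight) implied by a yield and its % of TSP."""
    return yield_ug_per_g / (percent_tsp / 100) / 1000


implied = {name: implied_tsp_mg_per_g(y, pct) for name, y, pct in reported}
for name, tsp in implied.items():
    print(f"{name:20s} -> {tsp:.2f} mg TSP per g fresh weight")
```

The implied values all fall between roughly 2.1 and 2.3 mg TSP per g fresh weight, which suggests the two ways of expressing yield are mutually consistent across constructs.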

#### E7\*-SAPKQ Expression in Hairy Root Cultures

After confirming the possibility to express the antigens in plant cells, the N-terminal His6-tagged constructs were chosen to engineer clonal hairy root lines for *in vitro*, stable, plant-based bioproduction of the experimental therapeutic vaccine. Forty-three clonal hairy root lines were isolated from *S. lycopersicum* cv. Micro-Tom after leaf explants infection with recombinant *A. rhizogenes* A4 strain transformed with the expression vector pEAQ-HT/His6-E7\*-SAPKQ for subsequent analysis. Thirty-four hairy root clones were isolated after transformation with pEAQ-HT/His6-E7\*. Forty-five hairy root clones were isolated after transformation with pEAQ-HT/His6-SAPKQ.

<sup>1</sup> www.graphpad.com

Hairy root clones were isolated from Micro-Tom for the different constructs and were subcultured. Recombinant hairy root clones showed a slower growth rate than clones obtained after infection with non-transformed *A. rhizogenes* A4 (**Figure 3A**), and the growth rate differed depending on the construct. The best growing clones were those expressing His6-E7\*-SAPKQ (**Figure 3B**), followed by those expressing His6-SAPKQ (**Figure 3C**) and by those transformed with pEAQ-HT/His6-E7\*, which had very low growth rates (**Figures 3D,E**).

The selection of hairy roots expressing the different products was performed by immunoblotting using either anti-E7\* or anti-SAP polyclonal antibodies (**Figure 4**). In the case of His6-E7\*-SAPKQ, 34 clones (about 79% of the isolated clones) showed the expected band at about 45 kDa, indicating the presence of the full-size fusion protein. Four clones exhibited the most intense bands for the given quantity of total protein transferred (**Figure 4A**). As in leaves, the hairy root-produced His6-E7\*-SAPKQ fusion protein accumulated mainly in the soluble fraction (**Figure 4B**). TSP from two out of the four best clones expressing His6-E7\*-SAPKQ was assayed by immunoblotting analysis at different time points after subculture (day 7, 14, 28, 35, 42). As shown in **Figure 4C**, accumulation of His6-E7\*-SAPKQ is stable until day 15 (i.e., along almost the whole cultivation period, since clones were normally subcultured for maintenance every 21 days) and decreases thereafter. The lower weight band revealed by the anti-SAP polyclonal antibody (**Figure 4A**) but not by the anti-E7 polyclonal antibody (**Figures 4B,C**) might represent a proteolysis product containing SAPKQ, or an alternative splicing activity.

Fourteen clones tested positive for expression of His6-SAPKQ (about 31% of the clones), with three clones showing the most intense bands at the expected molecular weight of about 30 kDa (**Figure 4A**). Among those transformed with His6-E7\*, only eight clones survived on selective medium and none of them showed detectable expression of His6-E7\* (**Figure 4D**).
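The screening percentages quoted above follow directly from the clone counts given in the text. The short sketch below simply makes that arithmetic explicit; the dictionary layout and function name are illustrative choices.

```python
# (clones with detectable expression, clones isolated), as reported in the text
screening = {
    "His6-E7*-SAPKQ": (34, 43),
    "His6-SAPKQ": (14, 45),
    "His6-E7*": (0, 34),   # only 8 of the 34 survived selection; none expressed
}


def percent_positive(positive, isolated):
    """Share of isolated clones showing the expected band, in percent."""
    return 100 * positive / isolated


rates = {name: percent_positive(p, n) for name, (p, n) in screening.items()}
for name, rate in rates.items():
    print(f"{name:16s} {rate:5.1f}% of isolated clones positive")
```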

All these clones were kept for subsequent subculture and measurements. His6-E7\*-SAPKQ best clone accounted for 35.49 ± 2.69 μg/g of fresh weight (1.25% TSP) and His6-SAPKQ accounted for 31.06 ± 4.79 μg/g of fresh weight (0.96% TSP).

Immunofluorescence analysis performed on samples of Micro-Tom hairy root clones expressing His6-E7\*-SAPKQ confirmed the accumulation of the recombinant protein by the detection of an intracellular specific signal. His6-E7\*-SAPKQ (red labeling) appeared to be localized both in the root cap/apical meristem and along the region of elongation/maturation of transformed hairy roots. This indicates that the vaccine is expressed in the root tips, where cell proliferation occurs, and is also stably accumulated in mature tissues. Tissue sampling for the preparation of extracts to be administered in preclinical studies took this observation into account. No expression of His6-E7\*-SAPKQ was observed in hairy roots obtained after transformation with an irrelevant gene (**Figure 5**).

(Figure caption, truncated in the source) … period. Data represent average values ± SD of triplicate assays from two biological replicates.

#### Mouse Immunological Responses to Vaccination

The immunological effects of vaccines were studied in C57BL/6 mice with the prime-boost schedule described in **Figure 6**. Cell-mediated immune responses were analyzed in ELISPOT assay for IFN-γ secreting cells. One week after the boost, spleens were collected from immunized mice. Significantly positive scores were obtained only in the animals immunized with the root extract boost (**Figure 7**). In particular, heterologous prime (i.e., E7\*-SAPKQ DNA)/boost (i.e., E7\*-SAPKQ-containing root extract) produced a dramatic increase in cell-mediated immune response specifically directed against E7 with respect to the other immunization combinations. The specific activity of vaccine preparation in root extracts was further confirmed by the absence of any activity of the root extracts without E7 (**Figure 7**).

The immunological responses evoked by E7\*-SAPKQ-containing root extract vaccines induced us to determine if a therapeutic potential of the heterologous schedule may still be envisaged in the TC-1 preclinical model. Indeed, TC-1 cells expressing the E7 oncogene are a well-known and validated animal model for the preclinical evaluation of therapeutic candidate HPV vaccines. The homologous schedule (i.e., administration of E7\*-SAPKQ DNA both as a prime and as a boost) was utilized for comparison. After injecting TC-1 cells into naïve C57BL/6 mice, prime and boost immunizations were performed on days 3 and 10, respectively (**Figure 6**). Tumor volume recorded 24 days after boost was mostly affected by the vaccine treatments involving the administration of the E7\*-SAPKQ-containing root extracts, showing higher, statistically significant differences with respect to controls (empty pVax and mock extract, respectively) (**Figure 8A**). Although the difference was not statistically significant, the heterologous schedule scored a slightly higher reduction in tumor volume than the homologous administration of E7\*-SAPKQ DNA both as prime and as boost.

Since tumor volume calculation *in vivo* cannot take into account all the tumor dimensions, tumors were excised from immunized mice and their weight recorded when tumors in control animals reached >3 cm<sup>3</sup> and all animals were euthanized for ethical reasons. Data obtained from tumor weight substantially confirmed those related to tumor volume: the heterologous E7\*-SAPKQ DNA/E7\*-SAPKQ-containing root extract schedule showed a 4-fold reduction in tumor growth with respect to controls. In addition, the heterologous E7\*-SAPKQ DNA/E7\*-SAPKQ-containing root extract schedule showed statistically significant differences with respect to homologous E7\*-SAPKQ DNA (both as prime and boost) or E7\*-SAPKQ-containing root extracts (both as prime and boost) schedules, with *p* = 0.0009 or *p* = 0.0056, respectively (**Figure 8B**).

#### DISCUSSION

HPV-related cancers include cancer of the cervix, vulva, vagina, penis, or anus. HPV infection can also cause cancer in subsets of oropharyngeal tumors, including the base of the tongue and tonsils. Although HPV prevention is feasible, the effects of preventive vaccination on the incidence of HPV tumors will only be visible in the long term with a prediction of a 7.5% reduction in cases, if vaccination policies remain primarily "female-only" (Hartwig et al., 2017). In addition, current treatments are far from ideal and less effective in high-grade lesions. Expression of viral oncogenes E6, E7, and E5 leads to HPV-related malignant progression. Due to their peculiarities, HPV oncogenes represent an excellent target for cancer immunotherapy (de Freitas et al., 2017).

HPV peptide- and protein-based experimental vaccines have reached phase II trials, with whole protein-based vaccines having the advantage of being more likely to evoke efficient cell-mediated responses in patients. Nevertheless, it has been suggested that adjuvants and tools able to improve immunogenicity and endowed with fewer safety concerns for human health than those that are currently tested (e.g., calreticulin, lysosome-associated proteins, heat shock proteins, tetanus toxoid) are necessary to minimize autoimmune reactions, or preexisting immunity (Wolkers et al., 2002; Millar et al., 2003; Savelyeva et al., 2003). Therefore, there is still need for further improvements in terms of safety, efficacy and, possibly, of decreased costs of vaccine preparation.

Plant molecular farming (PMF) may be the approach to achieve these results. Indeed, PMF is devoted to producing active, safe, and cost-effective pharmaceutical proteins, and plants can be a source of immune-stimulating tools in terms of both primary and secondary metabolites. So far, production of valuable pharmaceutical proteins by PMF has been demonstrated, which can help the treatment of patients particularly in developing countries, where production and preservation costs of medicines cannot be afforded. Platforms for PMF are different and may involve the use of whole plants or plant cell/organ cultures subjected to a transient or stable expression. PMF may be intended for purification, or administration as a crude extract or whole plant tissues. All these aspects may emphasize the advantages of the plant-based systems for expression of pharmaceutical proteins. Indeed, it has been shown that genes encoding tumor-associated antigens and viral coat proteins of HPV can be expressed in plants not only retaining their native immunological activity but also receiving adjuvant activity from the plant extract itself (Franconi et al., 2002, 2006; Chabeda et al., 2018).

On the other hand, safety, efficacy, and potential immunogenicity are also features of DNA vaccines targeting HPV (Gurunathan et al., 2000; Stevenson et al., 2004; Massa et al., 2008). Literature data indicate that genetic immunotherapy is becoming a pharmacological tool and therapeutic option against cervical disease, with HPV DNA vaccines reaching encouraging results in phase II clinical trials (Vici et al., 2016).

We have already reported that the genetic fusion of E7\* with SAPKQ modulates E7-specific humoral and cell-mediated immune responses resulting in antitumor effects against E7-expressing tumors when administered as a DNA vaccine. These findings opened the way to a new application of this plant protein, known in medicine mainly as the cytotoxic component of immuno-toxins, and gave further impetus to the development of a candidate HPV therapeutic vaccine, expanding the range of possible immune-enhancers of HPV E7.

This manuscript describes the use of hairy root culture expression technology for the bioproduction of a candidate therapeutic vaccine endowed with specific cell-mediated response associated with anticancer activity against HPV in a mouse model. This is the first time, to our knowledge, that such a plant-based expression platform is preclinically proven to produce HPV antigens with antitumor activity.

We report expression of E7\*-SAPKQ in both whole *N. benthamiana* plants by transient expression technology and hairy roots from tomato by stable transformation with the idea to develop a safe and affordable therapeutic vaccine for HPV malignancies. Our purpose was to accumulate the E7\* protein in a plant-based expression system maintaining/improving its antigenicity.

Expression of the SAPKQ protein had already been attempted in *E. coli* (Massa et al., 2011). Indeed, bacterial expression of SAPKQ was obtained with no bacterial growth block, demonstrating that the mutagenesis that had been introduced into the saporin leaf apoplastic isoform coding sequence of *S. officinalis* was able to highly reduce its cytotoxicity, otherwise leading to prokaryotic/eukaryotic cell death (Stirpe, 2004). Nevertheless, when expression of different versions of E7\*-SAPKQ fusion protein was attempted, the majority of the recombinant fusion proteins was found in the insoluble fraction of the bacterial lysates, as a possible consequence of residual toxicity of the SAPKQ proteins and of their E7GGG fused derivatives in the bacterial host (usually able to neutralize "disturbing" exogenous proteins in inclusion bodies found in the insoluble fraction of lysates upon extraction procedures). Therefore, alternative expression systems were tried, and, in particular, eukaryotic plant-based expression chassis.

The engineered proteins E7\*, SAPKQ, and E7\*-SAPKQ were first expressed transiently in *N. benthamiana* plants, primarily to assess if the recombinant protein expression was generally tolerated in plant cells. The single components of the fusion were utilized to investigate if protein bioaccumulation by plant cells depended on the type of protein itself. It is noteworthy that while E7\* alone was difficult to accumulate (at least in the presence of the His6-tag), the different forms of SAPKQ were quite easily accumulated. This gave a clue about a possible positive influence of the SAPKQ in the accumulation of the fusion product E7\*-SAPKQ. Indeed, even though at low levels, E7\*-SAPKQ was expressed in plants upon agroinfiltration in both un-tagged and His6-tagged forms, and turned out to be accumulated mainly as a soluble protein. This result also suggested the possibility to devise a different plant-based platform for stable transformation to possibly achieve better protein yield and purification in native conditions. However, the low and different accumulation levels of all the forms of E7\* prompted us not to purify these antigens from the infiltrated leaf biomass.

We demonstrated that infection of Micro-Tom leaf explants with recombinant *A. rhizogenes* can be used for rapid establishment (approximately 6 weeks) of hairy root clones stably expressing His6-E7\*-SAPKQ. Expression yield was at least 35.5 μg/g of fresh weight. This value was tenfold higher than that obtained in whole plants upon agroinfiltration and is comparable to or higher than those reported for other exogenous proteins accumulated in recombinant hairy roots. The His6-E7\*-SAPKQ yield in hairy roots was approximately fivefold higher than that of a similar fusion protein intended for immunization, the rabies glycoprotein-ricin toxin B chain chimera (ricin being another plant toxin adapted for therapeutic use, like naturally occurring saporin), in optimized air-lift tomato hairy root-based bioreactors (6–8 μg/g) (Singh et al., 2015). A fusion of ricin toxin B chain with the F1:V pneumonic plague vaccine antigen in tobacco hairy roots yielded 140-fold less than His6-E7\*-SAPKQ (0.25 μg/g) (Woffenden et al., 2008). De Guzman et al. (2011) reported production of the *E. coli* B-subunit heat labile toxin antigen in tomato hairy root cultures (~10 μg/g; the same protein was expressed in tobacco and petunia hairy roots at ~100 μg/g).
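The fold differences quoted above follow directly from the reported yields. The short sketch below simply recomputes them from the figures given in the text; the function and variable names are illustrative only and are not part of the original study.

```python
# Yields in micrograms of recombinant protein per gram of fresh weight,
# taken from the figures quoted in the text.
THIS_WORK_UG_PER_G = 35.5

REPORTED_UG_PER_G = {
    "rabies glycoprotein-ricin B chain, tomato hairy roots (lower bound)": 6.0,
    "rabies glycoprotein-ricin B chain, tomato hairy roots (upper bound)": 8.0,
    "ricin B chain-F1:V, tobacco hairy roots": 0.25,
}


def fold_difference(reference: float, other: float) -> float:
    """Return how many times larger `reference` is than `other`."""
    if other <= 0:
        raise ValueError("yields must be positive")
    return reference / other


for name, value in REPORTED_UG_PER_G.items():
    print(f"{name}: {fold_difference(THIS_WORK_UG_PER_G, value):.1f}-fold")
```

The 6–8 μg/g range brackets the "approximately fivefold" figure, and 0.25 μg/g gives the "140-fold" figure after rounding.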

His6-E7\*-SAPKQ expression in tomato clonal hairy root lines was maintained across subculture intervals, whereas expression decreased when hairy root clones were forced to age in culture vessels. All these results demonstrate the suitability of the hairy root system for our purposes, even in the absence of optimized culture conditions. It is noteworthy that a mild extraction (i.e., performed in PBS, the same solution that was subsequently used for immunizations) was sufficient to obtain the recombinant proteins, while a stronger buffer was necessary to extract the same products from leaf tissues.

No hairy root clones expressing His6-E7\* alone were selected. This is not surprising since, in contrast to transient expression, which has been demonstrated several times for E7 in plant-based expression systems (reviewed in Chabeda et al., 2018), constitutive expression could imply some toxicity affecting the onset and subsequent survival of hairy root clones upon stable transformation. In contrast, actively growing clones were obtained for His6-SAPKQ, as for His6-E7\*-SAPKQ. This also suggests that the SAPKQ carrier may help to achieve growth-compatible accumulation of the candidate vaccine in the root tissues.

It is well known that heterologous DNA prime followed by recombinant protein boost regimens can be used to enhance HPV therapeutic vaccine potency (Smyth et al., 2004; Fiander et al., 2006; Peng et al., 2016; Chabeda et al., 2018).

Indeed, in this work, the boosting strategy with the E7\*-SAPKQ-containing extract was effective in strongly improving E7-specific T-cell responses and related anticancer activity after E7\*-SAPKQ DNA-based priming with respect to homologous prime-boost regimens, as proven by ELISPOT assay and biological tests. The model chosen to test the efficacy of these experimental therapeutic vaccines was the well-known and widely used TC-1 model. TC-1 cells are immortalized mouse lung epithelial cells transduced with HPV E6 and E7 genes, able to express these viral oncoproteins that are continuously processed through the proteasome pathway. This processing allows their epitopes to be exposed in the context of the MHC class I complex located on the cell membrane, mimicking the mechanism leading to epitope-specific killing by CTLs upon integration of the HPV genome in an infected target cell (Feltkamp et al., 1995).

E7-specific immunological and anticancer responses were also obtained with the homologous prime-boost regimen based on E7\*-SAPKQ-containing hairy root extracts, suggesting that the hairy root expression system could be a useful tool in devising a therapeutic vaccine strategy against HPV. The elicited specific immunity and anticancer activity were significantly higher than those of the homologous prime-boost regimen based on E7\*-SAPKQ DNA, already demonstrated to be effective in a previous study (Massa et al., 2011). Data from tumor weight measurement confirmed the efficacy of the treatment, with statistically significant differences between experimental vaccines (either homologous or heterologous prime-boost regimens with E7\*-SAPKQ-based preparations) and controls.

FIGURE 8 | Anticancer activity of His6-E7\*-SAPKQ in the TC-1 mouse model. C57BL/6 mice were challenged with 5 × 10<sup>4</sup> TC-1 cells and thereafter treated with the different vaccine preparations as in Materials and Methods. Schedules of vaccination were as follows: A, empty pVax prime/mock hairy root extract boost; B, mock hairy root extract prime/mock hairy root extract boost; C, pVax-E7\*-SAPKQ prime/pVax-E7\*-SAPKQ boost; D, pVax-E7\*-SAPKQ prime/E7\*-SAPKQ extract boost; E, SAPKQ extract prime/SAPKQ extract boost. (A) Tumor volumes were recorded at different intervals after boost as in Materials and Methods. Data represent means of five animals ± SEM. (B) Tumor weights were recorded when tumors in control animals reached >3 cm<sup>3</sup> and all animals were euthanized for ethical reasons. Data are means of five animals ± SD. \*\*\*\**p* < 0.0001; \*\*\**p* < 0.001; \*\**p* < 0.02.

Accumulation of E7 in hairy root tissues was demonstrated as a fusion to the LicKM carrier (Massa et al., 2009); however, in the present work, for the first time, the therapeutic potential of an E7\* produced in clonal hairy root lines is determined in animal studies.

Hairy root technology offers an easy-to-handle alternative that can be conveniently biocontained in a controlled *in vitro* environment and scaled up, depending on the demand for the target protein, by the use of improved bioreactors. As mentioned, important experimental pharmaceutical proteins have been reported, including vaccine antigens and their fusions with carriers, enzymes, hormones, and antibodies in different formats. The hairy root system also has the advantage of theoretically allowing regeneration of transgenic plants and retention of germplasm (De Guzman et al., 2011). In particular, generating hairy roots from *S. lycopersicum* for production of foreign proteins has the advantage that alkaloid accumulation is less hazardous than in tobacco (Singh et al., 2015), reducing the presence of unwanted compounds in administered extracts or purified proteins. These characteristics are essential for obtaining regulatory approval for clinical administration of hairy root-produced proteins in the future.

In addition, in the case of raw and partially purified preparations from candidate vaccine-expressing tomato hairy root clones, the presence of functional elements can be exploited. As an example, a molecular complex formulation (Tomatine) based on the natural adjuvant α-tomatine (which is nontoxic to humans when consumed in the amounts present in green tomatoes) was reported to stimulate antigen-specific humoral and cellular immune responses that conferred protection against malaria and *Francisella tularensis* and induced regression of experimental tumors (Morrow et al., 2004; Zhang et al., 2006; Granell et al., 2010). Indeed, preliminary analysis of our candidate vaccine-expressing hairy roots showed the presence of α-tomatine (manuscript in preparation), which may have an additive role to that of SAPKQ in the immunological and anticancer responses observed after immunization with the E7\*-SAPKQ-containing hairy root extracts, possibly enhancing the overall effect. Partially purified recombinant proteins have already been shown to induce an immune response in animal models (reviewed in Rigano et al., 2013) and may be considered in "rationally designed" formulations containing one or more active principles (e.g., vaccine antigen, bioactive secondary metabolites, and adjuvants such as α-tomatine).

Our data show that it is possible to obtain a recombinant E7\*-SAPKQ from clonal hairy root lines with immunological and anticancer activities against HPV experimental tumors, especially in combination with a DNA vaccine based on the same sequences. These results pave the way for further studies on the production of vaccines in plant-based expression systems and their combination with other treatment modalities for the development of effective and more specific therapeutic interventions against HPV infection and related cancer.

#### ETHICS STATEMENT

Animal handling and sacrifice were performed under specific pathogen-free conditions at the Animal House of the Regina Elena National Cancer Institute. All experimental procedures were approved by the Government Committee of National Minister of Health (85/2016-PR) and were carried out in accordance with EU Directive 2010/63/EU for animal experiments.

#### AUTHOR CONTRIBUTIONS

SM is the recipient of the special Grant from MIUR (Italian Ministry of University and Research) "ENEA 5 × Mille" (Young Investigator Project: New therapeutic strategies for the treatment of cancer). She planned and designed the project, assembled the plant-expression constructs, undertook *N. benthamiana* agroinfiltration and generation of hairy root clones, analyzed plant material and hairy root recombinant clones, prepared extracts for immunization, and planned and wrote the paper. FP undertook large-scale preparation of DNA plasmids and immunization experiments with mice, collected and analyzed data on immune responses and biological activity of the different vaccine preparations, and wrote and revised the manuscript. AV planned the immunization protocols; supervised immunization experiments; analyzed and interpreted data on immune responses and biological activity of the different vaccine preparations; and planned, wrote, and revised the manuscript. RF contributed to planning the immunization experiments and to revising the manuscript. CM (Head of the Health Technologies Division) was committed to the active search for funding and revised the manuscript.

#### FUNDING

This work was partially supported by the Special Grant 2015 from MIUR (Italian Ministry of University and Research) to ENEA (Young Investigator Project "New therapeutic strategies for the treatment of cancer") and by the Research Project "Regione Lazio-Lazio-Innova," funded under the L.R. 13/08.

#### ACKNOWLEDGMENTS

The authors thank PBL Technologies and Prof. George Lomonossoff for the license to use the pEAQ-HT vector; Dr. Eugenio Benvenuto (ENEA) for contributing important reagents and for the availability of the BSL2 Containment Greenhouse; Mrs Elisabetta Bennici (ENEA) for plant material growth setting and maintenance and for general technical assistance; and Dr. Debora Giorgi (ENEA) for help in fluorescence microscopy. SM is deeply grateful to Dr. Vidadi Yusibov and Dr. Marina Skarjinskajia (Fraunhofer Center for Molecular Biotechnology, DE, USA) for teaching and sharing advice and protocols concerning hairy root culture technology.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Massa, Paolini, Marino, Franconi and Venuti. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Epitope Presentation of Dengue Viral Envelope Glycoprotein Domain III on Hepatitis B Core Protein Virus-Like Particles Produced in Nicotiana benthamiana

Ee Leen Pang<sup>1</sup>, Hadrien Peyret<sup>2</sup>, Alex Ramirez<sup>3</sup>, Hwei-San Loh<sup>1</sup>\*, Kok-Song Lai<sup>4</sup>, Chee-Mun Fang<sup>5</sup>, William M. Rosenberg<sup>3</sup> and George P. Lomonossoff<sup>2</sup>\*

#### Edited by:

Suvi Tuulikki Häkkinen, VTT Technical Research Centre of Finland Ltd., Finland

#### Reviewed by:

Markus Sack, RWTH Aachen University, Germany; Hugh S. Mason, Arizona State University, United States; Johannes Felix Buyel, Fraunhofer Institute for Molecular Biology and Applied Ecology (IME), Germany

#### \*Correspondence:

Hwei-San Loh, sandy.loh@nottingham.edu.my; George P. Lomonossoff, george.lomonossoff@jic.ac.uk

#### Specialty section:

This article was submitted to Plant Biotechnology, a section of the journal Frontiers in Plant Science

Received: 01 December 2018; Accepted: 26 March 2019; Published: 16 April 2019

#### Citation:

Pang EL, Peyret H, Ramirez A, Loh H-S, Lai K-S, Fang C-M, Rosenberg WM and Lomonossoff GP (2019) Epitope Presentation of Dengue Viral Envelope Glycoprotein Domain III on Hepatitis B Core Protein Virus-Like Particles Produced in Nicotiana benthamiana. Front. Plant Sci. 10:455. doi: 10.3389/fpls.2019.00455

<sup>1</sup>School of Biosciences, University of Nottingham Malaysia, Semenyih, Malaysia, <sup>2</sup>Department of Biological Chemistry, John Innes Centre, Norwich, United Kingdom, <sup>3</sup>iQur Limited, London, United Kingdom, <sup>4</sup>Faculty of Biotechnology and Biomolecular Sciences, Universiti Putra Malaysia, Serdang, Malaysia, <sup>5</sup>Division of Biomedical Sciences, School of Pharmacy, University of Nottingham Malaysia, Semenyih, Malaysia

Dengue fever is currently ranked as the top emerging tropical disease, driven by increased global travel, urbanization, and poor hygiene conditions, as well as global warming effects which facilitate the spread of Aedes mosquitoes beyond their current distribution. Today, more than 100 countries are affected, most of which are tropical Asian and Latin American nations with limited access to medical care. Hence, the development of a dengue vaccine that is both cost-effective and able to confer comprehensive protection is much needed. In this study, a consensus sequence of the antigenic dengue viral glycoprotein domain III (cEDIII) was used with the aim of providing comprehensive coverage against all four circulating dengue viral serotypes and potential clade replacement events. Utilizing hepatitis B tandem core technology, the cEDIII sequence was inserted into the immunodominant c/e1 loop region so that it could be displayed on the spike structures of assembled particles. The tandem core particles displaying cEDIII epitopes (tHBcAg-cEDIII) were successfully produced in Nicotiana benthamiana via an Agrobacterium-mediated transient expression strategy to give a protein of ∼54 kDa, detected in both soluble and insoluble fractions of plant extracts. The assembled tHBcAg-cEDIII virus-like particles (VLPs) were also visualized by transmission electron microscopy. These VLPs had diameters ranging from 32 to 35 nm, an apparent size increase compared to tHBcAg control particles without cEDIII display (namely tEL). Mice immunized with tHBcAg-cEDIII VLPs showed a positive seroconversion to cEDIII antigen, signifying that the assembled tHBcAg-cEDIII VLPs successfully displayed cEDIII antigen to the immune system. If it is proven to be successful, tHBcAg-cEDIII has the potential to be developed as a cost-effective vaccine candidate that confers simultaneous protection against all four infecting dengue viral serotypes.

Keywords: virus-like particles, epitope display, hepatitis B core antigen, dengue envelope glycoprotein, envelope glycoprotein domain III, dengue vaccine, tandem core technology

# INTRODUCTION


The alarming rise of dengue epidemics has been highlighted to affect over 40% of the world population (Brady et al., 2012). The disease can manifest as undifferentiated dengue fever or as life-threatening conditions such as dengue haemorrhagic fever (DHF) and dengue shock syndrome (DSS) (Murrell et al., 2011). Classified under the Flaviviridae virus family, dengue virus (DENV) is a single-stranded, positive-sense, non-segmented RNA virus with 40–50 nm enveloped particles (Guzman et al., 2010). The 10.6 kb viral genome encodes a polypeptide that is processed into structural proteins [capsid (C); envelope glycoprotein (E); and precursor membrane (prM)] and non-structural proteins (NS1, 2A, 2B, 3, 4A, 4B, and 5) (Henchal and Putnak, 1990). During virus assembly, the C protein encapsidates the viral RNA to form nucleocapsid particles whereas the prM assists the folding of surface-exposed E glycoprotein (Whitehead et al., 2017). Transmission of the endemic virus has been observed in over 100 countries. The incidence rate has expanded 500-fold, spreading from Southeast Asia to the Americas and Western Pacific within just half a century (Pang and Loh, 2016). Global distribution of dengue disease is strongly influenced by urbanization, demographic, and environmental factors including global warming, which has enabled Aedes mosquitoes to survive beyond their usual distribution. Concerns are also driven by increasing movement of travelers (Pang and Loh, 2016). The annual incidence has grown dramatically in recent decades: 390 million cases are predicted per annum, of which 96 million manifest an apparent clinical or sub-clinical severity (Bhatt et al., 2013). Out of these, it was reported that 500,000 people were hospitalized with severe dengue and approximately 2.5% of them would succumb to the disease (World Health Organization [WHO], 2017). The reported figure may be underestimated due to the passive surveillance system adopted by many countries (Runge-Ranzinger et al., 2014).

To date, no specific medication is available for dengue treatment. Current clinical practices mainly rely on administration of paracetamol and intravenous fluid, together with close monitoring of the haematocrit and platelet levels (Anfasa et al., 2015). The absence of specific drugs and lack of confidence in currently marketed vaccines have also driven the public's reliance on folk remedies that are yet to be scientifically proven. Thus, development of a dengue vaccine is still being aggressively pursued to address the unmet medical needs of people in tropical regions (Pang and Loh, 2017). For subunit vaccine production, the dengue E glycoprotein has been the most studied antigenic determinant. Its structure is organized into three ectodomains (EDI, EDII, and EDIII), and it serves to assist attachment and entry into host cells via receptors (Faheem et al., 2011). The immunoglobulin-like domain III (EDIII) is an ideal immunogen as it harbors receptor-binding motifs that can elicit production of neutralizing monoclonal antibodies (Crill and Roehrig, 2001). In recent years, EDIII has been expressed as a consensus sequence (cEDIII) aligned between four DENV serotypes (Chiang et al., 2011; Kim et al., 2012, 2015, 2016, 2017; Huy and Kim, 2017). A proof-of-concept study showed that cEDIII could inhibit the infectivity of four dengue serotypes simultaneously following mice immunization (Leng et al., 2009). Therefore, this sequence was adopted with the aim of conferring protection against all four co-circulating dengue serotypes.

Virus-like particles (VLPs) have gradually emerged as vaccine delivery vehicles that are spontaneously assembled from viral structural proteins. These are multimeric structures that can directly stimulate immune cells by mimicking the three-dimensional conformation of native viruses. Moreover, VLPs are devoid of infectious genetic material which makes them inherently safer than attenuated or inactivated virus preparations (Pang, 2018). VLPs are known to elicit higher B- and T-cell immune responses and hence lower dosage is usually sufficient (Gamvrellis et al., 2004; Roy and Noad, 2008). The repetitive array of protein subunits in VLPs has the potential to confer superior properties as a stand-alone vaccine when compared to those of recombinant subunit-based ones which may be weak immunogens despite the use of an adjuvant (Noad and Roy, 2003). All these features make VLPs a premium platform for the production of a safe and effective vaccine (Jain et al., 2015).

In this study, hepatitis B core antigen (HBcAg) was exploited for cEDIII epitope display. Early work on HBcAg was done by Clarke et al. (1987) to produce foot and mouth disease virus fusion particles with proven serological response in guinea pigs. The icosahedral VLPs are formed by association of HBcAg monomers into 90 (T = 3) or 120 (T = 4) dimers, with a hairpin structure bridged by c/e1 loops forming the protruding spikes (Crowther et al., 1994). The flexibility of inserting foreign sequence into the immunodominant c/e1 loop for surface exposure, while retaining its antigenic properties, is ideal for antigen presentation (Pumpens and Grens, 1999). In fact, insertion in the c/e1 region was shown to impart a stronger protective response compared to N- and C-terminal fusions to the core particles (Koletzki et al., 2000). Thus, the cEDIII gene was incorporated into the c/e1 loop for the benefit of maximized exposure on protruding spikes of the assembled VLPs. "Tandem Core" technology was adopted here to produce the chimeric HBcAg VLPs. It has been shown that this technology, which covalently links two core proteins into a dimer (Peyret et al., 2015), can alleviate potential steric hindrance between two inserts at each immunodominant c/e1 loop of the dimer interface and thus promotes chimeric VLP assembly. This technology has recently been applied successfully to make other viral vaccine candidates (Ramirez et al., 2018).
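The dimer counts quoted above follow from the general rule that an icosahedral capsid with triangulation number T contains 60 × T subunits. A minimal sketch of that arithmetic is given below. The `inserts_per_dimer` parameter is an assumption added for illustration (one insert per tandem dimer, i.e., in Core II only), and the resulting insert count is an idealized upper bound rather than a measured value.

```python
def hbcag_capsid_composition(t_number: int, inserts_per_dimer: int = 1) -> dict:
    """Subunit counts for an HBcAg icosahedral particle.

    An icosahedral capsid contains 60 * T protein subunits. HBcAg monomers
    pair into dimers, and each dimer contributes one surface spike.
    `inserts_per_dimer` is an illustrative assumption, not a measured value.
    """
    if t_number not in (3, 4):
        raise ValueError("HBcAg particles are described in the text as T = 3 or T = 4")
    monomers = 60 * t_number
    dimers = monomers // 2
    return {
        "monomers": monomers,
        "dimers": dimers,
        "spikes": dimers,
        "displayed_inserts": dimers * inserts_per_dimer,
    }


for t in (3, 4):
    print(t, hbcag_capsid_composition(t))
```

With a tandem core construct, each expressed polypeptide corresponds to one dimer, so the number of polypeptide chains per particle equals the dimer count.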

In the context of molecular pharming, plants have certain advantages when compared to bacterial and animal expression systems relating to lower production cost, rapid scalability, biocontainment warranty, and eukaryotic processing machinery (Sack et al., 2015; Tschofen et al., 2016; Loh et al., 2017). The emergence of transient expression systems has ultimately sped up the process, whereby rapid candidate screening and large-scale production are achievable within days (Thuenemann et al., 2013). Moreover, there is evidence that plants may be a better choice than bacteria for the production of tandem core-based VLPs (Peyret et al., 2015). This study aimed to demonstrate the production of chimeric HBcAg particles displaying dengue cEDIII epitopes (tHBcAg-cEDIII) using a plant-based system, with the hope of developing a novel VLP-based vaccine against the deadly dengue disease. Given that DENV clade replacements had been detected in recent years (Teoh et al., 2013), the development of a vaccine that can provide a consistent protection in the long run will be highly valued (Pang, 2018). Moreover, generating a vaccine based on the single consensus cEDIII antigen should reduce the underlying cost as it obviates the need to test the best tetravalent formulation from monovalent components of each DENV serotype.

# MATERIALS AND METHODS

#### Recombinant Vector Construction

The 103-amino-acid cEDIII consensus sequence used (see **Supplementary Material**) was based on the alignment of four DENV serotypes as described by Leng et al. (2009). The synthesized gene sequence of cEDIII was codon-optimized (GeneArt, United States) for expression in Nicotiana benthamiana. Primers were designed to incorporate AvrII and SbfI restriction sites (underlined) at the 5′ and 3′ ends of the cEDIII gene (cED3F: 5′-GAATA<u>CCTAGG</u>AAGGGAATGTCATACGCTATGTGTACTGGAAAG-3′; cED3R: 5′-CATTG<u>CCTGCAGG</u>TGAAGATCCCTTCTTGAAC-3′) for sub-cloning into pEAQ-HT::tHBcAg-VHH2, a plasmid derived from pEAQ-τGFP (GenBank accession number KM396759, Peyret et al., 2015) which contains long glycine-rich linkers [(GGS)n] with unique restriction sites at the c/e1 loop region of the downstream second core (Core II) for ease of cloning. Following heat-shock transformation of competent Escherichia coli, putative clones harboring the expression vector (pEAQ-HT::tHBcAg-cEDIII) were screened and verified by sequencing (Eurofins, Germany). **Figure 1** illustrates the expression cassette and corresponding recombinant vector used in this study. Additional information on the construct sequence can be found in the **Supplementary Material**.
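As a quick sanity check on the primer design, the positions of the two restriction sites can be located programmatically. The sketch below uses the primer sequences given in the text and the standard recognition sequences of AvrII (CCTAGG) and SbfI (CCTGCAGG); the helper name `find_sites` is illustrative only.

```python
# Primer sequences as given in the text (5' to 3').
PRIMERS = {
    "cED3F": "GAATACCTAGGAAGGGAATGTCATACGCTATGTGTACTGGAAAG",
    "cED3R": "CATTGCCTGCAGGTGAAGATCCCTTCTTGAAC",
}

# Recognition sequences of the two enzymes named in the text.
SITES = {"AvrII": "CCTAGG", "SbfI": "CCTGCAGG"}


def find_sites(sequence: str) -> dict:
    """Return {enzyme: 0-based start index} for each site present in `sequence`."""
    sequence = sequence.upper()
    return {
        enzyme: sequence.find(site)
        for enzyme, site in SITES.items()
        if site in sequence
    }


for name, seq in PRIMERS.items():
    print(name, find_sites(seq))
```

Each primer carries exactly one of the two sites, preceded by a five-nucleotide 5′ overhang.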

#### Agrobacterium tumefaciens Transformation and Plant Infiltration

Nicotiana benthamiana plants were grown on custom-mixed soil comprising peat, 2.5 kg/m<sup>3</sup> dolomite limestone, 1.3 kg/m<sup>3</sup> base fertilizer, 2.7 kg/m<sup>3</sup> Osmocote® (applied every 3–4 months), 0.3 kg/m<sup>3</sup> Exemptor®, and 0.25 kg/m<sup>3</sup> wetter in a controlled environment with a 16-h photoperiod generated by 400 W sodium lamps, 24 °C and 70% relative humidity. Plants at 5–6 weeks old (until they reached the pre-flowering stage) were used in the study. The recombinant vector, pEAQ-HT::tHBcAg-cEDIII, was introduced into Agrobacterium tumefaciens strain LBA4404 via electroporation. Transformed colonies were then selected from agar plates supplemented with 50 µg/ml kanamycin and 50 µg/ml rifampicin. The protocol from Sainsbury et al. (2012) was adopted here. Agrobacterial suspensions were cultured at 28 °C in a shaking incubator (200 rpm) for 24–48 h. The agrobacterial cells were harvested by centrifugation at 4,000 × g for 10 min and resuspended in MMA solution [10 mM MES (2-[N-morpholino]ethanesulfonic acid) at pH 5.6, 10 mM MgCl<sub>2</sub> and 100 µM acetosyringone] to a final OD<sub>600</sub> of 0.4. The abaxial side of the developed N. benthamiana leaves was pricked and infiltrated with the agrobacterial suspensions using a needleless syringe. Time-course evaluation of selected leaves of infiltrated plants was performed until 9 days post-infiltration (dpi). Control agrobacterial suspensions containing the empty pEAQ-HT vector (Sainsbury et al., 2009) without a gene insert were included to compare with pEAQ-HT::tHBcAg-cEDIII for physical observation of plants post-infiltration.
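Resuspending the pelleted cells to a final OD<sub>600</sub> of 0.4 can be planned with a simple proportionality, sketched below. This assumes that OD<sub>600</sub> scales linearly with cell density and that all cells are recovered after centrifugation; the culture OD and volume in the example are hypothetical values, not data from this study.

```python
def resuspension_volume_ml(measured_od600: float,
                           culture_volume_ml: float,
                           target_od600: float = 0.4) -> float:
    """Volume of MMA solution needed to bring pelleted cells to `target_od600`.

    Assumes OD600 is proportional to cell density (C1 * V1 = C2 * V2) and
    that the pellet contains all cells from `culture_volume_ml` of culture.
    """
    if min(measured_od600, culture_volume_ml, target_od600) <= 0:
        raise ValueError("all inputs must be positive")
    return measured_od600 * culture_volume_ml / target_od600


# Hypothetical example: 10 ml of overnight culture measured at OD600 = 2.0.
print(f"{resuspension_volume_ml(2.0, 10.0):.1f} ml of MMA solution")
```

In practice the OD<sub>600</sub> of the resuspended cells would still be checked and adjusted, since dense cultures often fall outside the linear range of the spectrophotometer.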

#### Protein Extraction

Small-scale extraction was conducted to test for protein expression and accumulation. Approximately 100 mg of the infiltrated leaf was harvested and homogenized with a ¼-inch ceramic bead (MP Biomedicals, United States) in 3× volume of extraction buffer [100 mM sodium phosphate at pH 6.8, 150 mM NaCl, 0.1% Triton-X and protease inhibitors (Roche, Switzerland)]. The Omni Bead Ruptor 24 homogeniser (Camlab, United States) was set at speed setting 4 for 30 s for tissue homogenization. Samples were then centrifuged at 16,000 × g for 10 min and the supernatant was kept as the soluble protein (SP) fraction. The insoluble protein (IP) fraction was extracted from the pellet by boiling with protein denaturing buffer [NuPAGE LDS buffer (Life Technologies) mixed 3:1 with 2-mercaptoethanol] followed by centrifugation at 16,000 × g for 10 min. To check for protein integrity, sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) was conducted at 200 V for 50 min and gels were stained with InstantBlue (Expedeon, United Kingdom).

For large-scale extraction, infiltrated leaves were excised, weighed and homogenized with 3× volume of the same extraction buffer in a Waring blender. Crude extracts were filtered through a layer of Miracloth before centrifugation (20,000 × g) at 4 °C for 20 min using an SS34 rotor (Thermo Fisher Scientific, United States). The clarified supernatant was passed through 0.45 µm syringe filters prior to subsequent purification procedures.

#### Purification of Virus-Like Particles (VLPs)

Virus-like particle samples were subjected to a two-stage purification process as described by Peyret (2015). Firstly, clarified extracts were overlaid on a double-layered sucrose cushion consisting of 6 ml of 25% (w/v) and 2 ml of 70% (w/v) sucrose solution. The cushions were prepared in UltraClear ultracentrifuge tubes (Beckman Coulter, United States) and centrifuged in a Surespin 630/36 swing-out rotor (Thermo Fisher Scientific, United States) at 167,000 × g, 4 °C for 2.5 h. The gradient was fractionated by piercing the bottom of the tube with a needle and recovering the bottom and interface fractions. These fractions were then dialyzed thoroughly against 200 mM ammonium bicarbonate buffer (pH 8.0) overnight. Next, samples were concentrated using a SpeedVac (Thermo Fisher Scientific, United States), and loaded onto a Nycodenz step gradient extending from 60 to 20% (w/v), with 2 ml of each concentration in UltraClear ultracentrifuge tubes (Beckman Coulter, United States). High-speed centrifugation was performed at 274,000 × g using the TH-641 swing-out rotor (Thermo Fisher Scientific, United States) for 20 h at 4 °C. The bottom of each tube was punctured with a needle and the gradient was collected as successive fractions. These fractions were assayed by Western blotting to determine the distribution of the desired particles.
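The centrifugation steps in this and the preceding section are specified as relative centrifugal force (× g). As a rough aid, the sketch below converts between × g and rotor speed using the standard relation RCF = 1.118 × 10⁻⁵ × r × N² (r in cm, N in rpm). The rotor radius used in the example is a hypothetical placeholder, not a specification of the rotors named above; the actual radius should be taken from the rotor manual.

```python
import math

# Constant from (2 * pi / 60)^2 / g, with radius in cm and speed in rpm.
RCF_CONSTANT = 1.118e-5


def rcf_from_rpm(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (x g) at a given speed and radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_from_rcf(rcf: float, radius_cm: float) -> float:
    """Rotor speed (rpm) needed to reach a given RCF at a given radius."""
    if rcf <= 0 or radius_cm <= 0:
        raise ValueError("rcf and radius must be positive")
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))


# Hypothetical radius for illustration only; check the rotor documentation.
HYPOTHETICAL_RADIUS_CM = 15.0
for step, rcf in [("sucrose cushion", 167_000), ("Nycodenz gradient", 274_000)]:
    rpm = rpm_from_rcf(rcf, HYPOTHETICAL_RADIUS_CM)
    print(f"{step}: {rcf} x g at {HYPOTHETICAL_RADIUS_CM} cm is about {rpm:,.0f} rpm")
```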

#### Purification of cEDIII Protein

Recombinant cEDIII protein was used in this study as an antigen positive control for immunization work. In brief, the plant-expressed protein was harvested at 6 dpi following agroinfiltration with pEAQ-HT::PR1a-cEDIII-sGFPH-KDEL (Pang, 2018). As the recombinant protein was produced as a cleavable fusion to green fluorescent protein (sGFP) with a histidine tag, first-step isolation was achieved via native immobilized metal affinity chromatography (IMAC) (agarose resin derivatized with nickel ion-nitrilotriacetic acid; Qiagen, Germany). Following AcTEV™ protease digestion (Thermo Fisher Scientific, United States), the final product (cEDIII alone) was harvested from the flow-through fraction of a second IMAC procedure. The cEDIII protein was dialysed against phosphate-buffered saline (PBS) prior to downstream testing.

#### Western Blotting Analysis

For immunoblotting analysis, the electrophoresed proteins were electroblotted onto a nitrocellulose membrane (GE Healthcare, United States) via wet transfer at 100 V for 1 h. The blotted membrane was blocked with 5% (w/v) milk powder in PBS containing 0.05% (v/v) Tween-20 (PBST) for at least 1 h. The membrane was subsequently incubated with either mouse anti-HBcAg monoclonal antibody (10E11; Abcam, United Kingdom) (1:4,000 dilution) or mouse anti-DENV 1–4 monoclonal antibody [D1-11(3); Thermo Fisher Scientific, United States] (1:2,000 dilution) for 1 h. Then, the membrane was washed three times with PBST at 5-min intervals before incubation with anti-mouse immunoglobulin G (IgG) horseradish peroxidase (HRP)-conjugated secondary antibody (M30107; Invitrogen, United States) (1:10,000 dilution). Washing steps were repeated three times before the blot was soaked in chemiluminescence solution and imaged using an ImageQuant LAS 500 (GE Healthcare, United States).
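The antibody dilutions above translate into pipetting volumes once a working volume is chosen. The sketch below assumes that a "1:N" dilution means one part stock in N parts total volume; the 10 ml working volume is a hypothetical example, not a volume reported in this study.

```python
# Dilution factors quoted in the text.
ANTIBODY_DILUTIONS = {
    "anti-HBcAg (10E11)": 4_000,
    "anti-DENV 1-4 [D1-11(3)]": 2_000,
    "HRP-conjugated secondary (M30107)": 10_000,
}


def stock_volume_ul(dilution_factor: int, final_volume_ml: float) -> float:
    """Microlitres of antibody stock for a 1:`dilution_factor` dilution.

    Assumes 1:N means one part stock in N parts total volume.
    """
    if dilution_factor <= 0 or final_volume_ml <= 0:
        raise ValueError("inputs must be positive")
    return final_volume_ml * 1000.0 / dilution_factor


# Hypothetical working volume for illustration.
WORKING_VOLUME_ML = 10.0
for name, factor in ANTIBODY_DILUTIONS.items():
    volume = stock_volume_ul(factor, WORKING_VOLUME_ML)
    print(f"{name}: {volume:.1f} ul in {WORKING_VOLUME_ML:.0f} ml")
```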

#### Transmission Electron Microscopy (TEM) Examination

All VLP samples (tHBcAg-cEDIII) were dialyzed against PBS using Float-A-Lyzer (Sigma, United States). Approximately 10 µl of the VLP sample was adsorbed onto copper-palladium grids, washed with sterile distilled water and negatively stained with 2% (w/v) uranyl acetate. As an experimental control, the tandem core particles without gene insert (tEL, Peyret et al., 2015) were also examined. Both particles (tHBcAg-cEDIII and tEL) were viewed using the FEI Tecnai 20 transmission electron microscope (FEI, United States).

#### Protein Quantification

The Pierce modified Lowry protein assay kit was used according to the manufacturer's instructions (Thermo Fisher Scientific, United States) for protein quantification. The absorbance of sample replicates and diluted albumin standards was then measured at 750 nm using a CLARIOstar microplate reader (BMG LABTECH, Germany). The purified tHBcAg-cEDIII VLPs were also analyzed by SDS-PAGE (refer to **Supplementary Figure S1** for this SDS-PAGE profile).

#### Mouse Immunization

Two independent animal immunization experiments were conducted using female BALB/c mice (Envigo, United Kingdom) at 6–8 weeks old. For each experiment, at least four mice per group were immunized with 5 µg of purified VLPs containing the cEDIII antigen insert (tHBcAg-cEDIII) mixed with 50 µl of Imject Alum (Thermo Fisher Scientific, United States) in a final volume of 100 µl made up with sterile saline solution. As a positive control group for cEDIII antigen, mice were immunized with 5 µg of recombinant cEDIII protein purified using nickel ion-nitrilotriacetic acid agarose resin (Qiagen, Germany). Another group of mice was immunized with 5 µg of empty VLPs without cEDIII insert (tEL), which served as a negative control for cEDIII antigen. In addition, a further control group of mice was injected with 100 µl of sterile normal saline solution. Three immunizations were delivered intra-peritoneally, 1 week apart. Tail bleeds were performed to check for seroconversion 4 weeks after primary immunization, and terminal bleeds were performed at the completion of the experiment, 9 weeks post-immunization. All animal work was done in compliance with United Kingdom Home Office approved animal protocols under license PPL 70/7376.

# Enzyme-Linked Immunosorbent Assay (ELISA)

Detection of antigen-specific IgG antibodies was performed by ELISA. Briefly, 96-well Nunc Maxisorp plates (Sigma, United States) were coated with 1 µg/ml pure cEDIII protein in carbonate-bicarbonate buffer overnight at 4°C. Coated plates were washed three times with PBS-Tween 20 (0.05% v/v) and blocked with 10% skimmed milk solution (Sigma, United States) for 1 h at 37°C. Primary sera were serially diluted (twofold) in 2.5% (w/v) milk solution from 1:100 to 1:3,200. All samples were added to duplicate test wells, incubated at 37°C for 1 h, then washed three times as described above. Goat anti-mouse IgG-peroxidase secondary antibody (Sigma, United States) was added at 1:2,500 dilution and incubated at 37°C for 1 h, and the plates were then washed three times again. 3,3′,5,5′-Tetramethylbenzidine substrate (Sigma, United States) was added to each well for 20 min and the reaction was stopped with 1 M H2SO4. Absorbance at 450 nm with a 630 nm correction was read using the SpectraMAX 190 plate reader (Molecular Devices, United States). An unpaired Student's t-test was used to compare the groups of mice immunized with either recombinant cEDIII protein or tHBcAg-cEDIII VLPs against the empty tandem core VLP control (tEL) in order to determine the significance level of the cEDIII-specific IgG produced. The cEDIII-specific IgG levels of the cEDIII and tHBcAg-cEDIII groups were also compared with each other. Significance levels are denoted in the graph by ∗∗ for p ≤ 0.001 and ∗∗∗ for p ≤ 0.0001.
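The dilution scheme and the group comparison described above can be illustrated with a minimal Python sketch. It is not the authors' analysis script: the function names and the absorbance readings are hypothetical, and the t statistic is computed by hand with the pooled-variance (equal variances) form of Student's test so that only the standard library is needed.

```python
from math import sqrt
from statistics import mean, variance


def twofold_dilutions(start=100, end=3200):
    """Return reciprocal dilution factors of a twofold series (1:start ... 1:end)."""
    factors = []
    factor = start
    while factor <= end:
        factors.append(factor)
        factor *= 2
    return factors


def unpaired_t(group_a, group_b):
    """Student's unpaired t statistic and degrees of freedom (pooled variance)."""
    n_a, n_b = len(group_a), len(group_b)
    dof = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / dof
    std_err = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / std_err, dof


# Hypothetical corrected absorbances (A450 minus A630) at one dilution.
# These values are illustrative only and are NOT data from this study.
vlp_group = [0.82, 0.75, 0.90, 0.79]
tel_group = [0.06, 0.05, 0.08, 0.07]

dilutions = twofold_dilutions()
t_stat, dof = unpaired_t(vlp_group, tel_group)
print(dilutions)
print(f"t = {t_stat:.2f}, df = {dof}")
```

The p-value would then be read from the t distribution with the reported degrees of freedom, for example with a statistics package.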

# RESULTS

## Expression of Recombinant tHBcAg-cEDIII Construct in N. benthamiana

Following agroinfiltration, N. benthamiana leaves were collected on 6 dpi and analyzed by Western blotting technique to check for expression. Western profiles indicated that the tHBcAg-cEDIII proteins had been expressed well in N. benthamiana infiltrated with pEAQ-HT::tHBcAg-cEDIII (**Figure 2**), based on the band at around 54 kDa which was not present in the pEAQ-HT-infiltrated plant leaf sample. However, the yield of soluble tHBcAg-cEDIII (SP) was very low (barely detectable) compared to the insoluble tHBcAg-cEDIII (IP) fraction. It was estimated that over 90% of the tHBcAg-cEDIII produced by the plant was insoluble.

FIGURE 2 | Expression profiles of tHBcAg-cEDIII protein in N. benthamiana analyzed at 6 dpi, which was predominantly obtained in insoluble form. Immunoblot detection of tHBcAg-cEDIII proteins at ∼54 kDa in size (black arrow) was performed using anti-HBcAg monoclonal antibody. Lane M: SeeBlue® Plus2 Pre-Stained Standard; Lane 1: SP extracted from pEAQ-HT::tHBcAg-cEDIII-infiltrated leaf disks; Lane 2: IP extracted from pEAQ-HT::tHBcAg-cEDIII-infiltrated leaf disks; Lane 3: SP extracted from pEAQ-HT-infiltrated leaf disks; Lane 4: IP extracted from pEAQ-HT-infiltrated leaf disks.

## Kinetic Expression and Physical Observations on the Infiltrated N. benthamiana

A time-course evaluation was performed to determine the optimal harvest time of tHBcAg-cEDIII protein for subsequent extraction and purification procedures. In general, symptoms of leaf chlorosis were observed from 7 dpi onward in all pEAQ-HT::tHBcAg-cEDIII-infiltrated plants, which led to visible necrosis by 9 dpi. This leaf chlorosis was not observed in plants infiltrated with the empty vector, pEAQ-HT. A representative series of a pEAQ-HT::tHBcAg-cEDIII-infiltrated leaf is shown in **Figure 3A**. Concurrently, infiltrated leaves were harvested on a daily basis to monitor the accumulation of soluble proteins. As illustrated in **Figure 3B**, increasing amounts of tHBcAg-cEDIII SP could be seen from 6 dpi onward. As advanced necrosis was observed on 9 dpi in these infiltrated plants, it was deemed sensible to set 8 dpi as the optimal harvest time for soluble tHBcAg-cEDIII protein.

# Purification of Chimeric Tandem Core Particles Displaying cEDIII Epitopes

Leaves infiltrated with the recombinant vector, pEAQ-HT::tHBcAg-cEDIII, were harvested on 8 dpi for large-scale purification. The first isolation step via sucrose cushion yielded 70% sucrose and interface fractions, which were collected for maximal recovery of VLPs present in the sample (**Figure 4A**). Further purification by Nycodenz gradient gave rise to a single grayish band (**Figure 4B**), which was separated from the green impurities that sedimented toward the top of the gradient. Western blotting analysis confirmed that this single band contained the tHBcAg-cEDIII VLPs (**Figure 4C**), with the sedimentation point estimated at around 40% Nycodenz concentration. The yield of purified tHBcAg-cEDIII VLPs was in the range of ∼12–16 mg/kg, which is comparable to that of purified recombinant cEDIII (∼13–14 mg/kg).

Following the purification process, TEM imaging analysis revealed that plant-produced tHBcAg-cEDIII assembled into VLPs (**Figure 5A**), which were visualized as a mixture of core-like particles of slightly different sizes. These particles exhibited an irregular spherical morphology that characterizes tandem core particles displaying a heterologous sequence in the c/e1 loop. The diameters of these particles ranged from 32 to 35 nm, with an average particle size of ∼34 nm. As illustrated in **Figure 5B**, purified tEL sample (empty tandem core VLPs without insert) formed smaller, more evenly sized particles, with an average size of ∼27 nm in diameter. Comparatively, tHBcAg-cEDIII particles appeared to be somewhat larger than the empty tEL particles.

# Immunogenicity of Chimeric Tandem Core Particles Displaying cEDIII Epitopes

The seroconversion results in BALB/c mice immunized with the VLP-based dengue vaccine candidate are presented in **Figure 6**. cEDIII-specific IgG antibody was successfully detected in mice immunized with tHBcAg-cEDIII VLPs and in mice immunized with recombinant cEDIII protein alone. However, the IgG antibody level elicited by the recombinant cEDIII protein alone was higher than that elicited by tHBcAg-cEDIII VLPs at all dilutions tested, from 1:100 to 1:3,200 (**Figure 6A**). The control groups (normal saline and tEL) did not show any cEDIII-specific IgG antibody response, which indicates the absence of prior exposure to DENV (cEDIII). **Figure 6B** shows that the cEDIII-specific IgG antibody level elicited by tHBcAg-cEDIII VLPs was significantly higher than that of the negative control, tEL-immunized mice (p ≤ 0.0001 at week 4 and week 9). However, when comparing the cEDIII and tHBcAg-cEDIII groups, the specific IgG level of the cEDIII-immunized mice was in fact significantly higher (p ≤ 0.001 at week 4 and p ≤ 0.0001 at week 9). In general, our data showed that IgG antibody levels appeared to peak 4 weeks post-immunization, then began to wane but were still detectable more than 2 months after immunization. These results suggest that the cEDIII antigen inserted into the c/e1 loop of tandem core particles was still immunogenic and able to elicit a specific humoral response.

#### DISCUSSION

Development of VLPs as a highly structured form of subunit vaccine has been increasingly explored in recent years, as they are multimeric structures that mimic native virions. Production of dengue VLPs has been attempted previously (Sugrue et al., 1997; Liu et al., 2010; Zhang et al., 2011; Mani et al., 2013); however, several shortfalls were identified: the neutralizing antibody responses acquired by mice were generally weak, and co-expressing the DENV prM and EDI/EDII can trigger the antibody-dependent enhancement (ADE) phenomenon. In ADE, pre-existing, non-neutralizing antibodies from an initial DENV infection bind the newly infecting serotype and help it gain entry into Fc gamma receptor (FcγR)-bearing host cells (Bäck and Lundkvist, 2013). Hence, ADE is often associated with disease aggravation and is known to be the strongest risk factor for DHF/DSS development (Kliks et al., 1989). Therefore, generating durable immunity against all four serotypes of DENV by using serotype-specific VLPs would require the expression of four monovalent VLPs together in the optimal tetravalent formulation, which would represent a significant technical challenge.

Because of this, the rationale for the work presented here was to produce chimeric VLPs which are able to self-assemble while presenting a consensus DENV antigen on their surfaces. In this study, the VLPs derived from HBcAg were utilized for cEDIII epitope display. As the stand-alone stability of DENV domain III has made it intrinsically different from other parts of the E protein (Soares and Caliri, 2013), it is believed that cEDIII can behave as an independent entity and would not interrupt the assembly of viral particles (Pang, 2018). The resulting chimeric HBcAg VLPs are therefore expected to display a high density of cEDIII epitopes on 90 or 120 copies of core protein dimers per particle.

In this study, "Tandem Core" technology was adopted due to the concern that two copies of cEDIII inserts at the HBcAg dimer interface might suffer from steric clashes which could abrogate particle assembly. Thus, it is anticipated that this strategy could resolve the steric constraints as VLPs are now assembled from the dimers of HBcAg protein, expressed from a single open reading frame coding for two copies of HBcAg that have been covalently linked (Peyret et al., 2015). The cEDIII gene was inserted into the immunodominant c/e1 loop of Core II (the C-terminal copy of HBcAg), so as to minimize the disruption to VLP assembly as translation proceeds in the 5′ → 3′ direction from the unmodified Core I. With this, the tandem core would have greater flexibility as only one of the two c/e1 loops on each dimer was decorated with cEDIII antigens (Pang, 2018).

The cEDIII gene expressed in this study was codon-optimized for N. benthamiana to boost translation by recoding rare codons in the foreign gene with synonymous codons preferred by the expression host (Angov, 2011). In addition, a glycine- and serine-rich linker was designed to avoid spurious interaction between the cEDIII inserts and core protein subdomains: this linker consisted of 15 amino acids on either side of the cEDIII insert (the exact sequence is shown in the **Supplementary Material**). As this approach was shown to be useful by Kratz et al. (1999), it was hoped that the (GGS)n linkers would help to minimize the steric constraints and stabilize the chimeric HBcAg VLPs expressing the cEDIII epitopes (Pang, 2018). Indeed, these linkers allowed the display of a nanobody protein inserted in Core II of plant-produced tandem cores (Peyret et al., 2015). Nevertheless, the length of the linkers was not optimized in the context of this work, and it is possible that higher yield and better solubility could be achieved upon optimization.

Based on the kinetic expression studies (**Figure 3**), it is postulated that the chimeric VLP construct does not confer significant toxicity to plants, as early necrosis or apoptosis was not evident. The profiles were useful to gauge the ideal harvest time, which is assessed based on the peak accumulation of soluble target protein and the post-infiltration morphological distortions of the leaves. Necrotic tissues are usually avoided as they are flaccid and may contain higher amounts of phenolics that could be introduced into downstream processing (Tanguy and Martin, 1972). In fact, the antimicrobial exudate produced by necrotic tissues can inhibit efficient colonization and gene delivery by Agrobacterium (Pitzschke, 2013). In this case, the optimal harvest time for the chimeric tHBcAg-cEDIII VLPs was determined to be around 8 dpi. A constant OD<sub>600</sub> of 0.4 for the agrobacterial infiltration suspension was used throughout this study, in line with the recommended OD<sub>600</sub> range of 0.3–0.4 reported previously (Li, 2011; Pua et al., 2012; Shamloul et al., 2014). This is because a high bacterial density tends to trigger a hypersensitive response that can lead to tissue necrosis, whereas a low amount of Agrobacterium may result in insufficient gene delivery (Leuzinger et al., 2013). As presented in this study, an acceptable level of transient gene expression was achieved at the chosen OD<sub>600</sub> of 0.4, without severely triggering the hypersensitive response (Pang, 2018).

To purify the VLPs of interest, the procedure began with a discontinuous two-step sucrose cushion to enrich core particles from the clarified lysate in a fast and reproducible manner (Peyret, 2015). After that, an additional isopycnic gradient was applied to complement the earlier technique in the preparation of high-purity particles (Brakke, 1961). Nycodenz is an inert chemical that can generate a self-forming gradient (Gugerli, 1984), and it worked well for the purification of tHBcAg-cEDIII VLPs, as evidenced in **Figure 4**. TEM observation revealed that the assembly of particles was successful (**Figure 5A**). Instead of being uniformly shaped, tHBcAg-cEDIII particles appeared to be rather "knobbly" due to the epitopes protruding from HBcAg spikes. These VLPs, which ranged from 32 to 35 nm in diameter, are somewhat larger than their empty counterpart, tEL (**Figure 5B**). This type of surface morphology is in fact expected for tandem core particles displaying a heterologous sequence on their surface (Peyret et al., 2015). This finding indicates that cEDIII epitopes were presented on the protruding spikes of HBcAg, which retained its inherent propensity to assemble into discrete VLPs. In fact, a size range of 35–40 nm for chimeric HBcAg VLPs with dengue EDIII epitopes produced in microbial cells was previously reported by Arora et al. (2012, 2013).

To our knowledge, this is the first study that reports the successful production of chimeric HBcAg VLPs with dengue protein epitopes in a plant system. It was shown that the cEDIII epitopes remain immunogenic when presented on the VLP scaffold (HBcAg-cEDIII): the specificity of the IgG responses detected is demonstrated by the lack of cEDIII-reactive antibodies in the saline control and empty tHBcAg (tEL) groups (**Figure 6**). The specific IgG antibody levels induced by both HBcAg-cEDIII VLPs and recombinant cEDIII (positive control) were higher at week 4 post-immunization and declined thereafter as detected at week 9, which is consistent with the kinetics of a normal immune response (Leo et al., 2011). As the ELISA plates were coated with purified cEDIII protein, it is possible that the higher IgG antibody level detected in the recombinant cEDIII group could be due to an optimum antigen-antibody match as compared to that of the tHBcAg-cEDIII group. Additionally, the lower antibody response to tHBcAg-cEDIII could also be explained by the lower molar antigen dose received by the HBcAg-cEDIII-immunized mice, as the antigen dose for immunization was normalized to a total protein amount of 5 µg. Therefore, the actual dosage of cEDIII on the VLPs is much lower (in molar terms) than that of the recombinant subunit protein. The tHBcAg-cEDIII is composed mainly of tHBcAg carrier by protein mass, and cEDIII contributes only 103 of the total 508 amino acids. Following the suggestion of Whitacre et al. (2016), the amount of cEDIII presented in relation to the entire chimeric particle size should be considered in future in order to gauge the optimal dose needed for in vivo study.
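The dose argument above can be made concrete with a rough back-of-the-envelope calculation. The sketch below uses only the residue counts and the 5 µg dose quoted in the text; it assumes that every residue has the same average mass and that the free recombinant cEDIII is exactly 103 residues long (any purification tag is ignored), so the numbers are indicative only.

```python
# Residue counts and dose quoted in the text.
CEDIII_RESIDUES = 103
FUSION_RESIDUES = 508
DOSE_UG = 5.0  # total protein per injection

# Assumption 1: all residues have the same average mass, so the mass
# fraction of cEDIII in the fusion equals its residue fraction.
cediii_mass_fraction = CEDIII_RESIDUES / FUSION_RESIDUES
cediii_ug_in_vlp_dose = DOSE_UG * cediii_mass_fraction

# Assumption 2: free recombinant cEDIII is treated as exactly 103 residues,
# so at equal mass the two antigens differ in moles by the length ratio.
molar_fold_difference = FUSION_RESIDUES / CEDIII_RESIDUES

print(f"cEDIII share of the fusion protein: {cediii_mass_fraction:.1%}")
print(f"cEDIII delivered in a 5 µg VLP dose: ~{cediii_ug_in_vlp_dose:.2f} µg")
print(f"Molar excess of free cEDIII at equal mass: ~{molar_fold_difference:.1f}-fold")
```

Under these assumptions cEDIII makes up roughly one fifth of the fusion protein, so a 5 µg VLP dose carries about 1 µg of cEDIII, close to five times fewer cEDIII copies than 5 µg of the free protein.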

In any case, the results presented here highlight potential limitations to chimeric VLP vaccine development strategies. The recovered yield of tHBcAg-cEDIII VLPs is considered relatively low, as it was estimated that less than 10% of the tHBcAg-cEDIII protein formed soluble particles. Extraction under denaturing conditions was not attempted, since denaturing and refolding steps for such a complex structure (an assembly of 90 or 120 copies of a triple protein fusion) are unlikely to result in the proper formation of core-like particles. In this study, the tHBcAg-cEDIII protein was targeted to the cytosol, since this is the strategy that had previously been used successfully with tandem cores (Peyret et al., 2015), whereas the recombinant cEDIII protein alone was targeted to the endoplasmic reticulum (ER). Targeting of tHBcAg-cEDIII protein to other subcellular compartments was not attempted initially due to concerns that distinct pH conditions in different plant organelles could abolish VLP assembly, as shown by van Zyl et al. (2016). Nevertheless, it has recently been shown that HBcAg VLPs are still capable of assembly despite localization to the ER (Yang et al., 2017). In fact, accumulation of tHBcAg-cEDIII VLPs in the plant cytosol may have affected the antigen stability via improper disulphide bond formation in cEDIII. This issue is highlighted because the correct folding of the antibody recognition epitope on DENV EDIII relies upon the disulphide linkage (Suzarte et al., 2014). Given that the redox environment is highly regulated in the ER (Ellgaard, 2004), optimal oxidoreductase activity can be achieved there to form proper disulphide bridges. The relatively low antibody response generated in mice immunized with tHBcAg-cEDIII VLPs as compared to the recombinant cEDIII-immunized group might therefore also be explained by the presence of improperly folded VLPs produced in the plant cytosol.

Hence, as the way forward, future optimizations to modulate protein yield and stability at the post-translational level may include testing different extraction buffers to improve the solubility of the crude extract, as well as targeting the heterologous protein to a subcellular compartment. Researchers carrying out future work would be well-advised to test the pre-immune sera from each individual mouse in order to increase confidence in any post-immunization results. The full benefits of VLP presentation should be explored further, as its three-dimensional structure has a higher potency to activate cell-mediated immunity, including the cytotoxic T cell response, which is crucial for better viral clearance as reported by Yang et al. (2017). In addition, since cEDIII represents a consensus antigen, it would also be of great interest to test the cross-reactivity of polyclonal responses as well as the protection levels of tHBcAg-cEDIII VLPs against the different DENV serotypes.

### CONCLUSION

While there has been a previous initiative that used a yeast system to produce a chimeric VLP-based dengue vaccine candidate (Arora et al., 2013), development of a plant-derived vaccine can offer scalability and safety advantages that revolutionize the accessibility of dengue vaccines (Pang and Loh, 2017). This is particularly important to address the alarming burden of dengue disease, which has yet to meet a promising resolution. The adoption of "Tandem Core" technology was proven to be feasible, although there is still room for improvement. Overall, a successful assembly of VLPs displaying a consensus dengue antigen has been achieved. The immunization data have shown that seroconversion to cEDIII is possible when mice are immunized with tHBcAg-cEDIII VLPs. The current findings have validated the viability of using a VLP system as an antigen-presentation platform; this warrants further investigation into its potential as a next-generation dengue vaccine.


# ETHICS STATEMENT

All animal work in this study was done in compliance with United Kingdom Home Office approved animal protocols under license PPL 70/7376.

#### AUTHOR CONTRIBUTIONS

H-SL, WMR, and GPL conceived and designed the study. ELP, HP, and AR performed the experiments. C-MF performed the statistical analysis. ELP and H-SL wrote the first draft of the manuscript. HP, AR, K-SL, and C-MF wrote sections of the manuscript. All authors contributed to manuscript revision and read and approved the current version.

# FUNDING

This research work was supported by John Innes Centre, United Kingdom; University of Nottingham Malaysia, Malaysia; and iQur Limited, United Kingdom. At John Innes Centre, this work was supported by the United Kingdom Biotechnology and Biological Sciences Research Council (BBSRC) Grant BB/L020955/1, the Institute Strategic Programme Grants, "Understanding and Exploiting Plant and Microbial Secondary Metabolism" (BB/J004596/1) and "Molecules from Nature – Enhanced Research Capacity" (BBS/E/J/000PR9794), and the John Innes Foundation.

# ACKNOWLEDGMENTS

The authors would like to thank the Ministry of Higher Education, Malaysia for supporting ELP in her Ph.D. study. Some of the contents presented in this paper are based on the ELP's Ph.D. thesis (University of Nottingham Malaysia).

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2019.00455/full#supplementary-material






**Conflict of Interest Statement:** GPL declares that he is a named inventor on granted patent WO 29087391 A1 that describes the system used for transient expression in this manuscript. The Tandem Core vaccine technology described in this paper is covered by the patent application PCT/GB01/01607 licensed by iQur Limited. AR and WMR are current employees of iQur Limited.

The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Pang, Peyret, Ramirez, Loh, Lai, Fang, Rosenberg and Lomonossoff. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Rapid and Scalable Plant-Based Production of a Potent Plasmin Inhibitor Peptide

*Mark A. Jackson<sup>1</sup>, Kuok Yap<sup>1</sup>, Aaron G. Poth<sup>1</sup>, Edward K. Gilding<sup>1</sup>, Joakim E. Swedberg<sup>1</sup>, Simon Poon<sup>2</sup>, Haiou Qu<sup>1</sup>, Thomas Durek<sup>1</sup>, Karen Harris<sup>2</sup>, Marilyn A. Anderson<sup>2</sup> and David J. Craik<sup>1</sup>\**

*<sup>1</sup>Institute for Molecular Bioscience, The University of Queensland, Brisbane, QLD, Australia, <sup>2</sup>Department of Biochemistry and Genetics, La Trobe Institute for Molecular Science, La Trobe University, Melbourne, VIC, Australia*

#### *Edited by:*

*Suvi Tuulikki Häkkinen, VTT Technical Research Centre of Finland Ltd, Finland*

#### *Reviewed by:*

*Joshua Lee Fuqua, University of Louisville, United States Somen Nandi, University of California, United States*

*\*Correspondence: David J. Craik d.craik@imb.uq.edu.au*

#### *Specialty section:*

*This article was submitted to Plant Biotechnology, a section of the journal Frontiers in Plant Science*

*Received: 10 December 2018 Accepted: 24 April 2019 Published: 15 May 2019*

#### *Citation:*

*Jackson MA, Yap K, Poth AG, Gilding EK, Swedberg JE, Poon S, Qu H, Durek T, Harris K, Anderson MA and Craik DJ (2019) Rapid and Scalable Plant-Based Production of a Potent Plasmin Inhibitor Peptide. Front. Plant Sci. 10:602. doi: 10.3389/fpls.2019.00602*

The backbone cyclic and disulfide bridged sunflower trypsin inhibitor-1 (SFTI-1) peptide is a proven effective scaffold for a range of peptide therapeutics. For production at laboratory scale, solid phase peptide synthesis techniques are widely used, but these synthetic approaches are costly and environmentally taxing at large scale. Here, we developed a plant-based approach for the recombinant production of SFTI-1-based peptide drugs. We show that transient expression in *Nicotiana benthamiana* allows for rapid peptide production, provided that asparaginyl endopeptidase enzymes with peptide-ligase functionality are co-expressed with the substrate peptide gene. Without co-expression, no target cyclic peptides are detected, reflecting rapid *in planta* degradation of non-cyclized substrate. We test this recombinant production system by expressing a SFTI-1-based therapeutic candidate that displays potent and selective inhibition of human plasmin. By using an innovative multi-unit peptide expression cassette, we show that *in planta* yields reach ~60 μg/g dry weight at 6 days post leaf infiltration. Using nuclear magnetic resonance structural analysis and functional *in vitro* assays, we demonstrate the equivalence of plant and synthetically derived plasmin inhibitor peptide. The methods and insights gained in this study provide opportunities for the large scale, cost effective production of SFTI-1-based therapeutics.

Keywords: peptide, therapeutic, asparaginyl endopeptidase, cyclization, stability, biofactory, *Nicotiana benthamiana*, sunflower trypsin inhibitor

#### INTRODUCTION

Beyond its biological role as a plant defense peptide, the 14 amino acid sunflower trypsin inhibitor-1 (SFTI-1) has attracted significant interest in the drug development field (Craik et al., 2006; Lesner et al., 2011). This interest largely stems from the cyclic backbone of SFTI-1, which together with a bridging disulfide bond imparts exceptional stability and conformational rigidity to the peptide (Korsinczky et al., 2001). Furthermore, SFTI-1 is readily tolerant of residue substitutions, exemplified by a wide range of combinatorial and rational design variants that have been applied to convert SFTI-1 into potent and stable inhibitors of therapeutically relevant proteases, including matriptases (Quimbar et al., 2013; Fittler et al., 2014; Gitlin et al., 2015), kallikreins (Shariff et al., 2014; Chen et al., 2016; de Veer et al., 2016; Jendrny and Beck-Sickinger, 2016), chymotrypsin (Swedberg et al., 2017), and furin (Fittler et al., 2015). SFTI-1 has also proven to be a useful scaffold for presenting and stabilizing small bioactive epitopes that by themselves would be unstable and not effective as pharmaceuticals (Chan et al., 2011; Qiu et al., 2017; Durek et al., 2018). These engineered cyclic SFTI-1 analogues uniformly display significantly enhanced serum stability compared with their linear counterparts, overcoming a major limitation of peptide-based therapeutics (Wang and Craik, 2018).

Peptide drugs are primarily produced *via* solid phase peptide synthesis techniques, which in large scale have considerable economic and environmental costs (Andersson et al., 2000). For some peptides, recombinant production is a feasible alternative, with prokaryotes and lower eukaryotic hosts most commonly used (Demain and Vaishnav, 2009). In the case of SFTI-1, backbone cyclization is required for maximum potency (Colgrave et al., 2010), so recombinant production strategies that incorporate this post-translational modification are required. Recently, an intein-mediated protein splicing approach to SFTI-1 cyclization in *E. coli* was reported, with SFTI-1 yields estimated at 180 μg/L of bacterial culture (Li et al., 2016). Although promising, intein splicing efficiency is highly sensitive to the residues at the extein-intein junction, potentially reducing its broad applicability (Aboye and Camarero, 2012). As an alternative, and considering that SFTI-1 is naturally produced and cyclized in sunflower, a plant-based production system is appealing. However, small peptides have typically proven difficult to produce *in planta*, presumably due to the unintended effects of proteolysis either *in planta* or during extraction phases (Benchabane et al., 2008; Habibi et al., 2017). Strategies to overcome this limitation have included expressing peptides with stabilizing fusion partners (Yasuda et al., 2005; Sainsbury et al., 2013), downregulating interfering plant proteases (Robert et al., 2015), and the development of subcellular targeting approaches (Jackson et al., 2010; Yang et al., 2017). Despite advances using these strategies, the yields from plant-produced peptides have generally been low, typically in the low μg g−1 fresh weight (FW) range (Lico et al., 2012; Viana et al., 2012). 
In contrast, some endogenous cyclic plant peptides are known to accumulate to very high levels [~1.8 mg g−1 dry weight (DW)], most notably exemplified by the class of cyclic peptides termed cyclotides (Craik et al., 1999; Seydel and Dornenburg, 2006). Thus, determining the *in planta* biosynthetic pathways that govern cyclotide synthesis and accumulation in plants will be of great benefit if translatable to the recombinant production of "designer" therapeutic peptides.

SFTI-1 is produced in sunflower seeds, where it is post-translationally processed from the precursor protein PawS1 (preproalbumin with sunflower trypsin inhibitor-1) (Mylne et al., 2011) (**Figure 1A**). Because sunflower transformation is inefficient, most PawS1 processing studies have been done in the model plant Arabidopsis (Mylne et al., 2011), *in situ* using sunflower seed extracts (Bernath-Levin et al., 2015), or *in vitro* with recombinant processing enzymes and synthetic or recombinant substrates (Bernath-Levin et al., 2015; Franke et al., 2017; Haywood et al., 2018). Together these studies have unequivocally demonstrated the involvement of vacuolar cysteine proteases termed asparaginyl endopeptidases (AEPs) for both the cleavage and subsequent cyclization of SFTI-1. Similarly, cyclotides are known to be backbone cyclized by AEPs (Bernath-Levin et al., 2015; Harris et al., 2015; Poon et al., 2017), where a detailed understanding of mechanisms and structural requirements has emerged (Jackson et al., 2018; James et al., 2018). These ligase competent AEPs not only represent useful biotechnological tools for *in vitro* peptide and protein engineering applications (Harris et al., 2015; Nguyen et al., 2015; Hemu et al., 2016) but also open up opportunities for their deployment in plant biofactory applications for the production of cyclic peptides (Poon et al., 2017).

FIGURE 1 | Transient expression analysis of SFTI-1 production in *N. benthamiana*. (A) Native SFTI-1 is produced in sunflower seed *via* expression of the *PawS1* gene. SFTI-1 processing occurs *via* the concerted action of asparaginyl endopeptidases (AEPs), which have strict preference for asparagine or aspartic residues (shown as filled triangles). The 14-amino acid SFTI-1 peptide sequence and immediately flanking residues are displayed with the SFTI-1 sequence highlighted in grey. (B) For transient expression in *N. benthamiana* leaves, the pEAQ-DEST1 vector (Sainsbury et al., 2009) was used, which provides high level transgene expression due to the presence of the 35S promoter and cowpea mosaic virus (CPMV) 5′ and 3′ UTRs. To produce SFTI-1, the *Oak1* gene was engineered to include the SFTI-1 peptide encoding sequence, replacing that of kB1. Cleavage after an amino-terminal repeat (NTR) by an as yet unidentified protease is thought to occur first to liberate the N-terminal glycine required for AEP mediated backbone cyclization to the C-terminal aspartic residue. (C) MALDI-TOF MS analysis of peptides produced in *N. benthamiana* leaves upon co-expression of pEAQ-OaAEP1b with pEAQ-Oak1-SFTI-1. The mass for cyclic SFTI-1 (*m/z* 1513.7) was readily detected. An unrelated and endogenous peptide (*m/z* 1764.7) was also readily detected. (D) SFTI-1 peptides were quantified using the method of standard addition where a standard curve was built into each crude plant extract.
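The Figure 1D caption mentions quantification by the method of standard addition. The exact calculation used by the authors is not given here, so the following Python sketch shows only the generic textbook form of the technique: known amounts of standard are spiked into the extract, a line is fitted to signal versus spiked amount, and the endogenous amount is read from the magnitude of the x-intercept. The function names and all numbers are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def standard_addition(spiked_amounts, signals):
    """Estimate the analyte already present in the unspiked extract.

    Extrapolating the fitted line to zero signal gives an x-intercept of
    -intercept / slope; its magnitude is the endogenous amount, in the
    same units as the spikes.
    """
    slope, intercept = fit_line(spiked_amounts, signals)
    return intercept / slope


# Hypothetical spikes (ng of synthetic peptide added to equal aliquots of one
# extract) and the corresponding MS peak areas; NOT data from this study.
spikes = [0.0, 10.0, 20.0, 40.0]
areas = [150.0, 250.0, 350.0, 550.0]

endogenous = standard_addition(spikes, areas)
print(endogenous)
```

Because the calibration line is built inside each extract, matrix effects that suppress or enhance the signal act on the standard and the endogenous peptide alike, which is the usual reason for choosing this method over an external standard curve.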

In this study, we evaluated a series of gene expression parameters for optimizing *in planta* SFTI-1 production using *N. benthamiana* as a biofactory host. We demonstrate that resultant yields are influenced by the choice of AEP ligase, the AEP recognition site used, and by using multi-unit peptide expression cassettes. We demonstrate the scalability and usefulness of this transient plant-based production system by producing and purifying a recently developed potent plasmin inhibitor therapeutic based on SFTI-1 (Swedberg et al., 2019).

# MATERIALS AND METHODS

#### Vector Construction

DNA encoding Oak1-SFTI-1\_GLDN, Oak1-[D14N]SFTI-1\_GLDN, Oak1\_GLDN, Oak1-[N29D]\_GLDN, Oak1-[T4Y,I7R]SFTI-1, and Oak1-[T4Y,I7R]SFTI-1\_3R was synthesized by Integrated DNA Technologies (Singapore) as gene block fragments in preparation for in-house cloning (**Supplementary Figure S7**). All peptide precursor genes and AEP genes were recombined into the plant expression vector pEAQ-DEST1 (Sainsbury et al., 2009) using Gateway™ LR Clonase™ technology (Invitrogen, Carlsbad CA). Sequence-verified vectors were then transferred to *Agrobacterium tumefaciens* LBA4404 by electroporation.

#### Transient Expression in *Nicotiana benthamiana*

*Nicotiana benthamiana* plants were cultivated in Jiffy peat pellets in a plant growth room at 28°C under 160 μmol of LED illumination (AP67 spectra, Valoya Oy, Helsinki, Finland). Agrobacterium cultures harboring pEAQ-DEST1 expression cassettes were grown in Luria-Bertani media to stationary phase before centrifugation and resuspension in infiltration buffer [10 mM MES (2-[*N*-morpholino]ethanesulfonic acid), pH 5.6, 10 mM MgCl2, 100 μM acetosyringone]. Separate cultures (OD600 of 1.0) harboring the AEP and peptide expression vectors were mixed at a ratio of 1:1 before vacuum infiltration of *N. benthamiana* plants at 5–6 weeks of age. For relative quantification experiments, a third Agrobacterium culture, containing an expression vector encoding a truncated kB6 peptide, was added to the mix. This design, which allowed only linear kB6 to be produced irrespective of AEP expression, served as an internal control for normalizing SFTI-1 and kB1 levels.

#### Peptide Quantification

At 6 days post-infiltration, plant tissue was harvested for peptide extraction and analysis. To enable absolute quantification of SFTI-1 and [T4Y,I7R]SFTI-1, leaf tissue was lyophilized and ground using a Geno/Grinder® (SPEX Sample Prep) with homogenous subsamples used for MS-based quantification. Routinely, 5 mg of dry tissue was reconstituted in buffer [50% (v/v) acetonitrile, 0.1% (v/v) formic acid] with 0.05 μM codeine included as internal standard. Triplicate samples were spiked with a standard curve of concentrations of analyte before centrifugation. The analytes were quantified using targeted multiple reaction monitoring (MRM) analyses conducted on a SCIEX QTRAP 6500+ mass spectrometer interfaced with a SCIEX UPLC system. Studies were conducted using a Phenomenex Kinetex C18 UPLC column (150 mm × 2.0 mm, 1.7 μm particle size), maintained at 60°C with a linear acetonitrile gradient delivered at a flow rate of 0.4 ml min−1. Ion spray voltage was set at 5000 V, source temperature at 400°C, and MRM scans were conducted with unit resolution settings for both Q1 and Q3. MRM transition details for each analyte are provided in **Supplementary Table S1**. SCIEX MultiQuant (v 3.0.2) software was used to plot analyte signal intensities against concentrations added to the sample matrix and back-extrapolated to find negative X-axis intercepts, which were finally adjusted for dilution to calculate analyte concentrations in the sample.
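The standard addition calculation described above can be illustrated with a minimal Python sketch. It is not the MultiQuant workflow itself: it simply fits a straight line to signal versus added analyte, takes the magnitude of the negative x-axis intercept as the endogenous concentration, and scales by extract volume and dry mass. The spike levels, responses, and the 1 ml extraction volume are hypothetical values chosen for illustration; only the 5 mg dry tissue mass comes from the text.

```python
from statistics import mean

def standard_addition(added_conc, signal):
    """Least-squares fit of signal = slope * added + intercept.

    Returns (slope, intercept, endogenous concentration), where the
    endogenous concentration is the magnitude of the x-axis intercept.
    """
    x_bar, y_bar = mean(added_conc), mean(signal)
    sxx = sum((x - x_bar) ** 2 for x in added_conc)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(added_conc, signal))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # The fitted line crosses the x-axis at -intercept / slope.
    return slope, intercept, intercept / slope

def yield_ug_per_g(conc_ug_per_ml, extract_volume_ml, dry_mass_mg, dilution=1.0):
    """Convert an extract concentration into yield per gram dry weight."""
    return conc_ug_per_ml * dilution * extract_volume_ml / (dry_mass_mg / 1000.0)

# Hypothetical spike levels (ug/ml) and responses (arbitrary units).
added = [0.0, 0.05, 0.10, 0.20]
response = [250.0, 500.0, 750.0, 1250.0]

slope, intercept, c0 = standard_addition(added, response)
# Assumed 1 ml extraction volume; 5 mg dry tissue as stated in the text.
print(c0, yield_ug_per_g(c0, extract_volume_ml=1.0, dry_mass_mg=5.0))
```

Because the calibration line is built inside each extract, any suppression or enhancement of the signal by the sample matrix affects the spiked and endogenous analyte equally, which is the rationale for this design.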

For relative quantification, leaf disks were punched from infiltrated leaves and placed in microfuge tubes with ball bearings, before grinding to powder in liquid nitrogen using a Geno/Grinder® (SPEX Sample Prep). Peptides were extracted in 200 μl of aqueous [50% (v/v) acetonitrile, 1% (v/v) formic acid] with gentle mixing overnight. After centrifugation, the supernatant was diluted 1:5 with 1% formic acid before being desalted and concentrated using C18 ZipTips (Millipore). Samples were then mixed 1:1 with α-cyano-4-hydroxycinnamic acid [5 mg ml−1 in 50% acetonitrile, 0.1% TFA, 5 mM (NH4)H2PO4] before being spotted and dried onto a MALDI sample plate for matrix-assisted laser desorption/ionization (MALDI)-time of flight (TOF) MS using an Applied Biosystems 4700 TOF-TOF Proteomics Analyzer. For relative quantification, the sum of the isotope cluster area corresponding to cyclic SFTI-1 or kB1 was normalized to the sum of the isotope cluster area of linear kB6 peptides.
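The normalization used for relative quantification amounts to a ratio of summed isotope cluster areas. A minimal sketch is given below; the function name and the peak areas are hypothetical and serve only to make the arithmetic explicit.

```python
def relative_abundance(target_cluster_areas, control_cluster_areas):
    """Summed target isotope-cluster area divided by the summed area
    of the linear kB6 internal control."""
    control_total = sum(control_cluster_areas)
    if control_total <= 0:
        raise ValueError("internal control signal is missing")
    return sum(target_cluster_areas) / control_total

# Hypothetical isotope-cluster peak areas (arbitrary units).
cyclic_sfti1 = [100.0, 80.0, 20.0]
linear_kb6 = [50.0, 30.0, 20.0]
print(relative_abundance(cyclic_sfti1, linear_kb6))
```

Dividing by the control signal from the same spot is intended to cancel leaf-to-leaf differences in infiltration efficiency, at the cost of assuming that the control peptide accumulates independently of the AEP being tested.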

#### Peptide Synthesis and Purification

All peptides were synthesized in house using established Fmoc solid-phase peptide synthesis methods (Cheneval et al., 2014). Peptides were isolated by RP-HPLC and characterized by high resolution MS and NMR spectroscopy.

#### *In vitro* Peptide Cyclization Assays

Recombinant OaAEP1b was prepared using the methods detailed in Harris et al. (2015), with activated enzyme concentration estimated by BCA assay according to the manufacturer's instructions. Linear target peptides (6.67 μM) were incubated with recombinant OaAEP1b (7.5 μg ml−1 final concentration) in activity buffer (50 mM sodium acetate, 50 mM NaCl, 1 mM EDTA, pH 5) for up to 40 h at room temperature. The reaction mixture (10 μl) was desalted using C18 ZipTips (Millipore) and mixed 1:1 with α-cyano-4-hydroxycinnamic acid [5 mg ml−1 in 50% acetonitrile, 0.1% TFA, 5 mM (NH4)H2PO4] before MALDI-TOF MS using an Applied Biosystems 4700 TOF-TOF Proteomics Analyzer.

#### Peptide Extraction and Purification

Harvested plant tissue at 6 days post-infiltration was lyophilized then homogenized using a Geno/Grinder® (SPEX Sample Prep) prior to solvent extraction with 50% (v/v) acetonitrile, 0.1% (v/v) formic acid. The supernatant collected was lyophilized, then redissolved with 10% (v/v) acetonitrile, 0.1% (v/v) formic acid before Solid-Phase Extraction (SPE) using a Phenomenex Strata C18-E SPE cartridge with 10 g resin capacity. The eluted fraction of 5–20% (v/v) acetonitrile, 0.1% (v/v) formic acid was then collected, lyophilized, and then reconstituted in 5% (v/v) acetonitrile, 0.1% (v/v) trifluoroacetic acid. The eluate was then passed through a 0.45 μm filter before separation to homogeneity by HPLC on a semipreparative Phenomenex Jupiter C18 RP-HPLC column (250 mm × 10 mm, 5 μm particle size) followed by a preparative analytical Phenomenex Jupiter C18 RP-HPLC column (250 mm × 4.6 mm, 5 μm particle size). Fractions yielding homogeneous plant-derived [T4Y,I7R]SFTI-1 were identified using MALDI-TOF as described above and lyophilized.

#### Structural Characterization of Plant Derived [T4Y,I7R]SFTI-1

High resolution MS comparison of synthetic and plant-produced peptides was conducted *via* UPLC-MS analysis on a SCIEX X500R mass spectrometer interfaced with a SCIEX UPLC. UPLC column details were identical to those used in the QTRAP analyses detailed earlier. A linear acetonitrile gradient was delivered over 15 min (flow rate 0.4 ml min−1) and monitored *via* positive ion TOF-MS acquisition (250 ms per scan, *m/z* 100–1,000). Retention times were determined from post-run XICs with a mass extraction width of 0.05 Da centered on monoisotopic peaks.
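As an illustration of how an extraction window for an XIC is placed, the sketch below derives the expected *m/z* of a peptide at a given charge state and the bounds of a 0.05 Da window centered on it. The mass of [T4Y,I7R]SFTI-1 is not stated in this section, so the cyclic SFTI-1 value from Figure 1C (*m/z* 1513.7, assumed here to be the singly protonated monoisotopic ion) is used as a stand-in; the function names are hypothetical.

```python
PROTON_MASS = 1.007276  # Da

def neutral_mass_from_mh(mh_mz):
    """Neutral monoisotopic mass from a singly protonated [M+H]+ m/z."""
    return mh_mz - PROTON_MASS

def mz(neutral_mass, charge):
    """m/z of the [M + zH]z+ ion."""
    return (neutral_mass + charge * PROTON_MASS) / charge

def xic_window(center_mz, width_da=0.05):
    """Lower and upper bounds of an extraction window centered on center_mz."""
    half = width_da / 2.0
    return center_mz - half, center_mz + half

# Stand-in value: cyclic SFTI-1 observed at m/z 1513.7 (Figure 1C).
neutral = neutral_mass_from_mh(1513.7)
doubly_charged = mz(neutral, 2)
print(doubly_charged, xic_window(doubly_charged))
```

With this stand-in mass, the singly charged ion lies above the *m/z* 100–1,000 acquisition range, whereas the doubly charged ion falls within it; which charge state was actually extracted is not specified in the text.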

Prior to tandem MS, plant-derived [T4Y,I7R]SFTI-1 was redissolved in 100 mM NH4HCO3 (pH 8) for reduction, alkylation, and enzymatic digestion with bovine trypsin (Sigma T1426). Tandem MS of the linear, reduced, and carboxyamidomethylated [T4Y,I7R]SFTI-1 fragment with sequence SRPPICFPDGR was collected on a SCIEX 5600 TripleTOF instrument interfaced with a Shimadzu UPLC. The sample was separated on an Agilent Zorbax 300SB-C18 column (100 mm × 2.1 mm, 1.8 μm particle size) and eluted with a linear acetonitrile gradient, and eluent was monitored using an information-dependent acquisition experiment with a TOF-MS survey scan (mass range *m/z* 80–1,000 Da) triggering up to 20 MS/MS on precursor ions (mass range *m/z* 80–1,100 Da) with 50 ms scan times.

#### NMR Spectroscopy

The heterologously plant-produced and purified peptide [T4Y,I7R] SFTI-1 was dissolved in 90% H2O/10% D2O at a concentration of 80 μg ml−1. Spectra were recorded on a Bruker Avance III 600 MHz spectrometer equipped with a cryoprobe at 298 K. Phase-sensitive mode using time-proportional phase incrementation for quadrature detection in the t1 dimension was used for all two-dimensional spectra. Excitation sculpting with gradients was used to achieve water suppression. NMR experiments included TOCSY using a MLEV-17 spin lock sequence with an 80-ms mixing time, and NOESY with a 200-ms mixing time. Spectra were recorded with 4,096 data points in the F2 dimension and 512 increments in the F1 dimension. The t1 dimension was zero-filled to 1,024 real data points, and the F1 and F2 dimensions were multiplied by a sine-squared function before Fourier transformation. Chemical shifts were referenced to DSS. All spectra were processed using TopSpin (Bruker) and manually assigned with CCPNMR using the sequential assignment protocol (Wüthrich, 1986; Vranken et al., 2005).

#### Plasmin Inhibitory Assays

A serial dilution of plant-derived [T4Y,I7R]SFTI-1 was incubated with 1 nM native human plasmin (Sigma-Aldrich) for 30 min in assay buffer (0.1 M Tris-HCl, pH 8.0, 0.1 M NaCl, and 0.005% Triton X-100) in low binding 96-well plates (Corning). After addition of 100 μM of the colorimetric peptide substrate Acetyl-Arg-Met(sulphone)-Tyr-Arg-*p*NA (*K*M = 23.5 μM) to a final volume of 200 μl, the rate of substrate cleavage was monitored by the release of the *p*NA moiety at *λ* = 405 nm over 7 min. The inhibition constant (*K*i) was determined from three independent assays by the Morrison equation and non-linear regression using GraphPad Prism 6.
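For readers unfamiliar with the Morrison equation, a minimal Python sketch of the model is given below. It evaluates the fractional velocity for a tight-binding inhibitor and does not reproduce the non-linear regression performed in GraphPad Prism. Two assumptions are made that the text does not state explicitly: the inhibitor is treated as competitive, so that the apparent constant is *K*i(1 + [S]/*K*M), and 100 μM is taken as the final substrate concentration. The *K*i used in the example is the value reported in the Results.

```python
import math

def apparent_ki(ki_nM, substrate_uM, km_uM):
    """Competitive-inhibition correction (assumed): Ki_app = Ki * (1 + [S]/Km)."""
    return ki_nM * (1.0 + substrate_uM / km_uM)

def morrison_fractional_velocity(inhibitor_nM, enzyme_nM, ki_app_nM):
    """v_i / v_0 for a tight-binding inhibitor (Morrison equation)."""
    total = enzyme_nM + inhibitor_nM + ki_app_nM
    # Concentration of enzyme-inhibitor complex from the quadratic solution.
    bound = (total - math.sqrt(total ** 2 - 4.0 * enzyme_nM * inhibitor_nM)) / 2.0
    return 1.0 - bound / enzyme_nM

# Assay conditions from the text: 1 nM plasmin, 100 uM substrate, Km = 23.5 uM.
ki_app = apparent_ki(ki_nM=0.025, substrate_uM=100.0, km_uM=23.5)
for inhibitor in (0.0, 0.5, 1.0, 5.0):
    print(inhibitor, morrison_fractional_velocity(inhibitor, 1.0, ki_app))
```

The Morrison form is used rather than a simple IC50 fit because the enzyme concentration (1 nM) is far above the inhibition constant, so depletion of free inhibitor by binding cannot be neglected.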

#### Statistical Analysis

One-way ANOVA followed by Tukey's multiple comparisons test was performed using GraphPad Prism version 7.0c for Mac OS X, GraphPad Software, La Jolla California USA, www.graphpad.com.

#### Accession Numbers

AEP gene sequences that have assigned GenBank accession numbers include OaAEP1b (KR259377), CtAEP1 (KF918345), PxAEP3b (MG720076), and HeAEP3 (MG720074).

# RESULTS

#### *Nicotiana benthamiana* Leaf-Based Transient Expression of Sunflower Trypsin Inhibitor-1

To express and cyclize SFTI-1 *in planta*, we used the pEAQ vector (**Figure 1B**) (Sainsbury et al., 2009) for recombinant production in *N. benthamiana*. Initially, to produce native SFTI-1, we tested peptide accumulation in leaves upon expression of a modified *Oak1* gene [described in Poon et al. (2017)] in which the peptide sequence for SFTI-1 replaces that of the cyclotide kalata B1 (kB1) (construct pEAQ-Oak1-SFTI-1) (**Figures 1A,B**). In addition, we chose to replace the C-terminal GLPSLAA residues normally present in Oak1 with the residues GLDN that naturally flank SFTI-1 within the PawS1 precursor protein. As previously shown (Poon et al., 2017), cyclic SFTI-1 (*m/z* 1513.7) could only be detected in leaf extracts upon co-expression of the SFTI-1 precursor with the ligase-efficient AEP from *O. affinis* (OaAEP1b) (**Figure 1C**). To quantify the yield of SFTI-1 produced in *N. benthamiana* leaves, we used a quantitative mass spectrometry (MS)-based approach (Bronsema et al., 2012) (**Figure 1D**). This approach, which requires a standard curve to be included in each replicate extraction, provides a more accurate measurement by eliminating any effect that the sample matrix might have on SFTI-1 signal intensity. Using this method, the yield of cyclic SFTI-1 extracted was determined to be 12.8 ± 3.0 μg g−1 DW (s.d., *n* = 3), which is substantially lower than the reported 199 μg g−1 DW yield obtained for the cyclotide kB1 using a similar expression strategy (Poon et al., 2017).
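Because SFTI-1 and kB1 differ roughly twofold in size, a mass-based comparison overstates the difference in the number of molecules produced. The following sketch converts the two reported yields to a molar basis. The SFTI-1 mass is derived from the *m/z* 1513.7 signal above (assumed to be [M+H]+); the kB1 mass of about 2,890 Da is an assumed approximate value that is not given in this section.

```python
def nmol_per_g(yield_ug_per_g, molar_mass_g_per_mol):
    """Convert a mass yield (ug per g DW) to a molar yield (nmol per g DW)."""
    return yield_ug_per_g / molar_mass_g_per_mol * 1000.0

SFTI1_MASS = 1513.7 - 1.007276   # Da, from the [M+H]+ signal in Figure 1C
KB1_MASS = 2890.0                # Da, assumed approximate value

sfti1_molar = nmol_per_g(12.8, SFTI1_MASS)
kb1_molar = nmol_per_g(199.0, KB1_MASS)
print(sfti1_molar, kb1_molar, kb1_molar / sfti1_molar)
```

Under these assumptions the yields are approximately 8.5 and 69 nmol g−1 DW, so the gap narrows from about 15-fold by mass to about 8-fold by mole, but SFTI-1 remains the lower-yielding peptide.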

To determine if this lower *in planta* SFTI-1 yield correlates with a lower efficiency of OaAEP1b on SFTI-1 substrates, we set out to compare processing efficiencies between SFTI-1 and kB1 substrates *in vitro* (**Figure 2**). AEPs are known to have a strict preference for either an Asn or Asp at the P1 position, thus we additionally wished to determine the effect that reciprocal Asn/Asp residue changes would have on processing efficiencies. For this, we directly compared recombinant OaAEP1b activity on the peptide substrates kB1\_GLDN (**Figure 2A**), [N29D]kB1\_GLDN (**Figure 2B**), SFTI-1\_GLDN (**Figure 2C**), and [D14N]SFTI-1\_GLDN (**Figure 2D**). Peptide cyclization assays were performed at pH 5.0 to simulate the low pH of leaf cell vacuoles, where AEP-mediated cyclization is predicted to occur (Jackson et al., 2007; Conlan et al., 2011). For all substrates, peptide cyclization was favored over hydrolysis by recombinant OaAEP1b, with resulting MS signals for cyclic peptide dominating over linear peptide. For both kB1 substrates, the precursor was essentially quantitatively converted to cyclic kB1/[N29D]kB1 within 30 min (**Figures 2A,B**), while for the SFTI-1\_GLDN and [D14N]SFTI-1\_GLDN precursor peptides, unprocessed peptides were still detectable after 18 h (**Figures 2C–E**). Further analysis after 40 h of incubation revealed that processing was essentially complete, with resulting cyclic to linear peptide MS signal ratios calculated at 96.86 ± 0.66% (s.d., *n* = 6) and 92.19 ± 1.89% (s.d., *n* = 6) for SFTI-1\_GLDN and [D14N]SFTI-1\_GLDN, respectively (**Figure 2F**). These findings indicate that SFTI-1 precursor peptides, irrespective of whether they contain Asp or Asn at the AEP processing site, are amenable to enzymatic cyclization but are relatively poor substrates for OaAEP1b, with significantly slower processing than kB1 cyclotide precursors. Thus, we reasoned that the lower *in planta* yields observed for SFTI-1 relative to kB1 may be caused by OaAEP1b being outcompeted for SFTI-1 substrates by endogenous AEPs, which lack the ability to stabilize SFTI-1 by way of backbone cyclization. Optimization or discovery of AEP ligases more conducive to SFTI-1 substrates, or downregulation of interfering endogenous AEP machinery, are thus two approaches to increase the *in planta* yield of SFTI-1.

#### Assaying Diverse Asparaginyl Endopeptidases for *in planta* Cyclization of Sunflower Trypsin Inhibitor-1

In addition to OaAEP1b from *O. affinis* (Harris et al., 2015), several other AEP ligases have recently been characterized, including HeAEP3 (*Hybanthus enneaspermus*) (Jackson et al., 2018), PxAEP3b (petunia) (Jackson et al., 2018), and CtAEP1 (butelase-1; *Clitoria ternatea*) (Nguyen et al., 2015). To determine if any of these newly discovered AEP ligases have improved *in planta* processing ability for SFTI-1 over OaAEP1b, we set up an *in planta* assay wherein the precursor gene Oak1-SFTI-1, the AEP gene in question, and a C-terminally truncated kB6 peptide precursor gene were co-expressed (**Figures 3A,B**). Expression of the latter construct resulted in the production of a kB6 peptide precursor without the C-terminal residues required for AEP-mediated cyclization. Thus, the accumulation level of linear kB6 could serve as an internal control to normalize for differences in infiltration efficiencies between *N. benthamiana* leaves. Importantly, this approach produced very similar results to those obtained using normalization with a spiked peptide on a per dry weight basis (**Supplementary Figure S1**). Of the AEPs tested, OaAEP1b from *O. affinis* proved to be the best-performing peptide ligase for SFTI-1, with a ~3-fold increase in relative abundance when compared to the next best-performing ligases, HeAEP3 and PxAEP3b, which yielded similar SFTI-1 levels (**Figures 3A,B**). CtAEP1 (butelase-1) produced only minimal cyclic SFTI-1, in agreement with the *in vitro* characterization of this enzyme as inefficient for cyclization at Asp residues (Nguyen et al., 2015).

Co-expression of AEPs with *Oak1* (which harbors an asparagine at the cyclization site of kB1) revealed that CtAEP1 produced the highest relative level of kB1 when compared to OaAEP1b, HeAEP3, and PxAEP3b (**Supplementary Figures S2A,C**). With this in mind, we wished to determine if cyclization levels of SFTI-1 could be improved simply by combining CtAEP1 expression with a modified SFTI-1 in which the cyclization residue was changed from the native aspartic acid to an asparagine (construct pEAQ-Oak1-[D14N]SFTI-1). Somewhat surprisingly, however, co-expression of this modified peptide precursor with any of the four AEPs tested resulted in no detectable cyclic [D14N]SFTI-1. This result contrasts with our *in vitro* assessment of the [D14N]SFTI-1\_GLDN substrate, where recombinant OaAEP1b predominantly processed the substrate to cyclic [D14N]SFTI-1 (**Figure 2D**). These results suggest that *in planta*, [D14N]SFTI-1\_GLDN is cyclizable but is prone to rapid degradation. Interestingly, this does not seem to be the case for kB1, where cyclic peptides containing either asparagine or aspartic acid residues at the kB1 cyclization point accumulate to high levels in *N. benthamiana* leaves (**Supplementary Figure S2**).

FIGURE 3 | Comparison of AEP ligases for *in planta* SFTI-1 peptide cyclization. (A) MALDI-TOF MS analysis of representative (*n* = 3) peptide extracts from *N. benthamiana* leaves which were co-infiltrated with pEAQ-Oak1-SFTI-1\_GLDN and pEAQ-Oak6\_trun with or without pEAQ-driven AEP ligase genes. Without AEP transgene expression, no cyclic or full-length linear SFTI-1 related peptides were detectable. Smaller masses, however, were observed with low signal strengths, and likely represent truncated SFTI-1 peptides (e.g., *m/z* 1172.6, consistent with linear oxidized GRCTKSIPPIC). By co-expressing AEP ligase genes from *O. affinis* (OaAEP1b), *H. enneaspermus* (HeAEP3) and Petunia "Mitchell" (PxAEP3b), cyclic SFTI-1 was readily detected, in contrast to expression of CtAEP1, which failed to produce any cyclic SFTI-1. For all infiltrations, the expression of pEAQ-Oak6\_trun served as an internal control where MS signal intensities for linear kB6 were used to normalize SFTI-1 MS signals for relative quantification. (B) Relative MS signal intensities for cyclic SFTI-1 among co-expressed ligase-capable AEPs (*n* = 3). Treatments carrying unique Greek lettering are significantly different (*p* < 0.05) as determined by one-way ANOVA followed by Tukey's multiple comparisons test. Error bars are s.e.m.

#### Plant Produced [T4Y,I7R] Sunflower Trypsin Inhibitor-1 and Synthetically Produced Peptide are Equivalent

The SFTI-1-based plasmin inhibitor [T4Y,I7R]SFTI-1 is the most potent inhibitor of plasmin developed to date (Swedberg et al., 2019). In addition to its high potency (*K*i = 0.041 nM), the inhibitor displays a million-fold selectivity over other serine proteases found in blood and is a promising lead compound for certain antifibrinolytic treatments. As this inhibitor carries only two residue changes to SFTI-1 and retains an aspartic residue for AEP-mediated cyclization, we hypothesized that it is a good candidate for plant-based production. Similar to wild-type SFTI-1 peptide, [T4Y,I7R]SFTI-1 production required the co-expression of the *O. affinis* OaAEP1b ligase (**Figure 4A**) where yields of 12.3 ± 3.3 μg g−1 DW (s.d., *n* = 5) cyclic [T4Y,I7R]SFTI-1 were obtained (**Figure 4B, Supplementary Figure S3**). As an approach to further improve these yields, we then reengineered the Oak1 precursor to contain three tandem repeats of the [T4Y,I7R]SFTI-1 peptide (**Figure 4A**). Tandem repeats of kalata type peptides are commonly observed in cyclotide precursor genes (Craik and Malik, 2013) and, at least in part, may be responsible for the observed high yields. By making this change, we improved the *in planta* yield of

as three tandem repeats, adjacent to the signal peptide (SP) and amino terminal propeptide (NTPP). (B) *In planta* yields (μg/g DW) were calculated using the method of standard addition (Supplementary Figure S3).

To validate the functional and structural equivalence of synthetic peptide vs. the *in planta* produced [T4Y,I7R]SFTI-1, we purified [T4Y,I7R]SFTI-1 peptide from lyophilized leaf tissue. Peptides were extracted, fractionated, and purified using C18 SPE with several rounds of reverse-phase HPLC. Structural equivalence to synthetically produced [T4Y,I7R]SFTI-1 was demonstrated *via* LC-MS coelution (**Supplementary Figure S4**), MS-MS fragmentation patterns (**Supplementary Figure S5**), and NMR analysis (**Figure 5A, Supplementary Figure S6**). Functional equivalence of plant-produced [T4Y,I7R]SFTI-1 was demonstrated by determining the inhibitory constant of the purified peptide. The calculated *K*i was 0.025 ± 0.004 nM (**Figure 5B**), which is comparable to the *K*i of 0.041 ± 0.005 nM previously calculated for synthetic peptide (Swedberg et al., 2019).

# DISCUSSION

Peptides as therapeutics are posited to bridge the gap between traditional small molecule drugs and larger biologics, offering the potential of higher specificity, reduced off-target effects, and potentially lower production costs (Craik et al., 2013). However, one drawback is their poor stability, which directly affects efficacy due to shorter *in vivo* half-lives. To counter this limitation, much emphasis has been placed on developing strategies to stabilize peptides, of which backbone cyclization has shown great promise (Poth et al., 2013; Thapa et al., 2014). A recombinant production system that provides for posttranslational backbone cyclization of peptides is thus highly desired. Here, we demonstrate a rapid, plant-based approach to produce and cyclize SFTI-1 peptide analogues in an environmentally friendly manner with capacity for scale-up.

In sunflower seed, SFTI-1 is processed from the PawS1 precursor, which additionally encodes a seed storage albumin that is exclusively found in seed (**Figure 1A**) (Mylne et al., 2011). For expression in plant leaves, we chose to use the strategy described in Poon et al. (2017), where the *Oak1* gene was reengineered to include SFTI-1, replacing the kB1 domain (**Figure 1B**). This ensured that the vacuole targeting elements within Oak1 (Conlan et al., 2011) that work efficiently in *N. benthamiana* leaf cells would be sufficient to direct the engineered SFTI-1 precursor to the vacuole, where functional AEP enzymes are believed to reside. An additional benefit of this approach was simplified biosynthesis, as it did not require the concerted action of multiple AEP isoforms, which are required for SFTI-1 maturation from the PawS1 precursor protein (Mylne et al., 2011). The success shown here for *N. benthamiana* leaf-based SFTI-1 production provides hope that this "plug n play" type approach could be extended to other bioactive peptides derived from seed, such as the trypsin inhibitor class of cyclic peptides derived from *Momordica cochinchinensis* seed (Hernandez et al., 2000).

By using transient gene expression technology, we were able to produce 12.8 ± 3.0 μg g−1 DW (s.d., *n* = 3) SFTI-1 after 6 days of incubation (**Figure 1D**). Although this is an improvement on the reported natural SFTI-1 levels in mature sunflower seed (0.5 μg g−1 seed) (Bernath-Levin et al., 2015), it is lower than that previously reported for the production of the cyclotide kB1 using similar transient gene expression conditions (Poon et al., 2017). By comparing *in vitro* cyclization efficiencies, we demonstrated that this is due to the slower processing of SFTI-1 peptide precursors by OaAEP1b compared to kB1 precursor substrates. One strategy to increase efficiency would be to use the native AEP ligase from sunflower, which may be more efficient for SFTI-1 precursor processing if expressed heterologously in *N. benthamiana*. However, so far no efficient ligase-type sunflower AEP has been reported (Bernath-Levin et al., 2015; Haywood et al., 2018). Attempts to improve the *in planta* yield of SFTI-1 by co-expressing other known AEP ligases were unsuccessful, with OaAEP1b remaining the superior ligase for SFTI-1. In contrast, for kB1 the superior ligase for *in planta* activity is CtAEP1, with OaAEP1b being on par with HeAEP3 (**Supplementary Figure S2**). These results indicate that substrate preference plays a key role in the *in planta* performance of AEP ligases; however, differences in transcript stability, translational efficiency, enzyme maturation, and stability are also likely.

Plant AEPs are known to have a strict preference for processing at either asparagine or aspartic residues and thus it was surprising that expression of OaAEP1b *in planta* could only produce cyclic SFTI-1 and not [D14N]SFTI-1. We confirmed through *in vitro* cyclization experiments that this was not due to an inefficiency of the enzyme, but more likely due to instability of the precursor or cyclic [D14N]SFTI-1 *in planta*. As only the AEP processing residue was changed, we reasoned that the instability observed is likely governed by the pool of endogenous AEPs present in *N. benthamiana* leaf cells, which may either outcompete the transgene-derived AEP ligase or degrade any correctly cyclized [D14N]SFTI-1. Interestingly, this was not observed with kB1 cyclized with either an asparagine or aspartic residue at the cyclization site (**Supplementary Figure S2**) and suggests that the observed instability of [D14N]SFTI-1 may be specific to SFTI-1 peptides. This finding is particularly relevant in the case of SFTI-1 peptide analogues that require the D14N residue substitution for potency (Swedberg et al., 2011), in which case, developing strategies to downregulate or knock out endogenous interfering AEPs with emerging gene editing technologies (Puchta, 2017) would be beneficial.

SFTI-1 peptide analogues have been engineered for diverse therapeutic applications ranging from anti-cancer (Swedberg et al., 2009, 2011), anti-obesity (Durek et al., 2018), and pro- and anti-angiogenesis (Chan et al., 2011; Qiu et al., 2017) to the treatment of a range of skin conditions (Chen et al., 2016; Zhu et al., 2017). Here, we have shown that, like SFTI-1, the engineered plasmin inhibitor [T4Y,I7R]SFTI-1 (Swedberg et al., 2019) is amenable to plant-based production, with the resultant purified peptide displaying both structural and functional equivalence to synthetically produced material. Through expression of a multi-peptide domain gene construct, the yield of the 14-amino acid cyclic [T4Y,I7R]SFTI-1 reached ~60 μg/g DW, roughly equivalent to that obtained previously for kB1 (29 aa) on a molar basis. Importantly, this approach provides for economies of scale, with lower inputs and infrastructure costs than chemical peptide synthesis. Although currently still an emerging industry, commercial facilities for plant-based recombinant production have begun to be established, primarily for vaccine production. One such facility, operated by iBio Biotherapeutics, has a reported capacity to process ~3,500 kg of plant material per week (Holtz et al., 2015). Although the economics of scaling up a plant-based production approach for peptide therapeutic production must be considered on a case-by-case basis, backbone cyclized peptide scaffolds such as SFTI-1 represent a particularly suitable case, given their natural occurrence in plants.

#### AUTHOR CONTRIBUTIONS

MJ, EG, TD, KH, DC, and MA conceived the experiments. MJ, SP, and HQ made gene constructs and performed transient assays. KY and AP performed MS analysis. KY and JS performed functional analysis. TD performed NMR analysis. KH and KY produced and assayed recombinant AEP. All authors contributed to the writing of the manuscript.

#### FUNDING

We acknowledge funding from the Australian Research Council (ARC Laureate Fellowship FL150100146 to DC, ARC grant DP150100443 to DC, EG, and TD). This research was also supported by the 2015 Ramaciotti Biomedical Research Award to DC and MA and from the Simon Axelsen Memorial Fund.

#### REFERENCES


#### ACKNOWLEDGMENTS

The pEAQ vectors were kindly provided by Prof. George Lomonossoff at the John Innes Centre and Plant Bioscience Ltd.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2019.00602/full#supplementary-material

on the cyclic peptide framework of sunflower trypsin Inhibitor-1. *J. Med. Chem.* 61, 3674–3684. doi: 10.1021/acs.jmedchem.8b00170


origin via N -> S acyl transfer: potential inhibitors of human Kallikrein-5 (KLK5). *Tetrahedron* 70, 7675–7680. doi: 10.1016/j.tet.2014.06.059


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Jackson, Yap, Poth, Gilding, Swedberg, Poon, Qu, Durek, Harris, Anderson and Craik. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Characterization of a GDP-Fucose Transporter and a Fucosyltransferase Involved in the Fucosylation of Glycoproteins in the Diatom Phaeodactylum tricornutum

Peiqing Zhang<sup>1†</sup>, Carole Burel<sup>2,3†</sup>, Carole Plasson<sup>2,3†</sup>, Marie-Christine Kiefer-Meyer<sup>2,3</sup>, Clément Ovide<sup>2,3</sup>, Bruno Gügi<sup>2,3</sup>, Corrine Wan<sup>1</sup>, Gavin Teo<sup>1</sup>, Amelia Mak<sup>1</sup>, Zhiwei Song<sup>1</sup>, Azeddine Driouich<sup>2,3</sup>, Patrice Lerouge<sup>2,3</sup> and Muriel Bardor<sup>2,3,4\*</sup>

<sup>1</sup> Bioprocessing Technology Institute, Agency for Science, Technology and Research (A∗STAR), Singapore, Singapore, <sup>2</sup> Laboratoire Glyco-MEV EA4358, UNIROUEN, Normandy University, Rouen, France, <sup>3</sup> Fédération de Recherche Normandie-Végétal – FED 4277, Rouen, France, <sup>4</sup> Institut Universitaire de France (I.U.F.), Paris, France

#### Edited by:

Suvi Tuulikki Häkkinen, VTT Technical Research Centre of Finland Ltd., Finland

#### Reviewed by:

Nicolas Arnaud, INRA – Versailles-Grignon Centre, France

Richard Strasser, University of Natural Resources and Life Sciences, Vienna, Austria

#### \*Correspondence:

Muriel Bardor muriel.bardor@univ-rouen.fr

†These authors have contributed equally to this work

#### Specialty section:

This article was submitted to Plant Metabolism and Chemodiversity, a section of the journal Frontiers in Plant Science

Received: 30 November 2018 Accepted: 25 April 2019 Published: 21 May 2019

#### Citation:

Zhang P, Burel C, Plasson C, Kiefer-Meyer M-C, Ovide C, Gügi B, Wan C, Teo G, Mak A, Song Z, Driouich A, Lerouge P and Bardor M (2019) Characterization of a GDP-Fucose Transporter and a Fucosyltransferase Involved in the Fucosylation of Glycoproteins in the Diatom Phaeodactylum tricornutum. Front. Plant Sci. 10:610. doi: 10.3389/fpls.2019.00610

Although Phaeodactylum tricornutum is gaining importance in plant molecular farming for the production of high-value molecules such as monoclonal antibodies, little is currently known about key cell metabolism occurring in this diatom such as protein glycosylation. For example, incorporation of fucose residues in the glycans N-linked to protein in P. tricornutum is questionable. Indeed, such epitope has previously been found on N-glycans of endogenous glycoproteins in P. tricornutum. Meanwhile, the potential immunogenicity of the α(1,3)-fucose epitope present on plant-derived biopharmaceuticals is still a matter of debate. In this paper, we have studied molecular actors potentially involved in the fucosylation of the glycoproteins in P. tricornutum. Based on sequence similarities, we have identified a putative P. tricornutum GDP-L-fucose transporter and three fucosyltransferase (FuT) candidates. The putative P. tricornutum GDP-L-fucose transporter coding sequence was expressed in the Chinese Hamster Ovary (CHO)-gmt5 mutant lacking its endogenous GDP-L-fucose transporter activity. We show that the P. tricornutum transporter is able to rescue the fucosylation of proteins in this CHO-gmt5 mutant cell line, thus demonstrating the functional activity of the diatom transporter and its appropriate Golgi localization. In addition, we overexpressed one of the three FuT candidates, namely the FuT54599, in P. tricornutum and investigated its localization within Golgi stacks of the diatom. Our findings show that overexpression of the FuT54599 leads to a significant increase of the α(1,3)-fucosylation of the diatom endogenous glycoproteins.

Keywords: diatom, fucosylation, nucleotide-sugar transporter, Phaeodactylum tricornutum, glycosylation, Golgi apparatus, fucosyltransferase, biopharmaceuticals

# INTRODUCTION

Diatoms are marine organisms that represent one of the most important sources of biomass in the ocean (Nelson et al., 1995; Raven and Waite, 2004; Bowler et al., 2008). There has been a surge of interest in using diatoms as a source of bioactive compounds in the food and cosmetic industries (Spolaore et al., 2006; Mata et al., 2010; Cadoret et al., 2012). In addition, the potential of diatoms such as Phaeodactylum tricornutum (P. tricornutum) as solar-powered cell factories for the production of biopharmaceuticals has been demonstrated (Mathieu-Rivet et al., 2014; Hempel and Maier, 2016). For instance, P. tricornutum has been used to produce monoclonal antibodies (mAbs) (Hempel et al., 2011, 2017; Hempel and Maier, 2012). These alga-made mAbs are directed either against the highly pathogenic Marburg virus, which belongs to the same family as Ebola virus (Hempel et al., 2017), or against the Hepatitis B virus surface antigen (Hempel et al., 2011; Hempel and Maier, 2012). Both recombinant mAbs produced in P. tricornutum were shown to recognize and bind their respective antigens. In addition, the mAb directed against the Hepatitis B virus surface antigen was demonstrated to be of good quality, homogeneous and glycosylated with oligomannosides (Vanier et al., 2015). This mAb is also able to bind human Fcγ receptors (FcγRI and FcγRIIIa in particular), which suggests that it could be used efficiently in human immunotherapy to induce phagocytosis and an antibody-dependent cell-mediated cytotoxicity response (Vanier et al., 2018). Such therapeutic applications currently represent a multimillion-dollar market (Walsh, 2014). However, compared with a human IgG1 used as a control, the affinity of the diatom-made mAb is 4.5-fold lower for FcγRI and three-fold higher for FcγRIIIa. Such differences in kinetics and affinity are due to N-glycosylation variance (Vanier et al., 2018). Therefore, it will be necessary in the future to engineer the N-glycosylation of diatom-produced mAbs to favor the presence of complex-type and fine-tuned N-glycans, as it is well established that glycosylation of mAbs, and of biopharmaceuticals in general, influences their biological functionality and efficacy (Lingg et al., 2012; Buettner et al., 2018; Mimura et al., 2018).

In this context, production of therapeutic proteins dedicated to human therapy in P. tricornutum requires a comprehensive understanding of the glycosylation biosynthesis that operates in the diatom. For instance, fucosylation of glycans N-linked to biopharmaceuticals produced in P. tricornutum is questionable. Indeed, glycosylation analysis of endogenous proteins demonstrated the presence of paucimannosidic glycans bearing an α(1,3)-fucose (Baïet et al., 2011). Moreover, the putative immunogenicity of proteins produced in plants has been reported to be due to α(1,3)-fucose epitopes introduced by the plant expression system (Wilson et al., 1998; Bardor et al., 2003; Schähs et al., 2007). Such glyco-epitopes are absent in mammalian cells and thus could be immunogenic when proteins carrying such decorations are injected into mammals (van Beers and Bardor, 2012). This question is still a matter of debate. Indeed, a previous study detected antibodies raised against plant α(1,3)-fucose in 25% of the sera from 53 non-allergic blood donors (Bardor et al., 2003). Another study reporting a phase I clinical trial for a plant-derived vaccine demonstrated that only 7 out of 48 volunteers (14.6%) had detectable amounts of IgG directed against plant N-glycans including the α(1,3)-fucose epitope (Landry et al., 2010). More recently, Ward et al. (2014) reported that 19.2% of the subjects were positive for IgG antibodies directed against plant glyco-epitopes prior to vaccination and that 34% of the vaccinated volunteers developed IgG, and eventually IgE, responses to plant glyco-epitopes after vaccination, even though no allergic/hypersensitivity response was observed. Recently, taliglucerase alfa, the first plant cell-expressed biotherapeutic, was approved for the market and is currently used in Enzyme Replacement Therapy to treat Gaucher Disease (Fox, 2012). This approved biopharmaceutical is a glycoprotein whose major exposed glycan (representing more than 90% of the glycoforms) is a paucimannosidic N-glycan substituted by an α(1,3)-fucose and a β(1,2)-xylose (Shaaltiel et al., 2007; Tekoah et al., 2013). In a Phase I clinical trial in healthy human volunteers, the injection of taliglucerase alfa did not induce obvious adverse side effects that could be attributed to the plant N-glycan glyco-epitopes (Aviezer et al., 2009; Rup et al., 2017).

Fucosylation of glycoproteins requires the cytosolic biosynthesis of GDP-L-fucose and its import into the Golgi cisternae prior to its transfer onto the glycoproteins through the action of Golgi-localized fucosyltransferases (FuT). As mentioned earlier, biochemical investigation of protein N-glycosylation in P. tricornutum has demonstrated that endogenous proteins carry mainly oligomannosides and a small amount of paucimannosidic-type N-glycans carrying a fucose residue α(1,3)-linked to the proximal N-acetylglucosamine (GlcNAc) residue (Baïet et al., 2011). Moreover, three putative FuT have been predicted in the genome of P. tricornutum (Baïet et al., 2011; Mathieu-Rivet et al., 2014). In the present paper, we report on the characterization of molecular actors involved in the fucosylation of glycans N-linked to P. tricornutum proteins. This includes, in addition to the FuT candidates, the identification of a sequence encoding a putative GDP-L-fucose transporter homolog (PtGFT). The latter has been cloned and expressed in the Chinese Hamster Ovary (CHO)-gmt5 mutant cell line, a mammalian cell line deficient in GDP-L-fucose transporter activity (Zhang et al., 2012; Haryadi et al., 2013). We show that PtGFT is able to rescue the fucosylation of proteins in the CHO-gmt5 mutant cell line, thus demonstrating the functional activity of the diatom transporter. To the best of our knowledge, PtGFT is the first microalgal nucleotide-sugar transporter to be functionally characterized. Moreover, we demonstrate that the FuT54599 candidate (encoded by the Phatr3\_J54599 gene) is localized in the Golgi apparatus in P. tricornutum. Finally, we found that overexpression of FuT54599 leads to an increase in the α(1,3)-fucosylation of the endogenous glycoproteins from P. tricornutum.

# MATERIALS AND METHODS

# Culture of P. tricornutum

The P. tricornutum strain Pt1.8.6 (CCAP1055/1) was grown in reconstituted artificial seawater (AQUARIUM SYSTEMS Instant Ocean) enriched with Conway medium containing 80 mg.L<sup>−1</sup> of sodium metasilicate (Na<sub>2</sub>SiO<sub>3</sub>), at 19 ± 1 °C as described previously (Ovide et al., 2018). The cultures were grown under a 16 h/8 h light/night cycle (280–350 µmol photons m<sup>−2</sup>·s<sup>−1</sup>) with agitation at 150 rpm.

Phaeodactylum tricornutum cells expressing the V5-tagged FuT54599 or the V5-tagged GnT I were grown in F/2 medium containing 1.5 mM NH<sub>4</sub>Cl as nitrogen source and no sodium metasilicate, under a 16 h/8 h light/night cycle (280–350 µmol photons m<sup>−2</sup>·s<sup>−1</sup>) with agitation at 150 rpm for the first 4 days at 19 ± 1 °C, and then under continuous illumination (280–350 µmol photons m<sup>−2</sup>·s<sup>−1</sup>) for the next 5 days at 23 ± 1 °C. Liquid cultures were grown with agitation at 150 rpm in a volume of 150 mL. For the expression of the V5-tagged glycosyltransferases, cells were induced at day 6 by transferring them into fresh F/2 medium containing 0.9 mM NaNO<sub>3</sub> as nitrogen source, according to Hempel and Maier (2012).

#### Monosaccharide Composition Analysis of P. tricornutum Fractions

Phaeodactylum tricornutum cell pellets were resuspended in 70% ethanol with lysing beads (D-matrix lysing tubes, MP Biomedicals®) and ground for 6 cycles of 30 s at 6.5 m.s<sup>−1</sup> in a FastPrep-24™ homogenizer (MP Biomedicals®). Crushed cells were incubated at 70 °C for 1 h. Extractions were then performed to remove lipids from the cell wall fraction. Briefly, the residues were extracted once with methanol:chloroform (1:1 v/v), then with acetone at room temperature under agitation. Residues were dried under a stream of pure air. The monosaccharide composition of this alcohol insoluble residue (AIR) was analyzed by gas chromatography coupled to a flame ionization detector (GC-FID), with inositol spiked as an internal standard, as described previously (Louvet et al., 2011). One mg of each fraction was hydrolyzed in 2 M trifluoroacetic acid for 2 h at 110 °C. Trifluoroacetic acid was removed by two washes with a 50% isopropanol:water solution. The released monosaccharides were converted to their O-methylglycosides by incubation in 1 M methanolic HCl at 80 °C overnight (Moore et al., 2006). After evaporation of methanol and HCl, the methyl-glycosides were resuspended in 200 µL of a methanol:pyridine mixture (4:1 v/v), then submitted to a re-N-acetylation reaction by adding 50 µL of pure acetic anhydride and incubating for 1 h at 110 °C. After evaporation of the reagents, the re-N-acetylated samples were converted into their trimethylsilyl derivatives by heating for 20 min at 110 °C in hexamethyldisilazane:trimethylchlorosilane:pyridine (3:1:9 v/v/v). After evaporation of the reagent, the samples were washed twice and finally suspended in 1 mL of cyclohexane before being injected into a CP-Sil 5 CB column (Agilent Technologies, United States). Data were integrated with the GC Star Workstation software (Varian/Agilent Technologies, United States). A temperature program (3 min at 40 °C; up to 160 °C at 15 °C.min<sup>−1</sup>; up to 220 °C at 1.5 °C.min<sup>−1</sup>; up to 280 °C at 20 °C.min<sup>−1</sup>; 3 min at 280 °C) was optimized for the separation of the most common cell wall monosaccharides. The GC-FID analyses were run in triplicate on extracts isolated from 4 independent cell cultures.
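As a quick sanity check on the oven program described above, the total duration of one GC run can be derived from the holds and ramps. The sketch below is illustrative only: the step encoding and the function name `run_time_minutes` are assumptions made for this example, not part of the published method.

```python
# Oven program as stated in the text. The tuple layout is a hypothetical
# encoding chosen for this sketch:
#   ("hold", minutes)  or  ("ramp", target_temp_C, rate_C_per_min)
START_TEMP_C = 40.0
PROGRAM = [
    ("hold", 3.0),            # 3 min at 40 °C
    ("ramp", 160.0, 15.0),    # up to 160 °C at 15 °C/min
    ("ramp", 220.0, 1.5),     # up to 220 °C at 1.5 °C/min
    ("ramp", 280.0, 20.0),    # up to 280 °C at 20 °C/min
    ("hold", 3.0),            # 3 min at 280 °C
]


def run_time_minutes(start_temp_c, program):
    """Return the total duration of an oven program in minutes."""
    temp = start_temp_c
    total = 0.0
    for step in program:
        if step[0] == "hold":
            total += step[1]
        else:
            _, target, rate = step
            total += abs(target - temp) / rate  # time spent on this ramp
            temp = target
    return total


if __name__ == "__main__":
    print(f"Total run time: {run_time_minutes(START_TEMP_C, PROGRAM):.1f} min")
```

Under these assumptions each injection takes 57 min, most of which (40 min) is spent on the slow 1.5 °C.min<sup>−1</sup> ramp.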

# Bioinformatic Analyses

#### Database Search, Protein Sequences Alignments and Phylogenetic Analysis

The search for putative P. tricornutum GFT coding sequences was carried out by BlastP (2.2.28) searches (Altschul et al., 1997) in the sequence data of P. tricornutum in the Ensembl Protists database (release 40, July 2018, EMBL-EBI). The topology of the potential PtGFT was predicted using the TMHMM (Sonnhammer et al., 1998) and Phobius (Käll et al., 2004) tools.

Comparison of the protein sequences of various eukaryotic GFT was performed using the MUSCLE program (Edgar, 2004). This includes sequences of GFT that have already been functionally characterized (Luhn et al., 2001, 2004; Geisler et al., 2012; Peterson et al., 2013; Rautengarten et al., 2016), such as those from Homo sapiens (NP\_060859.4; SLC35C1 gene), Mus musculus (NP\_997597.1; SLC35C1 gene), Cricetulus griseus (NP\_001233737.1; SLC35C1 gene), Caenorhabditis elegans (NP\_001263841.1; nstp-10 gene), Drosophila melanogaster (NP\_649782.1; Dm\_Gfr gene) and Arabidopsis thaliana [NP\_197498.1; At5g19980 (GFT1/GONST4) gene].

Structure analysis of the putative P. tricornutum FuT and of various characterized FuT from plants and invertebrates was done using the NCBI Conserved Domain Database (Marchler-Bauer et al., 2017) and Pfam database (El-Gebali et al., 2019) tools. Comparison of the amino acid sequences corresponding to the Glyco\_Transf\_10 domains described in the NCBI Conserved Domain Database was carried out with the T-coffee web server<sup>1</sup> (Di Tommaso et al., 2011).

A phylogenetic tree was built with the GOLGI-LOCALIZED NUCLEOTIDE SUGAR TRANSPORTER GONST1, GONST2, GONST3, and GFT1/GONST4 amino acid sequences of A. thaliana (At2g13650, At1g07290, At1g76340, and At5g19980 genes, respectively) (Baldwin et al., 2001; Handford et al., 2004; Mortimer et al., 2013), the P. tricornutum coding sequences corresponding to the Phatr3\_J43174, Phatr3\_J45630, and Phatr3\_J9609 genes, which are similar to the GONST sequences, the functionally characterized animal GDP-L-fucose transporters from H. sapiens (NP\_060859.4, SLC35C1\_GDP-fucose transporter 1 isoform a), C. elegans (NP\_001263841.1, GDP-fucose transporter) and D. melanogaster (NP\_649782.1, Gfr), and also the H. sapiens SLC35C2 (NP\_001268386, solute carrier family 35 member C2 isoform d). The phylogenetic tree was drawn using the Phylogeny.fr platform (Dereeper et al., 2008, 2010) in the "One click" mode. The analysis followed three steps: (i) complete sequences were aligned with MUSCLE 3.8.31 (Edgar, 2004); (ii) ambiguous regions (i.e., containing gaps and/or poorly aligned) were removed with Gblocks (v0.91b) (Castresana, 2000); and (iii) the phylogenetic tree was built using the maximum likelihood method implemented in the PhyML program (v3.0 aLRT) (Guindon et al., 2010). Graphical representation and editing of the phylogenetic tree were performed with TreeDyn (v198.3) (Chevenet et al., 2006). Finally, the phylogenetic tree viewer PhyD3 was used to finalize the figure<sup>2</sup> (Kreft et al., 2017).

<sup>1</sup>http://tcoffee.crg.cat/

<sup>2</sup>https://phyd3.bits.vib.be/index.html

# Cloning of the V5-Tagged FuT54599 and the GnT I Coding Sequences for Overexpression

The GnT I-V5 insert was obtained from a plasmid construct containing the GnT I coding sequence (GenBank: HM775384.1) fused to the V5-tag of the pcDNA3.1/V5-His-TOPO vector (described in Baïet et al., 2011) by PCR amplification with the Phusion™ high-fidelity DNA polymerase (Finnzymes) and the following forward primer 5′-CAATTGATGCGGTTGTGGAAACG-3′ and reverse primer 5′-GGATCCTCTTTTCGGTGACGGAA-3′. After purification, the PCR product was cloned in the pJET1.2/blunt vector (Thermo Fisher), verified by Sanger sequencing and then inserted as a MunI-BamHI restriction fragment in the pPha-NR expression vector (GenBank accession number: JN180663) digested with the EcoRI and HindIII restriction enzymes (Thermo Scientific). Transformation of P. tricornutum Pt1.8.6 cells was done by biolistics as described by Hempel and Maier (2012). The positive transformants were selected by PCR analysis as described below. The cloning of the V5-tagged-FuT54599 sequence in the pPha-NR vector was carried out in the same way, except that the V5-tagged-FuT54599 was obtained by gene synthesis (according to the genomic sequence of the Phatr3\_J54599 gene, GeneCust). The primers used to retrieve the V5-tagged-FuT54599 insert by PCR amplification were 5′-GAGCTCATGTCACTTCGCAAG-3′ (forward) and 5′-AAGCTTACGTAGAATCGAGACCGAGGAGA-3′ (reverse). Finally, the FuT54599-V5 coding sequence was inserted in the pPha-NR vector as a SacI-HindIII restriction fragment before transforming P. tricornutum cells.
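The restriction sites used for subcloning can be traced directly in the cloning primers listed above. The following minimal sketch scans each primer for the recognition sequences of the enzymes named in the text (MunI, CAATTG; BamHI, GGATCC; SacI, GAGCTC; HindIII, AAGCTT). The dictionary names and the helper `sites_in` are hypothetical and introduced only for illustration.

```python
# Recognition sequences of the enzymes named in the text. All four sites are
# palindromic, so scanning one strand is sufficient.
SITES = {
    "MunI": "CAATTG",
    "BamHI": "GGATCC",
    "SacI": "GAGCTC",
    "HindIII": "AAGCTT",
}

# Cloning primers as given in the text (5'->3'); the labels are illustrative.
PRIMERS = {
    "GnT I forward": "CAATTGATGCGGTTGTGGAAACG",
    "GnT I reverse": "GGATCCTCTTTTCGGTGACGGAA",
    "FuT54599 forward": "GAGCTCATGTCACTTCGCAAG",
    "FuT54599 reverse": "AAGCTTACGTAGAATCGAGACCGAGGAGA",
}


def sites_in(primer, sites=SITES):
    """Map each enzyme whose site occurs in the primer to its 0-based position."""
    primer = primer.upper()
    return {name: primer.find(seq) for name, seq in sites.items() if seq in primer}


if __name__ == "__main__":
    for label, seq in PRIMERS.items():
        print(label, sites_in(seq))
```

Each primer carries exactly one of these sites at its 5′ end, and both forward primers place the site immediately upstream of an ATG codon.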

# PCR and RT-PCR Analysis

For DNA or RNA isolation, sub-culturing of P. tricornutum cells was conducted in two 500 mL Erlenmeyer flasks containing 200 mL of sterilized fresh medium. At steady state (1 × 10<sup>8</sup> cells.mL<sup>−1</sup>), the cells were pelleted by centrifugation at 4,500 g for 10 min at 4 °C and then resuspended in 1 mL of NucleoZOL (Macherey-Nagel, GmbH & Co. KG, Düren, Germany) for RNA isolation or in the lysis buffer PL1 (Macherey-Nagel) for DNA extraction. Then, samples were transferred to 2 mL lysing matrix E tubes (MP Biomedicals®), immediately frozen in liquid nitrogen and stored at −80 °C until purification. Cell lysis was carried out using the FastPrep-24™ homogenizer (MP Biomedicals®) for 4 cycles of 30 s at 6.5 m.s<sup>−1</sup>. Then, after 5 min of incubation at room temperature and a centrifugation step of 5 min at 12,000 g, the supernatant was recovered and transferred to a new 2 mL tube. Genomic DNA was purified with the Nucleospin Plant II kit (Macherey-Nagel) according to the manufacturer's instructions. Total RNA was isolated using a combination of the NucleoZOL reagent method (Macherey-Nagel) for the extraction and the NucleoSpin RNA Plus kit (Macherey-Nagel) for purification, following the supplier's instructions. After DNase treatment with the TURBO DNA-free™ Kit (Invitrogen™), the first-strand cDNA was synthesized from 2 µg of RNA using the High-Capacity cDNA Reverse Transcription Kit with RNase Inhibitor (Applied Biosystems™).

The PCR reactions were prepared according to the GoTaq® G2 DNA Polymerase protocol (Promega) in a total volume of 20 µL. A 2 µL aliquot of a 1:10 dilution of gDNA or cDNA was added to the mixture and, in parallel, a reaction with 2 µL of water was prepared as a negative control. PCR was performed in a Veriti Thermal Cycler (Applied Biosystems™) using a three-step program as follows: 5 min of initial denaturation at 95 °C, followed by 35 cycles of 30 s of denaturation at 95 °C, 30 s of annealing at 60 °C and 30 s of extension at 72 °C, and a final elongation of 5 min at 72 °C. A 14 µL aliquot of the PCR reaction was analyzed on a 1.8% agarose gel stained with SafeView™ (ABM) to reveal the amplified products.

A primer pair specific to the putative PtGFT and allowing the cDNA and DNA sequences to be distinguished was designed with the Primer-Blast program (Ye et al., 2012) using the nucleotide sequence NCBI accession number XM\_002177440.1 as the template. The forward primer 5′-TTGTCGGGCATCTTCTGGTC-3′ and the reverse primer 5′-GACGAATTCCCAGGCACGTA-3′ were used in this work.
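To illustrate why the two PtGFT primers are compatible with a common annealing temperature, the sketch below computes their GC content and a rough melting temperature with the Wallace rule, Tm = 2 × (A + T) + 4 × (G + C). This rule of thumb is an assumption made for this example. It is not the model used by Primer-Blast, which relies on a more elaborate thermodynamic calculation, so the values are only indicative.

```python
# PtGFT-specific primers as given in the text (5'->3').
FORWARD = "TTGTCGGGCATCTTCTGGTC"
REVERSE = "GACGAATTCCCAGGCACGTA"


def gc_percent(seq):
    """Percentage of G and C bases in the sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Rough melting temperature (°C) by the Wallace rule: 2*(A+T) + 4*(G+C)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    for name, seq in (("forward", FORWARD), ("reverse", REVERSE)):
        print(f"{name}: {len(seq)} nt, GC {gc_percent(seq):.0f}%, "
              f"Tm ~{wallace_tm(seq)} °C")
```

With this approximation both 20-nt primers have 55% GC and an estimated Tm of about 62 °C, which is in line with the 60 °C annealing step of the PCR program.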

To screen the P. tricornutum cells transformed with the V5-tagged GnT I coding sequence by PCR amplification of DNA, the same primer pair as the one used for retrieving the sequence from the pcDNA3.1/V5-His-TOPO vector (described in the section "Cloning of the V5-Tagged FuT54599 and the GnT I Coding Sequences for Overexpression" of the Materials and Methods) was chosen.

For the screening of the P. tricornutum transformants expressing the Phatr3\_J54599 gene fused to a 3′ V5-tag, a forward primer specific to the FuT54599 coding sequence (5′-GCCAGGCCAATTATAGTCGC-3′) was used in combination with a reverse primer specific to the V5-tag (5′-GACCGAGGAGAGGGTTAGGG-3′).

# Complementation of the CHO-gmt5 Line

The coding sequence of the candidate PtGFT was used to prepare a DNA construct in the pcDNA™3.1(+) Mammalian Expression Vector (Invitrogen, Life Technologies). The PtGFT coding sequence from the NCBI accession no. XM\_002177440.1 (nucleotides 39–1121), fused to a nucleotide sequence encoding an HA-tag at its 5′ end and to a "GCCACC" Kozak sequence, was obtained by gene synthesis and then cloned as a HindIII-XhoI fragment in the pcDNA™3.1(+) plasmid. The synthetic DNA sequence is registered under the NCBI accession number KT737477. Transient expression of the PtGFT gene in the CHO-gmt5 cell line, immunodetection of proteins and affinostaining with Aleuria Aurantia Lectin (AAL) were carried out as previously reported in Zhang et al. (2012).
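The nucleotide range quoted above can be checked against the 360-residue protein length reported for PtGFT in the Results. The short calculation below assumes that positions 39–1121 are 1-based and inclusive and that the range ends with the stop codon. Both points are assumptions of this sketch, and `cds_summary` is a hypothetical helper.

```python
def cds_summary(start, end):
    """Return (length in nt, number of complete codons, leftover nt)
    for an inclusive, 1-based nucleotide range."""
    length = end - start + 1
    codons, remainder = divmod(length, 3)
    return length, codons, remainder


if __name__ == "__main__":
    length, codons, remainder = cds_summary(39, 1121)
    # Assumption: the last codon of the range is the stop codon.
    residues = codons - 1
    print(f"{length} nt = {codons} codons (remainder {remainder}); "
          f"{residues} residues if the last codon is a stop")
```

Under these assumptions the range spans 1,083 nt, i.e., 361 codons with no leftover nucleotide, which corresponds to 360 amino acids plus a stop codon.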

# N-Glycan Profiling of CHO-gmt5 Proteins

A total of 1 × 10<sup>7</sup> CHO cells at the mid-exponential phase (day 4) were harvested, washed 3 times with 1X PBS and resuspended in 1 mL of extraction buffer (25 mM Tris, 150 mM NaCl, 5 mM EDTA, 1% CHAPS, pH 7.4) prior to sonication for 15 min. The samples were then centrifuged at 500 g for 10 min. The supernatant was saved whereas the pellet was extracted a second time using 500 µL of the same extraction buffer. The second supernatant was pooled with the first one before dialysis against 4 × 1 L of 50 mM ammonium bicarbonate, pH 8.5, at 4 °C for 24 h using a 7,000 MWCO dialysis cassette. After 24 h of dialysis, the sample was transferred to a 7 mL Teflon-lined capped amber glass vial, and 2 mL of 50 mM Tris-HCl, pH 8.5, containing 2 mg.mL<sup>−1</sup> of dithiothreitol was added. After homogenization, the sample was incubated in the dark at 37 °C for 1 h under rotation at 20 rpm. Iodoacetic acid (10 mg.mL<sup>−1</sup>) was then added, and the sample was vortexed and incubated at 37 °C for another 2 h in the dark. This carboxymethylation step was terminated by dialyzing the sample against 4 × 1 L of 50 mM ammonium bicarbonate, pH 8.5, at 4 °C for 24 h. The sample was then transferred to a 7 mL Teflon-lined capped glass vial and finally evaporated to dryness. The reduced carboxymethylated proteins were digested with 40 µg of trypsin (Promega) and further deglycosylated by peptide-N-glycosidase F (PNGase F) (Prozyme). The digestions, purification and permethylation of the resulting N-glycans were performed as previously described in Yusufi et al. (2017). MALDI-TOF mass spectrometry data were acquired on a 5800 MALDI-TOF/TOF mass spectrometer (AB Sciex, Foster City, CA, United States) in positive reflectron mode. Permethylated samples were reconstituted in 30 µL of 80% (v/v) methanol in water, and 0.5 µL of the sample was then spotted on a target plate along with 0.5 µL of matrix [10 mg.mL<sup>−1</sup> 2,5-dihydroxybenzoic acid (Waters Corporation, Milford, MA, United States) dissolved in 80% (v/v) methanol in water]. The 4700 calibration standard kit, calmix (AB Sciex), was used as the external calibrant for the MS mode. The mass spectrum of the sample was acquired over a mass range of m/z 500–5,000 with a total of 10,000 accumulated shots. The laser intensity used was 80%.

#### Transmission Electron Microscopy

High pressure freezing, freeze substitution and transmission electron microscopy of P. tricornutum expressing V5-tagged FuT54599 or GnT I were carried out as described in Ovide et al. (2018). Immunocytochemistry of the V5-tagged transferases was carried out using antibodies raised against the V5 epitope (mouse anti-V5 tag antibodies, Invitrogen, dilution 1/20) and a secondary antibody (EM-goat anti-mouse IgG + IgM, 10 nm gold particles, BBI Solutions, dilution 1/20). A classical staining using uranyl acetate/lead citrate and, optionally, KMnO<sub>4</sub> was done before observation as previously described (Venable and Coggeshall, 1965).

# Extraction of Proteins for Western Blot Analysis

The expression of the V5-tagged FuT54599 or GnT I was induced at day 6 by transferring P. tricornutum cells into a fresh 100% seawater medium containing 0.9 mM NaNO<sub>3</sub>. After 24, 48, and 72 h, cell cultures were harvested by centrifugation (2,000 g for 10 min). Cell pellets were frozen. The cell pellets were then resuspended in a 0.1 M Tris buffer, pH 7, containing a Protease Inhibitor Cocktail (SIGMAFAST™ Protease Inhibitor Cocktail Tablets, EDTA-Free), and the cells were broken using the FastPrep-24™ homogenizer (MP Biomedicals®) as described for PCR and RT-PCR analysis. The extracted proteins were first centrifuged at 10,000 g for 30 min, giving the total protein extract. The sample was then centrifuged at 100,000 g for 1 h 30 min at 4 °C.

The pellet containing the membrane proteins was separated from the intracellular proteins present in the supernatant. Both fractions were analyzed by NuPAGE Bis-Tris gel electrophoresis and Western blot analysis. About 50 µg of intracellular proteins were denatured in Laemmli sample buffer for 10 min at 100 °C and loaded on a NuPAGE Bis-Tris gel (4–12%). Likewise, membrane fractions were loaded on the gel after denaturation. The migration of the proteins through the gel was carried out in a MOPS buffer at 180 V for 1 h. For the detection of V5-tagged glycosyltransferases such as FuT54599 or GnT I, 1 µg of Escherichia coli Positive Control (E. coli) Whole Cell Lysate (Abcam) was used as a positive control (presence of the V5-tag). Five µL of the PageRuler Plus Prestained Protein Ladder (Thermo Fisher) was loaded as molecular weight markers. After separation on the NuPAGE Bis-Tris gel, proteins were blotted onto a nitrocellulose membrane using a semi-dry transfer Thermo Scientific™ Pierce™ Power Blotter. Ponceau S staining was performed to assess the efficiency of the protein transfer onto the nitrocellulose membrane.

For the detection of the V5-tagged FuT54599, the nitrocellulose membrane was saturated overnight in TBS-T buffer and then incubated with a primary rabbit anti-V5 antibody (Invitrogen) at a dilution of 1/3,000 in TBS-T for 2 h at room temperature. The secondary antibody was a goat anti-rabbit antibody conjugated with HRP (Sigma), used at a dilution of 1/30,000 in TBS-T for 1 h at room temperature. Detection was performed using the ECL west Pico plus kit (Thermo Fisher) according to the manufacturer's instructions. Exposure time was 1 min.

About 50 µg of total protein extract were loaded on a NuPAGE Bis-Tris gel (4–12%) after denaturation, and the migration of the proteins through the gel was carried out in a MOPS buffer as described above. One µg of PLA<sub>2</sub> from honey bee venom (14.5 kDa, SIGMA-ALDRICH) was used as a positive control and 1 µg of Ribonuclease B from bovine pancreas (15 kDa, SIGMA-ALDRICH) was used as a negative control. Indeed, PLA<sub>2</sub> is known to be glycosylated with α(1,3)-core fucosylated N-glycans, whereas Ribonuclease B bears high-mannose type N-glycans (Joao and Dwek, 1993; Lai and Her, 2002). Five µL of the PageRuler Plus Prestained Protein Ladder (Thermo Fisher) was loaded as molecular weight markers. After separation on the NuPAGE Bis-Tris gel, proteins were blotted onto a nitrocellulose membrane using a semi-dry transfer Thermo Scientific™ Pierce™ Power Blotter. Ponceau S staining was performed to assess the efficiency of protein transfer onto the membrane.

The membrane was saturated overnight in TBS-T and then incubated with the anti-α(1,3)-core fucose antibody (Agrisera) as a primary antibody at a dilution of 1/5,000 in TBS-T for 2 h at room temperature. After washing, a secondary goat anti-rabbit antibody conjugated with HRP (Sigma) was used at a dilution of 1/30,000 in TBS-T for 1 h at room temperature. Detection was performed using the ECL west Pico plus kit (Thermo Fisher) according to the manufacturer's instructions. Exposure time was 1 min.

# RESULTS

#### Presence of Fucose in Phaeodactylum tricornutum Glycoconjugates

The activated nucleotide-sugar GDP-L-fucose is synthesized in the cytosol from GDP-D-mannose. Bioinformatic analysis revealed that homologs of GDP-D-mannose-4,6-dehydratase and GDP-4-keto-6-deoxy-D-mannose-3,5-epimerase-4-reductase, the two enzymes of the GDP-L-fucose pathway in eukaryotes, are predicted in the P. tricornutum genome (Gügi et al., 2015). Moreover, a small amount of fucosylated N-glycans has already been described in P. tricornutum (Baïet et al., 2011). To investigate whether P. tricornutum accumulates other fucose-containing polymers, the monosaccharide composition of an alcohol insoluble fraction (AIR) was determined by gas chromatography analysis. Fucose was found to represent about 4% of the total monosaccharides in this diatom fraction (**Table 1**), suggesting that other polysaccharides or glycoconjugates of P. tricornutum may contain fucose monomers. The relative proportions of the other monosaccharides are consistent with those previously reported (Abdullahi et al., 2006; Gügi et al., 2015). The high percentage of mannose is likely to originate from the major cell wall polysaccharide of P. tricornutum, namely sulfated glucuronomannan (Ford and Percival, 1965; Tesson et al., 2009).

#### Phaeodactylum tricornutum Putative Transporter Predicted Protein Sequence Exhibits Strong Amino Acid Identities With Eukaryotic GDP-Fucose Transporters

For the synthesis of either polysaccharides or glycoproteins, GDP-L-fucose has to be imported by a specific GDP-L-fucose transporter into the Golgi cisternae where the elongation of glycans occurs. To search for putative GDP-sugar transporter candidates in P. tricornutum, a BlastP analysis was carried out using the GOLGI-LOCALIZED NUCLEOTIDE SUGAR TRANSPORTER GONST1, GONST2, GONST3, and GFT1/GONST4 amino acid sequences of A. thaliana as queries (Baldwin et al., 2001; Handford et al., 2004; Mortimer et al., 2013; Rautengarten et al., 2016). This allowed the identification of three protein sequences encoded by the Phatr3\_J43174, Phatr3\_J45630, and Phatr3\_J9609 genes. These P. tricornutum sequences were also compared with the functionally characterized GDP-L-fucose transporters from H. sapiens (NP\_060859.4, SLC35C1\_GDP-fucose transporter 1 isoform a), C. elegans (NP\_001263841.1, GDP-fucose transporter), D. melanogaster (NP\_649782.1, Gfr), and the H. sapiens SLC35C2 (NP\_001268386, solute carrier family 35 member C2 isoform d). Phylogenetic analysis shows that the protein encoded by Phatr3\_J43174 is more closely related to the GDP-L-fucose transporters from mammals and C. elegans, whereas the two other putative GDP-sugar transporters, encoded by Phatr3\_J45630 and Phatr3\_J9609, are related to GFT1/GONST4 (Rautengarten et al., 2016) and to GONST2 and GONST3, respectively (**Figure 1**). Because (1) the codon usage in P. tricornutum is much closer to that of humans (Heitzer et al., 2007; Bowler et al., 2008), (2) the cell metabolism of P. tricornutum shares common features with both animals and plants (De Martino et al., 2009; Martin-Jézéquel and Tesson, 2012), and (3) it shows the best amino acid sequence homology with the human SLC35C1 (Luhn et al., 2001; Ishida and Kawakita, 2004; Zhang et al., 2012), the protein sequence encoded by the Phatr3\_J43174 gene (NCBI accession number: XP\_002177476.1) was selected as the most promising GDP-L-fucose transporter candidate in P. tricornutum.

#### Functional Characterization of the GDP-Fucose Transporter Encoded by the Phatr3\_J43174 Gene in P. tricornutum

#### The GDP-Fucose Transporter Is Expressed in P. tricornutum

Phatr3\_J43174 codes for a protein of 360 amino acids, in agreement with the expected length for nucleotide-sugar transporters (Ishida and Kawakita, 2004). The candidate protein from P. tricornutum is a hydrophobic protein predicted to be a type III membrane protein containing 8–10 membrane-spanning helices, as observed for nucleotide-sugar transporters, which act as antiporters exchanging cytosolic nucleotide-sugars for the corresponding nucleoside monophosphate (**Supplementary Figure S1**). The same topology has previously been described for the H. sapiens, C. elegans, and D. melanogaster GDP-L-fucose transporters (Luhn et al., 2001, 2004; Ishida and Kawakita, 2004; Geisler et al., 2012; Peterson et al., 2013; Rautengarten et al., 2016). The P. tricornutum Phatr3\_J43174 deduced protein sequence exhibited 46% amino acid identity (with a query coverage of 88%) with the human GDP-L-fucose transporter (**Figure 2**). Identities from 39 to 45% were observed with GDP-L-fucose transporters from other eukaryotic organisms such as M. musculus (NCBI accession number NP\_997597.1), C. elegans (NCBI accession number NP\_001263841.1), H. sapiens (NCBI accession number NP\_060859.4), C. griseus (NCBI accession number NP\_001233737.1), or D. melanogaster (NCBI accession number NP\_649782.1). Only 22% identity (with a query coverage of 84%) was observed between the P. tricornutum and A. thaliana (NCBI accession number NP\_197498.1) proteins. Furthermore, the putative sequence possesses a conserved C-terminal tail which is known to be crucial for Golgi localization and GDP-L-fucose import into the Golgi apparatus (Zhao et al., 2006; Lim et al., 2008; Zhang et al., 2012). Moreover, the two Gly residues (positions 171 and 266, respectively, in the PtGFT\_XP\_002177476.1; **Figure 2**) which are required for GDP-L-fucose import into the Golgi apparatus are conserved in the P. tricornutum protein candidate (Zhang et al., 2012). Considering the high sequence identities with biochemically characterized GDP-L-fucose transporters, we postulate that this protein is able to import GDP-L-fucose into the Golgi apparatus of P. tricornutum and it is, accordingly, named P. tricornutum GDP-L-fucose transporter (PtGFT) in this paper. To determine whether the PtGFT gene is expressed in P. tricornutum, PCR analyses using specific PtGFT primer pairs were performed on cDNA and gDNA prepared from P. tricornutum cells and allowed the amplification of specific bands (**Supplementary Figure S2**). The difference in size of the amplified sequences reflected the presence of an intron in the PtGFT candidate gene.

TABLE 1 | Monosaccharide composition of an alcohol insoluble fraction (AIR) isolated from P. tricornutum cells.

Results are expressed in relative percentage (%). Values are the means of monosaccharide quantities determined in triplicate by GC-FID analysis performed on AIR isolated from four independent cell cultures. Rha, rhamnose; Fuc, fucose; Xyl, xylose; GlcUA, glucuronic acid; ManUA, mannuronic acid; Man, mannose; Gal, galactose; GalUA, galacturonic acid; Glc, glucose.
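The identity percentages quoted for PtGFT are the kind of value obtained by counting identical positions over the length of a pairwise alignment. The sketch below shows that calculation on two short, invented aligned fragments. The sequences and the function name `percent_identity` are purely illustrative and are not taken from the PtGFT or SLC35C1 alignments. Values reported by BlastP also depend on the local alignment it selects, which is why a query coverage is given alongside each identity.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of alignment columns holding the same residue in both rows.

    Columns where both rows are gaps are ignored. Columns with a gap in one
    row count toward the alignment length but never as identities.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("alignment has no informative columns")
    identical = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * identical / len(columns)


if __name__ == "__main__":
    # Invented toy alignment: 5 identical columns out of 7.
    print(f"{percent_identity('MKT-LLV', 'MRTALLV'):.1f}% identity")
```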

#### The GDP-Fucose Transporter From P. tricornutum Is Able to Rescue the Glycoprotein Fucosylation in the CHO-gmt5 Cells

To investigate its cellular localization and its biochemical function, an N-terminal HA-tagged version of the PtGFT was transiently expressed in the CHO-gmt5 mutant cell line that is devoid of endogenous GDP-L-fucose transporter (Zhang et al., 2012; Haryadi et al., 2013). The GDP-L-fucose transporter has been shown to be localized in the Golgi apparatus to supply this compartment with activated fucose (Luhn et al., 2001). To


alignments of GDP-L-fucose transporter protein sequences from H. sapiens (NP\_060859.4; SLC35C1 gene), M. musculus (NP\_997597.1; SLC35C1 gene), C. griseus (NP\_001233737.1; SLC35C1 gene), C. elegans (NP\_001263841.1; nstp-10 gene), D. melanogaster (NP\_649782.1; Dm\_Gfr gene), A. thaliana [NP\_197498.1; At5g19980 (GFT1/GONST4) gene], and PtGFT (XP\_002177476.1) with the MUSCLE program (https://www.ebi.ac.uk/; Edgar, 2004). The figure was created with the ESPript program (http://espript.ibcp.fr/ESPript/ESPript/index.php; Robert and Gouet, 2014). The two red arrows highlight the glycine residues at positions 171 and 266, respectively, in the PtGFT\_XP\_002177476.1 sequence.

ascertain the localization of PtGFT in the Golgi apparatus, CHO-gmt5 cells transiently expressing the HA-tagged PtGFT were examined using immunofluorescence microscopy. The sub-cellular localization pattern of PtGFT was compared to that of Giantin, a Golgi marker, and that of the Protein Disulfide Isomerase, an ER-resident soluble protein. As shown in **Figures 3A,B**, HA-tagged PtGFT clearly co-localized with Giantin but not with Protein Disulfide Isomerase (PDI in **Figure 3B**), indicating that PtGFT is efficiently targeted to the Golgi membrane in the CHO-gmt5 cells. To assess the capacity of PtGFT to transport GDP-L-fucose in the complemented mutant, CHO-gmt5 cells transiently expressing PtGFT were stained with Aleuria Aurantia Lectin (AAL), a fucose-specific lectin exhibiting a strong affinity toward core fucose and the Lewis-X epitope on N-linked glycans (Bergstrom et al., 2012; Zhang et al., 2012). As shown in **Figure 3B**, only cells expressing PtGFT exhibit AAL staining, suggesting that the diatom transporter candidate is able to rescue fucosylation of proteins in the PtGFT complemented CHO-gmt5 mutant cells. Such a result also confirmed the Golgi localization of the PtGFT. To further confirm that PtGFT is able to rescue a wild-type fucosylation of endogenous proteins, N-glycosylation of proteins in the CHO-gmt5 cells complemented with the PtGFT was investigated. Proteins from both non-complemented and complemented CHO-gmt5 cell lines were isolated. Glycans N-linked to proteins were then released by PNGase F treatment, permethylated and then analyzed by MALDI-TOF/TOF mass spectrometry (**Figure 4A**). As expected, N-glycans isolated from CHO-gmt5 were mostly asialo, afucosylated galactosylated bi- and tri-antennary N-glycans. In the PtGFT complemented line, similar species were detected in the N-linked glycan profile with a shift of 174 mass units assigned to an additional permethylated deoxyhexose (fucose) residue. Furthermore, the MS<sup>2</sup> fragmentation pattern of a galactosylated biantennary glycan clearly shows a fragment ion at m/z 474 confirming the core fucosylation of N-linked glycans in the PtGFT CHO-gmt5 complemented line (**Figure 4B**). Taken together, both AAL staining and mass spectrometry N-glycan profiling of CHO-complemented cells demonstrated that expression of PtGFT in the CHO-gmt5 cell line is able to rescue the fucosylation of proteins by complementing the lack of endogenous GDP-L-fucose transporter in the CHO mutant line. This demonstrates the capacity of PtGFT to import GDP-L-fucose into the Golgi apparatus, where the N-glycan fucosylation takes place.
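The 174 mass unit shift mentioned above can be checked with a short calculation. The sketch below is only an illustration under stated assumptions: it uses standard monoisotopic atomic masses, takes the deoxyhexose residue as C6H10O4, and assumes that adding one fucose to a permethylated glycan contributes two net methylations (the terminal fucose carries three methyl groups, while the hydroxyl it occupies on the glycan loses one). The variable names are hypothetical.

```python
# Monoisotopic atomic masses (Da).
C, H, O = 12.0, 1.00782503, 15.99491462

# Deoxyhexose (fucose) residue within a glycan: C6H10O4.
dhex_residue = 6 * C + 10 * H + 4 * O

# Each O-methylation replaces H by CH3, i.e. a net gain of CH2.
methylation = C + 2 * H

# Assumed net gain when one fucose is added to a permethylated glycan:
# residue mass plus two net methylations.
permethylated_fucose_shift = dhex_residue + 2 * methylation

print(f"{permethylated_fucose_shift:.4f}")
```

Under these assumptions the computed increment rounds to the nominal 174 mass units reported for the fucosylated species in the complemented line.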

#### Functional Characterization of the Fucosyltransferases in P. tricornutum

#### Specific Features of Putative Fucosyltransferases in P. tricornutum

Three FuT have been predicted in a preliminary investigation of the P. tricornutum genome (Baïet et al., 2011). These predicted proteins of 798, 707, and 481 amino acids are encoded, respectively, by the Phatr3\_J46109, Phatr3\_J46110, and Phatr3\_J54599 genes. The three putative fucosyltransferases have been described to possess structural features characteristic of the CAZy family GT10, the family to which the FuT belong (Baïet et al., 2011). The three protein sequences exhibit Pfam GT10 domains. Only one GT domain is present in the C-terminal part of Phatr3\_J46109 and Phatr3\_J46110 as observed for plant α(1,3)-FuT (Both et al., 2011), whereas two domains are predicted for Phatr3\_J54599 as reported for invertebrate α(1,3)-FuT by Pfam analysis (El-Gebali et al., 2019). The amino acid sequences corresponding to the Glyco\_Transf\_10 domains (described in the NCBI Conserved Domain Database) of the putative P. tricornutum FuT were compared to those of various characterized FuT from plants and invertebrates by using the T-coffee multiple sequence alignment server<sup>3</sup> (Di Tommaso et al., 2011). All putative FuT present the CXXC motif located at the C-terminal end of the proteins, which is well known to be essential for the enzymatic activity as it is involved in the formation of disulfide bridges which favor correct folding of the

<sup>3</sup>http://tcoffee.crg.cat/

FuT (Holmes et al., 2000). Additional conserved domains named "1st cluster" and "α1,3-FuT motif" are also present in the three candidates. These domains are described to be involved in the binding of the donor substrate, which is GDP-L-fucose for fucosyltransferase activity (Both et al., 2011). Moreover, the amino acids which are indicated in red in **Figure 5** are conserved in the P. tricornutum putative FuT. These residues have been demonstrated by site-directed mutagenesis to be involved in the binding of the acceptor substrate. Based on RNA-seq data recently reported in Ovide et al. (2018), mainly Phatr3\_J46109 and Phatr3\_J54599 are expressed in the three morphotypes of P. tricornutum (fusiform, triradiate, and oval morphotypes).
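Locating a CXXC motif (two cysteines separated by any two residues) in a protein sequence can be done with a simple pattern search. The sketch below is illustrative only and does not reproduce the alignment-based analysis of this study; the function name `find_cxxc` and the toy sequence are hypothetical, and a real analysis would also need to check that the motif lies in the expected C-terminal region.

```python
import re

# Lookahead so that overlapping occurrences (e.g. "CAACAAC") are all reported.
CXXC_PATTERN = re.compile(r"(?=(C..C))")


def find_cxxc(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched motif) for every CXXC occurrence."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in CXXC_PATTERN.finditer(seq)]


# Hypothetical toy sequence, not a real fucosyltransferase.
toy_sequence = "MASTCPGCKLLNCAACW"
print(find_cxxc(toy_sequence))
```

For the toy sequence, the search reports motifs starting at positions 5 and 13.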

#### FuT54599 Is Active and Localized in the Golgi Stacks

The coding sequences of the three putative FuT were cloned and expressed in P. tricornutum in fusion to a C-terminal V5 tag which allows detection of the recombinant glycosyltransferase, as previously described for the heterologous expression of the N-acetylglucosaminyltransferase I (GnT I) from P. tricornutum within the CHO Lec1 mutant cells (Baïet et al., 2011). Clones expressing the respective V5-tagged fucosyltransferase genes were selected by PCR and RT-PCR analyses. Only P. tricornutum lines expressing the V5-tagged version of the Phatr3\_J54599 fucosyltransferase candidate, hereafter called FuT54599, were positive and further studied. The other clones, transformed with the Phatr3\_J46109 and Phatr3\_J46110 genes, respectively, were negative by RT-PCR and later by western blot analyses. In contrast, western blot analyses using a V5-specific antibody of microsomal fractions extracted from the transformed lines expressing the V5-tagged FuT54599 revealed, 24 h after induction, a specific band above 60 kDa, suggesting that the FuT54599 might be localized within the Golgi apparatus in P. tricornutum (**Supplementary Figure S3**). In order to confirm this result and to precisely localize the FuT54599 at the sub-cellular level, Transmission Electron Microscopy (TEM) coupled to immuno-gold labeling using antibodies directed against the V5 epitope was used on high pressure frozen samples (**Figure 6**). Such a technique has already been employed to localize glycosyltransferases within Golgi stacks in plant cells (Chevalier et al., 2010). P. tricornutum cell lines overexpressing a V5-tagged version of the endogenous GnT I, which is a Golgi-resident transferase involved in the modification of N-linked glycans as previously reported (Baïet et al., 2011), have been studied in parallel to the FuT54599. The excellent structural preservation of the different membrane systems in high pressure frozen P. tricornutum cells allows us to orient the Golgi apparatus. Indeed, as described in Donohoe et al. (2007, 2013), cis cisternae exhibit a much lighter luminal staining and possess a thicker lumen as compared to the medial cisternae. The trans-Golgi cisternae present a collapsed central luminal domain and swollen margins. As illustrated in **Figure 6**, the V5-tagged FuT54599 was found to be preferentially located in the medial/trans Golgi cisternae. A similar immunogold labeling was observed for the V5-tagged GnT I protein. In addition, western blot analysis using a specific anti-α(1,3)-fucose antibody on total protein extracts from P. tricornutum revealed an increase in the level of α(1,3)-fucose epitopes associated with proteins of P. tricornutum lines overexpressing the V5-tagged FuT54599 as compared to the wild-type cells (**Figure 7**).

# DISCUSSION

In the context of the production of biopharmaceuticals dedicated to human therapy in P. tricornutum, a comprehensive understanding of its protein glycosylation is crucial. Core α(1,3)-fucose has been detected on glycans N-linked to proteins of P. tricornutum. This glyco-epitope may induce immune responses in humans after injection of a biopharmaceutical produced in this diatom. As a consequence, inactivation of genes encoding key enzymes of the fucosylation machinery will likely be required, taking advantage of the recent progress that has been achieved in P. tricornutum in developing genome editing tools such as TALEN and CRISPR/Cas9 (Daboussi et al., 2014; Nymark et al., 2016; Allorent et al., 2018; Kroth et al., 2018; Serif et al., 2018; Slattery et al., 2018; Stukenberg et al., 2018).

Fucosylation of glycoproteins starts with the cytosolic biosynthesis of GDP-L-fucose, followed by its import into the Golgi apparatus and finally its transfer onto the glycoproteins within Golgi cisternae. With regard to the import of the fucose-activated nucleotide into the Golgi apparatus, we identified a putative GDP-L-fucose transporter exhibiting high sequence identity with well-characterized GDP-L-fucose transporters. When expressed in the CHO-gmt5 mutant lacking endogenous GDP-L-fucose transporter activity, the PtGFT candidate is efficiently targeted to Golgi membranes and is able to rescue the fucosylation of proteins in the CHO-gmt5 mutant cell line, demonstrating that the cDNA sequence registered under the NCBI accession number KT737477 encodes a functional Golgi-resident P. tricornutum transporter which is at least able to import GDP-L-fucose. This suggests that molecular mechanisms controlling the targeting and nucleotide-sugar import are conserved between mammals and microalgae. To the best of our knowledge, this is the first diatom nucleotide-sugar transporter characterized to date. Two other putative GDP-sugar transporters are also predicted in the P. tricornutum genome. We postulate that they may be involved in the import of D-mannose, another abundant monosaccharide detected in this diatom that is also activated in the cytosol by coupling to GDP. In this work, we identified a transporter from P. tricornutum which is able to transport at least GDP-L-fucose. However, based on the methodology used in this study (complementation of the CHO-gmt5 mutant) and on the monosaccharide composition of AIR from P. tricornutum, which contained more than 44% of mannose, we cannot completely rule out that the transporter from P. tricornutum is also able to transport other nucleotide sugars such as GDP-D-mannose. Indeed, in A. thaliana, initial studies have established that GONST1 is located within Golgi stacks and can functionally complement the yeast vanadate resistance glycosylation GDP-D-mannose transporter mutant (Baldwin et al., 2001; Handford et al., 2004). However, years later, this GONST1 transporter was described to be able to transport 4


FIGURE 5 | Phaeodactylum tricornutum putative FuT exhibit strong amino acid identities with α(1,3)-fucosyltransferases. Amino acid sequence comparison with the T-coffee program (http://tcoffee.crg.cat/, Di Tommaso et al., 2011) of the C-terminal GT10 domains of the three putative fucosyltransferases encoded, respectively, by the Phatr3\_J46109, Phatr3\_J46110 and Phatr3\_J54599 genes from Phaeodactylum tricornutum with biochemically characterized α(1,3)-fucosyltransferases from Arabidopsis thaliana (FUT11\_ARATH and FUT12\_ARATH), Medicago sativa (Q5DTC8\_MEDSA), Oryza sativa (Q6ZDE5\_ORYSJ), Zea mays (Q0VH31\_MAIZE), Vigna radiata (Q9ST51\_VIGRR), Physcomitrella patens (Q8L5D1\_PHYPA), Drosophila melanogaster (FUCTA\_DROME), Caenorhabditis elegans (G5EDR5\_CAEEL), Apis mellifera (Q05GU3\_APICA) and Bombyx mori (H9JL25\_BOMMO). The graphic output reflects the level of consistency of the alignment of a considered residue (from blue/green, badly or poorly supported, to pink, which corresponds to strongly supported). Conserved motifs of the GT10 domain are indicated on the top of the alignment.

FIGURE 6 | Immunocytochemical localization of V5-tagged FuT54599 and V5-tagged GnT I in the Golgi apparatus of P. tricornutum. Transmission electron micrographs of HPF/FS P. tricornutum cells which were embedded in LRW resin. (A) View showing a Golgi apparatus of P. tricornutum oriented as previously described for algal cells in Donohoe et al. (2007, 2013). (B) Localization of the V5-tagged GnT I in the medial-trans Golgi cisternae of the P. tricornutum Golgi apparatus after immunolabelling with a mouse anti-V5 antibody used as a primary antibody (dilution 1/20) and a secondary antibody coupled to 10 nm gold beads (dilution 1/20) and then contrasted with uranyl acetate and lead citrate. (C) Localization of the V5-tagged FuT54599 in the medial-trans Golgi cisternae of the P. tricornutum Golgi apparatus after immunolabelling with a rabbit anti-V5 antibody used as a primary antibody (dilution 1/20) and a secondary antibody coupled to 10 nm gold beads (dilution 1/20) and then contrasted with uranyl acetate and lead citrate. N, nucleus; G, Golgi apparatus with cis and medial/trans cisternae; V, vacuole; m, mitochondria; p, pyrenoid; C, chloroplast; Cw, cell wall. Scale bar: 0.5 µm. Please refer to Supplementary Figure S4 for negative controls.

FIGURE 7 | N-glycans bearing the α(1,3)-fucose epitope increase in P. tricornutum cells expressing the Golgi-resident V5-tagged FuT54599. Western blot analysis of total protein extracts from WT (lane 1) and P. tricornutum cells expressing the V5-tagged FuT54599 using a specific core α(1,3)-fucose antibody. Total proteins were extracted from P. tricornutum cells expressing the V5-tagged FuT54599 24 h after induction (lane 2), 48 h after induction (lane 3), and 72 h after induction (lane 4). The 15 kDa Ribonuclease B (Rb) was used as a negative control (lane 5) and the 14.5 kDa Phospholipase A2 (PLA2) was used as a positive control (lane 6). In parallel, molecular weight (Mw) markers (PageRuler Plus Prestained Protein Ladder, Thermo Fisher) are reported and expressed in kDa. (A) Ponceau red staining of the membrane. (B) Revelation with the anti-α(1,3)-fucose antibody (Agrisera).

different GDP-sugars in vitro (Mortimer et al., 2013). Moreover, mutation of GONST1 only alters the glycosylation of the glycosylinositol phosphoceramides, even though GDP-D-mannose is used in the biosynthesis of multiple glycoconjugates in the Golgi apparatus. Therefore, future work will be necessary to characterize the function of the other putative GDP-sugar transporters from P. tricornutum. In addition, it would be interesting to evaluate the physiological impact of GFT inactivation in P. tricornutum and to determine whether fucose transport is essential in the diatom as it is in A. thaliana and humans. Indeed, A. thaliana gonst1 mutants are dwarfed and develop spontaneous leaf lesions (Mortimer et al., 2013). Moreover, previous alteration of the mur1 gene (encoding an isoform of the GDP-D-mannose-4,6-dehydratase) in A. thaliana demonstrated that the availability of GDP-L-fucose is critical for normal plant development and cell wall structure (Bonin et al., 1997; Rayon et al., 1999; Reuhs et al., 2004). In humans, missense mutations in the GDP-fucose transporter cDNA of patients suffering from a congenital disorder of glycosylation type IIc cause many symptoms including mental retardation, short stature, facial stigmata, and recurrent bacterial peripheral infections with persistently elevated peripheral leukocytes (Lübke et al., 2001).

Three putative FuT are also predicted in the P. tricornutum genome. They exhibit Pfam GT10 domains as observed for α(1,3)-FuT of plants and invertebrates (Wilson et al., 2001; Both et al., 2011). We attempted to overexpress the three FuT candidates as fusion proteins containing a C-terminal V5 tag in P. tricornutum. However, only the V5-tagged FuT54599 was detected. A monoexonic gene encodes this protein. Moreover, this protein is expected to be 481 amino acids long, which is in agreement with core α(1,3)-fucosyltransferases characterized earlier in plants and invertebrates (Wilson et al., 2001; Paschinger et al., 2004). When looking at the topology, the FuT54599 is the only candidate which is clearly predicted to be a type II protein. Such topology has been observed for Golgi-resident proteins and especially glycosyltransferases (Czlapinski and Bertozzi, 2006). This includes a short N-terminal cytosolic tail, a transmembrane domain and a subsequent catalytic domain which is exposed in the lumen of the Golgi apparatus (Czlapinski and Bertozzi, 2006). The fact that the FuT54599 is a Golgi-resident protein has been confirmed in this work through the localization of the FuT54599 by immunogold-electron microscopy. This result is novel because, so far, the sub-cellular organization in microalgae of Golgi enzymes involved in the biosynthesis of glycans and glycoconjugates has not been investigated. The V5-tagged FuT54599 was found to be mainly located in the medial/trans Golgi (**Figure 6A**). Similar sub-cellular localization was observed for the V5-tagged GnT I, a Golgi-resident transferase involved in the maturation of N-linked glycans (Baïet et al., 2011). These data suggest that GnT I and the putative α(1,3)-FuT are localized within specific Golgi cisternae. Localization in specific cisternal subtypes has been reported in mammals and land plants (Berger and Hesford, 1985; Chevalier et al., 2010; Schoberer and Strasser, 2011). This sub-cellular organization of enzymes within Golgi stacks is believed to control the step-by-step maturation of glycans N-linked to proteins along the secretory pathway. We postulate that such a compartmentation of Golgi enzymes would also occur in the microalga P. tricornutum. In the context of optimizing the N-glycosylation of microalgae-made biopharmaceuticals, glyco-engineering strategies could benefit from such a compartmentation of Golgi enzymes by expressing chimaeric transferases targeted for optimal activity to specific Golgi cisternae, as previously reported for plants (Vézina et al., 2009).

# AUTHOR CONTRIBUTIONS

MB conceived and supervised the study. MB, M-CK-M, and PZ designed the experiments. CP, CB, BG, PZ, CW, GT, CO, and AM performed the experiments. ZS provided CHO-gmt5 mutant cell line. MB, M-CK-M, PZ, CP, CB, AD, CO, and BG analyzed the data. PL, MB, PZ, M-CK-M, and BG wrote the manuscript. All authors read and agreed on the submission of the manuscript.

# FUNDING

M-CK-M, CP, CB, BG, CO, AD, PL, and MB are grateful to the SFR NORVEGE, the GRR IRIB, the University of Rouen Normandie as well as the IUF (Institut Universitaire de France), France for their financial support. PZ, CW, GT, AM, and ZS were supported by Strategy Positioning Funding from Biomedical Research Council of the Agency for Science, Technology and Research (A\*STAR), Singapore. PZ and MB are grateful to the MERLION program for supporting the development of the collaboration between BTI, Singapore and URN, France. The funding bodies were not involved in the design of the study, collection and interpretation of data or in the writing of the manuscript.

# ACKNOWLEDGMENTS

The authors thank Dr. Sophie Bernard from the PRIMACEN platform, the Glyco-MEV lab for her technical support in the cryo-fixation step for TEM, and Pr Maier and Dr. Hempel for their collaboration regarding Pt transformation.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2019.00610/full#supplementary-material






**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Zhang, Burel, Plasson, Kiefer-Meyer, Ovide, Gügi, Wan, Teo, Mak, Song, Driouich, Lerouge and Bardor. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Butterfly Pea (Clitoria ternatea), a Cyclotide-Bearing Plant With Applications in Agriculture and Medicine

#### Georgianna K. Oguis, Edward K. Gilding, Mark A. Jackson and David J. Craik\*

Institute for Molecular Bioscience, The University of Queensland, St Lucia, QLD, Australia

#### Edited by:

Suvi Tuulikki Häkkinen, VTT Technical Research Centre of Finland Ltd., Finland

#### Reviewed by:

Robert Burman, Uppsala University, Sweden; Christian W. Gruber, Medical University of Vienna, Austria; Blazej Slazak, Władysław Szafer Institute of Botany (PAN), Poland

> \*Correspondence: David J. Craik d.craik@imb.uq.edu.au

#### Specialty section:

This article was submitted to Plant Metabolism and Chemodiversity, a section of the journal Frontiers in Plant Science

Received: 15 January 2019 Accepted: 29 April 2019 Published: 28 May 2019

#### Citation:

Oguis GK, Gilding EK, Jackson MA and Craik DJ (2019) Butterfly Pea (Clitoria ternatea), a Cyclotide-Bearing Plant With Applications in Agriculture and Medicine. Front. Plant Sci. 10:645. doi: 10.3389/fpls.2019.00645

The perennial leguminous herb Clitoria ternatea (butterfly pea) has attracted significant interest based on its agricultural and medical applications, which range from use as a fodder and nitrogen fixing crop, to applications in food coloring and cosmetics, traditional medicine and as a source of an eco-friendly insecticide. In this article we provide a broad multidisciplinary review that includes descriptions of the physical appearance, distribution, taxonomy, habitat, growth and propagation, phytochemical composition and applications of this plant. Notable amongst its repertoire of chemical components are anthocyanins, which give C. ternatea flowers their characteristic blue color, and cyclotides, ultra-stable macrocyclic peptides that are present in all tissues of this plant. The latter are potent insecticidal molecules and are implicated as the bioactive agents in a plant extract used commercially as an insecticide. We include a description of the genetic origin of these peptides, which interestingly involves the co-option of an ancestral albumin gene to produce the cyclotide precursor protein. The biosynthesis step in which the cyclic peptide backbone is formed involves an asparaginyl endopeptidase, which in C. ternatea is known as butelase-1. This enzyme is highly efficient in peptide ligation and has been the focus of many recent studies on peptide ligation and cyclization for biotechnological applications. The article concludes with some suggestions for future studies on this plant, including the need to explore possible synergies between the various peptidic and non-peptidic phytochemicals.

#### Keywords: peptides, forage crop, anthocyanins, organic pesticide, butelase, medicinal plant

# INTRODUCTION

Clitoria ternatea, commonly known as butterfly pea, is a perennial herbaceous plant from the Fabaceae family. It has recently attracted a lot of interest as it has potential applications both in modern medicine and agriculture, and as a source of natural food colorants and antioxidants. C. ternatea has long been cultivated as a forage and fodder crop, and early studies assessed the plant for these purposes (Reid and Sinclair, 1980; Barro and Ribeiro, 1983; Hall, 1985). Numerous field trials in Queensland, Australia, eventually led to the registry of C. ternatea cv. 'Milgarra' (Oram, 1992), the only cultivar in Australia that was released for grazing purposes (Conway and Doughton, 2005). Additionally, C. ternatea has been widely used in traditional medicine, particularly as a supplement to enhance cognitive functions and alleviate symptoms of numerous ailments including fever, inflammation, pain, and diabetes (Mukherjee et al., 2008).

As early as the 1950s, studies on C. ternatea sought to elucidate its pharmacological activities, phytochemical composition and active constituents (Grindley et al., 1954; Piala et al., 1962; Kulshreshtha and Khare, 1967; Morita et al., 1976). The novel C. ternatea anthocyanins termed "ternatins", which give C. ternatea flowers their vivid blue color, were first isolated in 1985 (Saito et al., 1985). Following further isolation and structural characterization of numerous other ternatins, the ternatin biosynthetic pathway was postulated just over a decade later (Terahara et al., 1998). In 2003, comparison of C. ternatea lines bearing different floral colors provided insights into the role of acylation in C. ternatea floral color determination (Kazuma et al., 2003a). The abundance of these unique anthocyanins alongside other secondary metabolites in C. ternatea makes the plant an ideal source of natural additives that can enhance the appearance and nutritive values of consumer products (Pasukamonset et al., 2016, 2017, 2018; Siti Azima et al., 2017). Although a number of recent studies have endeavored to elucidate the pharmacological activities of C. ternatea (Adhikary et al., 2017; Kavitha, 2018; Singh et al., 2018), the contribution of individual extract components to any bioactivity measured remains unknown.

**Figure 1** summarizes some of the key agricultural and biochemical studies conducted on C. ternatea from the 1950s to the present, providing a convenient timeline of discoveries. The corresponding references to the key studies and milestones are listed in **Table 1**. In recent years, the small circular defense molecules in C. ternatea called cyclotides (Nguyen et al., 2011; Poth et al., 2011a,b; Nguyen et al., 2014) have fueled scientific innovations that may have an impact on modern agriculture, biotechnology and medicine. In 2017, Sero-X®, a cyclotide-containing eco-friendly pesticide made from extracts of C. ternatea, was approved for commercial use in Australia<sup>1</sup>. In addition, the C. ternatea cyclotide processing enzyme, butelase-1, which is the fastest ligase known to date and is capable of ligating peptides across a vast range of sizes (26 to >200 residues), can potentially be used in the large scale synthesis of macrocycle libraries and peptide-based pharmaceuticals (Nguyen et al., 2014, 2015).

#### Plant Description

Clitoria ternatea produces pentamerous zygomorphic pea-shaped flowers with a tubular calyx consisting of five sepals which are fused for about two thirds of their length. The showy corolla consists of five free petals, with one large and rounded banner, two wrinkled wings which are often half the length of the banner and two white keels which aid in protecting the floral organs (Cobley, 1956; Biyoshi and Geetha, 2012) (**Figure 2A**). The corollae are most often dark blue in color but may also occur in white and various blue and white shades in between (Morris, 2009; Biyoshi and Geetha, 2012). The diadelphous C. ternatea stamens consist of 10 filaments, of which nine are fused and one is free lying (Cobley, 1956; Biyoshi and Geetha, 2012). Attached to each filament is a pollen-bearing white anther, which consists of four lobes (Cobley, 1956; Pullaiah, 2000). C. ternatea produces a monocarpellary ovary bearing ten ovules (Pullaiah, 2000; Biyoshi and Geetha, 2012). Surmounting this is a long and thick style with a bent tip (Cobley, 1956; Biyoshi and Geetha, 2012). C. ternatea pods are narrow and flattened with pointy tips, and they typically contain around 10 seeds (Cobley, 1956) (**Figure 2B**). The seeds contain palmitic acid (19%), stearic acid (10%), oleic acid (51–52%), linoleic acid (17%) and linolenic acid (4%) (Grindley et al., 1954; Joshi et al., 1981). The caloric content of the seed is reported to be around 500 cal/100 g (Joshi et al., 1981). C. ternatea produces pinnate compound leaves that are obovate and entire with emarginate tips (Taur et al., 2010) (**Figure 2C**). The epidermis on both leaf surfaces consists of a single layer of cells protected by a thick cuticle and with trichome outgrowths (Taur et al., 2010). A layer of palisade cells, lignified xylem and paracytic stomata lie underneath the upper epidermis (Taur et al., 2010). C. ternatea produces an extensive deep-root system, which enables the plant to survive up to 7–8 months of drought (Cobley, 1956). The roots also produce large nodules for nitrogen fixation (Cobley, 1956) (**Figure 2D**).

#### Taxonomy, Geographic Distribution and Habitat

The genus Clitoria occurs in tropical and subtropical environments across the globe. The number of subfamilial taxa remains unclear, and as in the case of Clitoria, the descriptions of species and citations of type specimens are noted as being incomplete or incorrect according to Fantz (1977). Thus, it is difficult to estimate species richness of the genus. Within Clitoria, three subgenera have been described and held as valid according to the monograph of Clitoria. Across all three subgenera, Fantz retains 58 species as valid, with numerous lower classifications of varieties and subspecies (Fantz, 1977).

Clitoria ternatea is the holotype of Clitoria subgenus Clitoria, and represents the archetypical Clitoria. The etymology of the specific name is postulated to be from the island of Ternate in the Indonesian archipelago because it is from specimens from that location that Linnaeus produced the specific description. Ternate is not in the Indian Ocean but is instead in the Molucca Sea and in eastern Indonesia, lending ambiguity to the native range of the species. The distribution of all other taxa in subgenus Clitoria is restricted to Southern and Eastern Africa, India, Madagascar, and other islands of the Western Indian Ocean (**Figure 3**). The exact geographic origin of C. ternatea is thus difficult to determine, but we may infer from the center of diversity for subgenus Clitoria, that C. ternatea arose in or around the Indian Ocean and not the Pacific Ocean or South China Sea where it has been in use as a food coloring historically (Fantz, 1977; Staples, 1992). It is also entirely possible that the taxon we know as C. ternatea is an ancient hybrid of one or more members of the subgenus Clitoria that had subsequently been introduced to Southeast Asia. Testing of this synthetic origin hypothesis would require large scale genetics work on C. ternatea and related taxa like Clitoria biflora, C. kaessneri, C. lasciva, and C. heterophylla. Regardless of the specific geographical origin and evolutionary history of C. ternatea, the present day distribution of naturalized

<sup>1</sup>https://innovate-ag.com.au/

populations of C. ternatea is pantropical, as facilitated by key characteristics of the species: tolerance to drought conditions, non-reliance on specific pollinators because of self-pollination, and nitrogen fixation capability (Cobley, 1956; Staples, 1992; Conway et al., 2001). It is also possible to cultivate and maintain populations in subtropical regions (e.g., Wee Waa, NSW, located at −30.2, 149.433333).

The habitat of C. ternatea is open mesic forest or shrub land (personal observations of authors and records in the Australasian Virtual Herbarium<sup>2</sup>). In Australia, the authors note that populations of C. ternatea occur in tropical regions in open areas where sunlight is plentiful due to a sparse canopy and in areas near where fresh water would collect such as the border of wetlands, small gullies, or at the base of rocky hillsides. When present, the plants are often vigorous and smother other vegetation.

#### Growth and Propagation

Germination and establishment of C. ternatea are most favorable when the temperature is between 24 and 32°C, and when seeds are sown in moist soil at 2.5–5 cm deep and 20–30 cm apart (McDonald, 2002; Conway, 2005). Although C. ternatea can withstand arid conditions (Cobley, 1956), the plant grows best with ample moisture and rainfall (650–1250 mm) and when the temperature reaches 27°C or higher (Conway and Collins, 2005). Like most tropical legumes, C. ternatea is susceptible to frost damage (Conway and Collins, 2005). However, it can retain its leaves for as long as 7 days, and its woody parts typically recover (Conway and Collins, 2005).

<sup>2</sup>https://avh.chah.org.au/

Despite its hardy features, one of the impediments in propagating C. ternatea is its low seed germination rate. This problem has long been recognized, as is evident from a study conducted in 1967 (Mullick and Chatterji, 1967). The study showed that freshly harvested C. ternatea seeds would not imbibe water and germinate (Mullick and Chatterji, 1967). On the other hand, storing the seeds for another 6 months promoted germination in 15–20% of the seeds (Mullick and Chatterji, 1967). Scarification by soaking the seeds in boiling water or sulfuric acid was also found to promote C. ternatea seed germination (Cruz et al., 1995), and soaking the seeds in concentrated sulfuric acid for at least 10 min resulted in a reported 100% seed germination rate (Patel et al., 2016).

In vitro propagation can circumvent the unreliably low seed germination rate in C. ternatea. It can also be an alternative method for conserving and mass propagating C. ternatea lines with superior qualities. In 1968, a study determined the effects of adding ascochitine on the growth of C. ternatea embryos (Lakshmanan and Padmanabhan, 1968). That study reported that 60% of the embryos produced callus in both the upper and lower hypocotyl when 5–10 ppm ascochitine was added to the culture media. Numerous studies have since been conducted from 1990 to 2016 to determine the optimal plant hormone concentrations, basal media types and explant types for C. ternatea in vitro propagation (**Table 2**).

With the optimal hormone concentrations supplemented in the basal medium, callus production was observed from mature C. ternatea embryos, leaf and root explants obtained from aseptic seedlings (Lakshmanan and Dhanalakshmi, 1990; Shahzad et al., 2007; Mohamed and Taha, 2011). In some instances, prolonged explant maintenance in the same callus induction medium led to embryoid production (Lakshmanan and Dhanalakshmi, 1990). Recently, a study described a protocol to produce encapsulated embryogenic callus from leaf explants using the optimal hormone concentrations and 3% sodium alginate (Mahmad et al., 2016). The study reported that more than 50% of the encapsulated explants stored at 4°C for 90 days survived (Mahmad et al., 2016). Studies showed that shoots can be regenerated from callus (Shahzad et al., 2007; Mahmad et al., 2016). Alternatively, shoots can also be induced and proliferated directly from different explant types such as isolated shoot buds (Lakshmanan and Dhanalakshmi, 1990), axillary buds (Mhaskar et al., 2011), shoot tips (Pandeya et al., 2010), leaf (Mohamed and Taha, 2011), and root (Shahzad et al., 2007) from aseptic seedlings, cotyledonary nodes (Pandeya et al., 2010; Mukhtar et al., 2012) and nodal explants (Rout, 2005; Pandeya et al., 2010; Ismail et al., 2012; Mukhtar et al., 2012). When subsequently placed in a medium supplemented with the optimal auxin concentrations, these in vitro grown C. ternatea shoots produced roots in vitro (Lakshmanan and Dhanalakshmi, 1990; Rout, 2005; Shahzad et al., 2007; Mhaskar et al., 2011; Mohamed and Taha, 2011; Ismail et al., 2012; Mukhtar et al., 2012). Alternatively, ex vitro root production was observed when elongated shoots were soaked in a concentrated auxin solution (Pandeya et al., 2010).

Moreover, a study has described propagation of C. ternatea via hairy root cultures (Swain et al., 2012b). Using the wildtype Agrobacterium rhizogenes strain A4T with the optimal culture conditions, a transformation frequency of as high as 85.8% was observed (Swain et al., 2012b). Compared to roots obtained from outdoor grown plants, C. ternatea hairy root cultures produced fourfold the amount of taraxerol, an anticancer triterpenoid compound that is naturally produced in C. ternatea roots (Swain et al., 2012a).

#### HISTORICAL AND CURRENT APPLICATIONS

#### Agriculture

#### Fodder and Forage Crop

Clitoria ternatea has long been cultivated as a forage crop (Cobley, 1956), with yields reaching 17–29 tons/ha of palatable hay for cattle (Barro and Ribeiro, 1983; Abdelhamid and Gabr, 1993). This yield is on par with that of the established forage crop alfalfa (Medicago sativa), which C. ternatea could potentially replace in warm areas with low rainfall (Barro and Ribeiro, 1983). In Australia, C. ternatea has been cultivated predominantly in Queensland, due to its adaptability in the arid regions and persistence in heavy-textured farm lands (Hall, 1985). In 1991, the Queensland Department of Primary Industries released the C. ternatea cv. 'Milgarra' mainly for grazing purposes (Oram, 1992). Milgarra is a composite of 21 introduced and naturalized C. ternatea lines that were grown for over three generations (Oram, 1992). As it is a composite cultivar, phenotypic variations are commonly observed in the field (Conway and Doughton, 2005).

Timing of harvest has been demonstrated to be important for maximizing dry matter content and digestibility of C. ternatea hay, with 45 days shown to be optimal (Mahala et al., 2012). Further increases in dry matter content have been reported if C. ternatea is pruned every 42 days at 20 cm (Colina et al., 1997), with dry matter yields of 1122 kg/ha reported. Compared to other legumes, animal feeds prepared from C. ternatea have consistently lower acid detergent fiber content. This low amount of acid detergent fiber increases energy density of the feed, and retains a high nitrogen content (Jones et al., 2000). Thus, feeds made from this plant have favorable nutritional characteristics compared to other legume forages. C. ternatea is also a great source of carotenoids with the carotenoid content of a 6-month old hay reaching 600 mg/kg dry matter (Barro and Ribeiro, 1983).

#### Nitrogen Fixation and Improvement of Soil Nutrient Content

Clitoria ternatea roots produce large round nodules (Cobley, 1956) (**Figure 2D**) known to house nitrogen-fixing bacteria, making the plant ideal for use in a crop rotation system. As early as the 1970s, studies were conducted to assess the nitrogen-fixing capacity of C. ternatea (Oblisami, 1974; De Souza et al., 1996). Nodulation was shown to be more favorably induced with a soil moisture content of around 25–45% with a light duration of 11–14 h and an intensity of 11–17 W/m<sup>2</sup> (Habish and Mahdi, 1983). Supplementing the soil with sulfur was also demonstrated as beneficial for nodule formation (Zaroug and Munns, 1980a). Several studies have reported the benefits of C. ternatea to soil health (De Souza et al., 1996; Dwivedi and Kumar, 2001; Kamh et al., 2002; Alderete-Chavez et al., 2011). Field trials conducted in Mexico reported that at 180 days post planting of C. ternatea, the organic matter, N, P, and K content of the soil all increased significantly (Alderete-Chavez et al., 2011). A similar study conducted in India reported that intercropping C. ternatea with the fodder crop Setaria sphacelata enriched the N content of the soil to an estimated 39.8 kg/ha (Dwivedi and Kumar, 2001). The results suggest that intercropping C. ternatea may potentially lead to a shorter fallow period requirement (Njunie et al., 2004).

Symbols represent C. biflora, C. heterophylla, C. kaessneri, C. lasciva, and C. ternatea.

When considering crop rotations, it is important to determine the cross nodulation capacity of nitrogen-fixing Rhizobium species. One study showed that the Rhizobium species isolated from C. ternatea, cow pea and soybean are more compatible with each other than with those from other legume species (Oblisami, 1974), while cross inoculation of Rhizobium sp. from C. ternatea and the legume species Phaseolus vulgaris, M. sativa, and Pisum sativum produced no nodules (Oblisami, 1974). These studies provide insights as to which legume species, when planted together with C. ternatea, are more likely to form nodules and thereby yield the most soil benefits. Another early study showed that the symbiotic efficiencies measured, based on C. ternatea dry matter yield, varied depending on the Rhizobium sp. strains tested (Zaroug and Munns, 1980b). A more recent study reported the isolation and identification of 11 rhizobial strains from C. ternatea grown in Thailand (Duangkhet et al., 2018). The 16S rDNA phylogenetic analysis revealed that ten of these isolates were Bradyrhizobium elkanii strains while the remaining isolate was a Bradyrhizobium japonicum strain. These C. ternatea B. elkanii strains were shown to promote better plant growth and induce higher nitrogen-fixing capacity than B. elkanii strains isolated from soybean (Duangkhet et al., 2018).

#### Medicine

The popular use of C. ternatea in traditional medicine has stimulated researchers to elucidate the pharmacological activities of extracts obtained from various C. ternatea tissues. Numerous animal studies have reported that the extracts exhibit diuretic, nootropic, antiasthmatic, anti-inflammatory, analgesic, antipyretic, antidiabetic, antilipidemic, anti-arthritic, antioxidant, and wound healing properties. The results of the animal and in vitro studies are summarized in **Tables 3** and **4**, respectively. Although these combined studies claim that C. ternatea extracts showcase a diverse range of pharmacological properties, many of these studies are preliminary and require more thorough investigation. In many instances the authors have attributed the extract activities to the presence of flavonols and anthocyanins; however, attempts to isolate and test individual components are limited. Indeed, several components in C. ternatea extracts could be acting synergistically. For instance, cyclotides, which have been reported to have immunosuppressive properties, may contribute (Gründemann et al., 2012, 2013; Thell et al., 2016), as could the abundance of delphinidins (Sogo et al., 2015; Tani et al., 2017; Harada et al., 2018).

#### Nootropic Activity

Several studies have reported improvement in cognitive performance when C. ternatea extracts were administered to experimental animals (Taranalli and Cheeramkuzhy, 2000; Rai et al., 2001; Jain et al., 2003). In one study, rats orally dosed with ethanolic extracts derived from C. ternatea roots or aerial tissues showed greater attenuation of electric shock-induced amnesia than the controls (Taranalli and Cheeramkuzhy, 2000). In a separate study, 7-day old neonatal rats orally dosed with aqueous C. ternatea root extract also showed improved memory retention and enhanced spatial learning performance 48 h and 30 days post treatment (Rai et al., 2001). Further investigations revealed that the brains of treated rats contained a significantly higher acetylcholine content than the controls (Taranalli and Cheeramkuzhy, 2000; Rai et al., 2002). A more recent study of the effects of C. ternatea leaf extracts on diabetes-induced cognitive decline showed that the acetylcholinesterase activity, total nitric oxide levels and lipid peroxide levels all significantly decreased upon treatment, whilst the catalase, superoxide dismutase and glutathione levels all significantly increased (Talpate et al., 2014). Another recent study showed that rats fed for 60 days with "medhya rasayana," a mixture of crushed C. ternatea and jaggery (1:1), exhibited significant reduction in autophagy in the brain (Raghu et al., 2017). The treated and the control rats also differentially expressed genes implicated in autophagy regulation, nucleotide excision repair, homologous recombination, and other processes. The study suggested that C. ternatea protects the brain by affecting the autophagy directed pathway (Raghu et al., 2017).

#### TABLE 2 | Summary of published Clitoria ternatea in vitro propagation studies.

KN, kinetin; BAP, 6-benzylaminopurine; TDZ, thidiazuron; IAA, indole-3-acetic acid; IBA, indole-3-butyric acid; 2,4-D, 2,4-dichlorophenoxyacetic acid; NAA, 1-naphthaleneacetic acid; GA, gibberellic acid; 2iP, N<sup>6</sup>-(2-isopentenyl)adenine; MS, Murashige and Skoog medium (3% sucrose (suc) unless otherwise stated); DKW, Driver Kuniyuki Walnut medium (3% suc).

#### Anti-inflammatory, Analgesic, and Antipyretic Activity

Extracts of C. ternatea roots and leaves have been reported to demonstrate anti-inflammatory, analgesic, and antipyretic activities (Devi et al., 2003; Parimaladevi et al., 2004; Bhatia et al., 2014; Singh et al., 2018). Oral administration of the methanolic root extracts and ethanolic floral extracts of C. ternatea was reported to significantly inhibit carrageenin-induced rat paw oedema and acetic acid-induced vascular permeability in rats (Devi et al., 2003; Singh et al., 2018). Results with an oral dosage of 400 mg extract per kg body weight were on par with a 20 mg/kg oral dosage of diclofenac sodium (Devi et al., 2003), a non-steroidal anti-inflammatory drug. In an antipyretic study, oral administration of C. ternatea methanolic root extracts significantly reduced the body temperature of Wistar rats that had yeast-induced elevated body temperature (Parimaladevi et al., 2004). This antipyretic activity of the extract was found to be comparable to paracetamol (Parimaladevi et al., 2004). More recently, C. ternatea leaf extracts have been implicated for use as an analgesic (Bhatia et al., 2014). In this study the established rat tail flick pain assay was used to determine the effects of pretreatment with both ethanolic and petroleum C. ternatea extracts. A positive analgesic effect of C. ternatea leaf extracts was reported, comparable to diclofenac sodium (10 mg/kg) 1 h post treatment (Bhatia et al., 2014).

#### TABLE 4 | In vitro studies demonstrating the pharmacological properties of Clitoria ternatea extract.

#### Antidiabetic Activity

Recently, C. ternatea leaf extracts have shown potential for use as an antidiabetic (Chusak et al., 2018b; Kavitha, 2018). Wistar rats orally dosed with 400 mg C. ternatea ethanolic leaf extract per kg of body weight per day for 28 days had significantly lower levels of blood glucose, insulin, glycosylated hemoglobin, urea and creatinine than the diabetic control. Furthermore, the levels of liver enzymes (serum glutamate oxalate transaminase, serum glutamate pyruvate transaminase, lactate dehydrogenase, and alkaline phosphatase) in treated rats were lower than in the diabetic control rats and were comparable to those of the normal control rats (Kavitha, 2018). Other studies have focused on the effects of C. ternatea extracts on the glycemic response and antioxidant capacity in humans (Chusak et al., 2018b). A small scale clinical trial involving 15 healthy males revealed that when either 1 or 2 g of C. ternatea extract was ingested together with 50 g sucrose, the resulting plasma glucose and insulin levels were suppressed (Chusak et al., 2018b). Furthermore, the postprandial plasma antioxidant capacities of the subjects were also enhanced upon extract consumption.

#### Antioxidant Activity

The antioxidant properties of C. ternatea extracts are well documented (Phrueksanan et al., 2014; Sushma et al., 2015). One study demonstrated that C. ternatea extracts could protect canine erythrocytes from hemolysis and oxidative damage induced by 2,2′-azobis-2-methyl-propanimidamide dihydrochloride (AAPH) (Phrueksanan et al., 2014). Compared to the AAPH control, erythrocytes treated with 400 µg/mL of the C. ternatea extract had significantly lower levels of AAPH-induced lipid peroxidation and protein oxidation, and significantly higher levels of glutathione (Phrueksanan et al., 2014). In another study the antioxidant properties within a C. ternatea extract facilitated the production of magnesium oxide nanoparticles, materials which are increasingly being utilized for biomedical applications (Sushma et al., 2015).

#### Pesticidal Activities

The anthelmintic and insecticidal activities, and the antimicrobial activities of C. ternatea extracts and several isolated protein and peptide components are summarized in **Tables 5** and **6**, respectively. These biological activities presumably evolved for host-defense purposes but can have potential applications both in agriculture and medicine. Further details on these activities are described in the following sections.

#### Anthelmintic Activity

The anthelmintic properties of C. ternatea have been reported in several studies (Hasan and Jain, 1985; Khadatkar et al., 2008; Salhan et al., 2011; Kumari and Devi, 2013; Gilding et al., 2015) (**Table 5**). Characterization of 27 homozygous C. ternatea lines showed that individual lines displayed different degrees of resistance against the parasitic root-knot nematode, Meloidogyne incognita (Hasan and Jain, 1985). The methanolic extract of C. ternatea was also found to inhibit 93% of M. incognita eggs from hatching (Kumari and Devi, 2013). In another study that utilized the model organism, Caenorhabditis elegans, C. ternatea extracts were found to effectively kill nematode larvae, with the root extracts showing greater lethality than the leaf extracts (Gilding et al., 2015). Two studies also reported C. ternatea activities against annelids (Khadatkar et al., 2008; Salhan et al., 2011). Using Pheretima posthuma as a test worm, one study showed that the ethanolic C. ternatea extract (50 mg/mL) caused a significantly higher mortality rate and incidence of worm paralysis than piperazine citrate, a commonly used drug for controlling parasitic worms (Khadatkar et al., 2008). Similarly, using Eisenia foetida as a test worm, another study showed that the ethanolic and aqueous C. ternatea extracts induced worm paralysis and mortality at 100 mg/mL (Salhan et al., 2011). However, compared to the commonly used antiparasitic drug levamisole, the rate of worm paralysis and death was significantly slower with the C. ternatea extracts (Salhan et al., 2011).

#### TABLE 5 | Anthelmintic and insecticidal activities of Clitoria ternatea.

#### TABLE 6 | Antimicrobial activities of Clitoria ternatea.

#### Insecticidal Activity

Proteins and peptides isolated from C. ternatea are reported to exhibit insecticidal properties (Kelemu et al., 2004; Poth et al., 2011a) (**Table 5**). One study reported 100% larval mortality when 1% w/w and 5% w/w of the purified C. ternatea protein (20 kDa), finotin, was applied to the bruchids Acanthoscelides obtectus and Zabrotes subfasciatus, respectively (Kelemu et al., 2004). Another study showed that when the C. ternatea cyclotide, Cter M, was incorporated in the diet of the lepidopteran species Helicoverpa armigera, larval growth retardation was observed in a dose dependent manner (Poth et al., 2011a). Larval mortality was observed at 1 µmol Cter M peptide g<sup>−1</sup> diet (Poth et al., 2011a).

Expanding on the initial findings of Poth et al. (2011a), additional studies have reported pesticidal activities of cyclotide extracts from C. ternatea (Gilding et al., 2015; Mensah et al., 2015) (**Table 5**). Gilding et al. (2015) showed that C. ternatea extracts permeabilized insect-like membrane lipids, with the shoot extracts exhibiting the greatest potency (0.31 µg/mL LC50). Another study reported that application of an oil-based C. ternatea mixture (1–2% v/v) to transgenic and conventional cotton crops resulted in Helicoverpa spp. larval mortality and reduced oviposition and larval feeding (Mensah et al., 2015). Detrimental effects of the extract against beneficial insects were not observed (Mensah et al., 2015), suggesting that C. ternatea extracts could provide the basis for eco-friendly natural insecticides.

#### Antimicrobial Activity

The antimicrobial properties of proteins isolated from C. ternatea have previously been described (Kelemu et al., 2004; Ajesh and Sreejith, 2014) (**Table 6**). The C. ternatea 20 kDa protein finotin demonstrated inhibitory activities over a wide range of plant fungal pathogens (Kelemu et al., 2004). Finotin also exhibited activities against the plant bacterial pathogen Xanthomonas axonopodis (Kelemu et al., 2004). Another study reported isolation of a 14.3 kDa protein from C. ternatea seeds (Ajesh and Sreejith, 2014) that exhibited activities against the human fungal pathogens, Cryptococcus spp. and Candida spp., and against a number of mold fungi (Ajesh and Sreejith, 2014). Studies also reported the antimicrobial properties of C. ternatea cyclotides against Gram-negative, but not Gram-positive, bacteria (Nguyen et al., 2011, 2016b).

Ethanol extract of C. ternatea outdoor grown leaves and calli inhibited the growth of the bacterial species Staphylococcus spp., Streptococcus spp., Enterococcus faecalis, and Bacillus spp. (Shahid et al., 2009). On the other hand, the antibacterial activities of the calli aqueous extract were only limited to Bacillus spp. and Streptococcus pyogenes; and activity of the leaf aqueous extract was limited to Bacillus spp. (Shahid et al., 2009). Furthermore, a recent study reported that the ultrasound-assisted aqueous extract of C. ternatea leaves and petals inhibited the growth of Staphylococcus aureus (Anthika et al., 2015). C. ternatea petals extracted for 30 min using ultrasound yielded the highest anthocyanin content and also displayed the highest antibacterial activity (Anthika et al., 2015).

The antifungal properties of C. ternatea have also been reported (Kamilla et al., 2009; Das and Chatterjee, 2014) (**Table 6**). Growth of the mold fungus Aspergillus niger was inhibited at a minimum inhibitory concentration of 0.8 mg/mL of the methanolic C. ternatea leaf extract (Kamilla et al., 2009). Scanning electron microscopy images from the study revealed that addition of the extract led to conidial and hyphal collapse and distortion, which is likely due to cell wall disruption (Kamilla et al., 2009). Another study reported that the 50% aqueous-ethanolic C. ternatea leaf extract inhibited the growth of Fusarium oxysporum and promoted the activities of amylase, protease and dehydrogenase in P. sativum seeds, enzymes that otherwise had low activities during F. oxysporum infestation (Das and Chatterjee, 2014).

#### PHYTOCHEMICAL COMPOSITION

#### Non-proteinaceous Components

#### Flavonols

As early as 1967, a study reported that C. ternatea seeds contain flavonol glycosides as well as phenolic aglycones, cinnamic acid, and a range of other compounds (Kulshreshtha and Khare, 1967). Nearly two decades later, Saito et al. (1985) reported the isolation of five C. ternatea flavonols, namely kaempferol, kaempferol 3-glucoside, kaempferol 3-robinobioside-7-rhamnoside, quercetin, and quercetin 3-glucoside. Other studies reported the isolation of flavonol glycosides from C. ternatea leaves (Morita et al., 1976) and flowers (Kazuma et al., 2003a,b). With some exceptions, the identified flavonol glycosides were found in all C. ternatea lines bearing different floral colors (blue, mauve and white) (Kazuma et al., 2003a). For instance, myricetin 3-(2″-rhamnosyl-6″-malonyl)glucoside, myricetin 3-rutinoside and myricetin 3-glucoside were not detected in the C. ternatea line bearing mauve petals (Kazuma et al., 2003a). The flavonols isolated from C. ternatea are summarized in **Table 7**.

#### Anthocyanins

In 1985, six acylated anthocyanins were isolated from blue C. ternatea flowers that were all derivatives of delphinidin 3,3′,5′-triglucoside (Saito et al., 1985). The chemical properties of the acylated C. ternatea delphinidins, which were named ternatins, were further elucidated in subsequent studies (Terahara et al., 1989a, 1990a,b). In 1989, the structure of the largest isolated blue anthocyanin, ternatin A1, was determined (Terahara et al., 1989a). The study also showed that not only was ternatin A1 the largest, it was also one of the most stable in neutral solution (Terahara et al., 1989a). The structures of ternatins A2 (Terahara et al., 1990c), B1 (Kondo et al., 1990), B2 (Terahara et al., 1996), D1 (Terahara et al., 1989b), and D2 (Terahara et al., 1996) were elucidated shortly after.

#### TABLE 7 | Flavonol and anthocyanin content of Clitoria ternatea.

Subsequent studies isolated and determined the structures of several other novel ternatins from C. ternatea: ternatins A3, B3–B4, C1–C5, D3, and preternatins A3 and C4 (Terahara et al., 1996, 1998) (**Table 7**). Terahara et al. (1998) observed that lower molecular weight ternatins are more abundant in young flowers while higher molecular weight ternatins are more prevalent in mature flowers. The authors proposed that ternatin A1 is the final compound, and the other ternatins are intermediate products (Terahara et al., 1998). Starting with ternatin C5, production of ternatin A1 can be achieved via four p-coumaric acid acylation steps and four glucosylation steps (Terahara et al., 1998). The biosynthetic pathway of ternatin A1 is summarized in **Figure 4** (Terahara et al., 1998). The key enzymatic steps and the biosynthetic pathway to produce ternatin C5 from delphinidin were elucidated in 2004 (Kazuma et al., 2004).

A 2003 study compared the anthocyanin contents of C. ternatea lines bearing different floral colors (Kazuma et al., 2003a). The study showed that white C. ternatea flowers do not produce anthocyanins. Furthermore, unique to the mauve C. ternatea flowers is the accumulation of delphinidins lacking the 3′ and 5′ (polyacylated) glucosyl group substitutions (Kazuma et al., 2003a). The study concluded that glucosylation of delphinidins at these positions is crucial to the production of blue C. ternatea flowers (Kazuma et al., 2003a).

#### Other Non-proteinaceous Components

The pentacyclic triterpenoids, taraxerol and taraxerone, were isolated from C. ternatea roots in the 1960s (Banerjee and Chakravarti, 1963, 1964). Realizing the potential of C. ternatea as a source of taraxerol, in 2008, a method was developed for the routine quantification of this medicinal compound in C. ternatea extracts (Kumar et al., 2008). In 2012, in vitro propagated hairy root cultures were sought as an alternative to in vivo grown roots as a source of taraxerol (Swain et al., 2012a). In 2016, in addition to taraxerol, novel norneolignans, clitorienolactones A-C, were isolated from C. ternatea roots (Vasisht et al., 2016). C. ternatea floral extracts also contain other types of flavonoids, including rutin (flavone), epicatechin (flavanol) and other polyphenolic acids (gallic acid, protocatechuic acid, and chlorogenic acid) (Siti Azima et al., 2017).

#### Proteinaceous Components

In general, there has traditionally been a greater focus in phytochemical studies on the non-protein components of plants, and C. ternatea is no exception. However, over the last decade, with advances in nucleic acid sequencing and mass spectrometric peptide and protein characterization techniques, there is now much more focus on proteinaceous components, particularly in the characterization of peptides and proteins implicated in plant defense. Of the known C. ternatea phytochemical components implicated in defense, a class of peptides known as cyclotides is particularly noteworthy (Nguyen et al., 2011; Poth et al., 2011a,b). These peptides mature into cyclic molecules of ∼30 aa from linear precursors through an enzymatic transpeptidation reaction of the peptide backbone. Cyclotides contain three disulfide bonds that form a knot (**Figure 5A**), similar to configurations seen in linear knottins cataloged across diverse taxa (Gelly et al., 2004). Together, the cyclic and knotted nature of cyclotides makes them highly stable in conditions that would otherwise facilitate peptide degradation (Craik et al., 1999). Unlike linear knottins, which are found across multiple kingdoms of life, cyclotides are restricted to relatively few taxa in Viridiplantae, namely the dicotyledon angiosperms (Gruber et al., 2008). Searching all Viridiplantae sequences for cyclotides using the widely distributed program tblastn has highlighted the restriction of cyclotides and linear non-cyclotide-like sequences to a handful of plant families discussed below (Altschul et al., 1990).
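A search of this kind could be set up roughly as follows. This is a minimal sketch, not the procedure used in the cited work: the query file name, database name, and E-value cut-off are hypothetical placeholders, and only standard BLAST+ `tblastn` options (`-query`, `-db`, `-evalue`, `-outfmt`) are used. The function only assembles the command line, so it can be inspected without BLAST+ being installed.

```python
import shlex


def build_tblastn_command(query_fasta, database, evalue=1e-3):
    """Assemble a tblastn command line (protein queries vs. translated nucleotides).

    All argument values are placeholders chosen for illustration.
    """
    return [
        "tblastn",
        "-query", query_fasta,    # protein FASTA of known cyclotide sequences
        "-db", database,          # nucleotide database to be searched
        "-evalue", str(evalue),   # assumed cut-off, not taken from the source
        "-outfmt", "6",           # tabular output, one hit per line
    ]


# Hypothetical file and database names.
cmd = build_tblastn_command("cyclotide_queries.faa", "viridiplantae_nt")
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run` on a system where BLAST+ and a suitable database are available.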

Despite reports of cyclotide-like sequences in the Poaceae, none of the described sequences have been shown to exist as cyclic molecules in planta, thus failing the definition of the term cyclotide. The taxonomic distribution of cyclotides is often disjointed within a taxonomic group; for instance, the taxonomically sparse occurrence of cyclotides observed in the Rubiaceae (Gruber et al., 2008; Koehbach et al., 2013) contrasts with the ubiquitous occurrence of cyclotides in all species of the Violaceae family tested so far (Burman et al., 2015; Göransson et al., 2015). Within the Fabaceae, C. ternatea is the only family member in which cyclotides have been observed despite examination of diverse Fabaceae, including other species of Clitoria (Gilding et al., 2015). Cyclotides are therefore one of the most interesting proteinaceous components of C. ternatea. That they are processed from genetically encoded precursor proteins opens opportunities for detecting them as nucleic acid sequences, peptide sequences, or both.

#### Gene and Transcript Characterization

RNA-seq experiments to define the transcripts that encode cyclotides have been performed by several groups. The resulting transcriptomes have allowed the cataloging of at least 74 cyclotide sequences (**Table 8**) which exhibit high levels of diversity at loops intervening the conserved Cys residues (**Figures 5B,C**) (Gilding et al., 2015; Nguyen et al., 2016b). All of the precursors observed have singleton cyclotide domains similar to that observed in Petunia x hybrida (Solanaceae), whereas cyclotide precursors from the Cucurbitaceae, Rubiaceae, and Violaceae families possess multiple cyclotide domains (**Table 9**) (Felizmenio-Quimio et al., 2001; Mylne et al., 2011; Poth et al., 2011a; Koehbach and Gruber, 2015; Park et al., 2017). Unlike precursors from the Cucurbitaceae, Rubiaceae, Solanaceae, and Violaceae, C. ternatea cyclotides are encoded in albumin-1 genes (Poth et al., 2011a).

Pea-like albumin-1 genes are restricted to the subfamily Faboideae, as evidenced by the lack of hits when albumin-1 prepropeptide sequences from P. sativum are used as queries in a tblastn search on all sequences exclusive of the Faboideae. The canonical albumin-1 gene structure in all taxa examined thus far consists of a signal peptide followed immediately by a b-chain peptide domain with a knottin fold, a short intervening sequence, and a ∼54 aa a-chain domain (**Figure 6**, Cter M precursor). Typical functions ascribed to the albumin-1 gene family include protein storage and defense through the potentially toxic b-chain. Their function as a toxin is exemplified in the Pisum sativum albumin-1 b-chain (Pa1b), a peptide that effectively kills weevils and select insects through inhibition of insect vacuolar proton pumps (Jouvensal et al., 2003; Chouabe et al., 2011).

Interestingly, the loops between the cystine residues are similar in size, and in some cases in residue composition, between C. ternatea cyclotides and other albumin-1 b-chains (Gilding et al., 2015). This observation implies that the development of cyclotide domains from progenitor albumin-1 b-chains would have involved adaptation of the b-chain into a cyclotide domain structure. A further necessary adaptation to facilitate cyclization is the acquisition of an Asp or Asn residue at the C-terminus of the cyclotide domain. These specific residues are required for cyclization by asparaginyl endopeptidases (AEPs) through a transpeptidation reaction between the C-terminal Asp/Asn residue and the N-terminal residue (**Figure 6**) (Nguyen et al., 2014; Harris et al., 2015).

In C. ternatea, all transcripts encoding albumin-1 a-chain domains contain a cyclotide domain in place of what would otherwise be the b-chain domain. The complete transition of this region in C. ternatea albumin-1 genes to cyclotide domains implies canonical b-chains were disfavoured in the evolutionary history of C. ternatea (Gilding et al., 2015). The albumin-1 gene family members of C. ternatea are ∼74 in number, whereas albumin-1 gene families from the other genome-sequenced Faboideae, Glycine, Medicago, and Phaseolus, are 3, 33, and 17 in number respectively (Goodstein et al., 2012; Gilding et al., 2015; Nguyen et al., 2016b). This observation on albumin-1 gene expansion in C. ternatea further supports the hypothesis that cyclotide domains exhibit qualities and functions that increase fitness.

Transcriptomic expression analysis of various C. ternatea organs illustrates the partitioning of cyclotide expression to certain organs for some precursors, while other precursors are expressed constitutively throughout the examined organs (Gilding et al., 2015). As a class of defense molecules, it is logical that some would be preferentially expressed to target specific threats that different organs may face. Other albumin-1 genes are expressed at a notable level throughout the whole plant. The precursor for Cter M is an example of a cyclotide that is expressed constitutively, so much so that transcripts encoding Cter M are upward of ninefold higher than the rubisco small subunit in shoots (Gilding et al., 2015). Clearly, the plant is investing large amounts of resources to produce these transcripts and the resulting peptides.

#### Peptide Characterization

Clitoria ternatea cyclotides generally have Gly residues at the proto-N-terminus and Asn residues at the proto-C-terminus of the cyclotide domain within the precursor proteins, similar to the case in other plant families. By contrast with the conserved terminal residues, the intervening backbone loops between the conserved Cys residues tend to be variable in size and sequence. Some of the biophysical properties of C. ternatea cyclotides deviate notably from those of peptides from other cyclotide-producing plant families. For example, Cter 13 contains eight Arg residues that confer a predicted charge of +7 and pI of 10, well above that predicted for MCoTI-I, which contains four Arg residues, from the cucurbit Momordica cochinchinensis (Felizmenio-Quimio et al., 2001; Mylne et al., 2012). The more highly charged and high-pI cyclotides of C. ternatea are preferentially expressed in organs that encounter challenges from the soil, namely the roots and seeds of the plant. Cyclotide extracts from roots, compared to leaves, exhibit increased toxicity against the juvenile L1 stage of the model nematode C. elegans, whereas adults and late stage juveniles were not affected (Gilding et al., 2015). The high charge of these potentially nematicidal peptides is in line with that of other described nematicidal peptides (Liu et al., 2011). Further study is required to test for specific activity of organ-specific cyclotides against target organisms.

#### TABLE 8 | Cyclotides in Clitoria ternatea.

<sup>∗</sup>No mass spec data available; <sup>α</sup>predicted to be non-cyclic; <sup>β</sup>low confidence sequences.

#### TABLE 9 | Characteristics of cyclotide gene precursors.

Cyclotide sequences observed in aerial tissue typically have lower predicted charges and pI values than cyclotides in soil-contacting tissues. Cyclotide extracts of these aerial tissues exhibit a different MALDI-MS profile compared to other plant parts and a greater propensity to bind to insect-mimetic plasma membranes. This implies that the aerially expressed cyclotides are targeting insects (Gilding et al., 2015).

The cyclotide diversity of C. ternatea is further increased by post-translational modifications (PTMs). Serra et al. (2016) described the first observations of hexosylation and methylation of cyclotides through enzymatic digests and MS techniques, with the estimated cyclotide diversity conferred by primary sequence and PTM diversity numbering in the hundreds. The function of these post-translational modifications remains to be defined. Modifications of amino acid side chains reported in cyclotides include oxidation (Met and Trp), methylation, deamidation (common at C-terminal Asn to Asp), hexosylation, dehydration, and hydroxylation (select Pro residues) (Plan et al., 2007; Serra et al., 2016).
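To illustrate how the modifications listed above multiply the number of species expected in an MS survey, the following minimal Python sketch enumerates candidate masses for a single peptide. The mass shifts are standard monoisotopic values; the base mass of 3000.0 Da and the function name `candidate_masses` are hypothetical placeholders chosen for illustration and are not data from Serra et al. (2016).

```python
from itertools import combinations

# Standard monoisotopic mass shifts (Da) for the modifications named in the text.
PTM_SHIFTS = {
    "oxidation": 15.9949,      # Met or Trp, +O
    "methylation": 14.0157,    # +CH2
    "deamidation": 0.9840,     # Asn -> Asp
    "hexosylation": 162.0528,  # +C6H10O5
    "dehydration": -18.0106,   # -H2O
    "hydroxylation": 15.9949,  # Pro, +O (isobaric with oxidation)
}


def candidate_masses(base_mass, max_mods=2):
    """Map each combination of up to max_mods distinct PTMs to a mass.

    Keys are alphabetically sorted tuples of PTM names; the empty tuple
    is the unmodified peptide.
    """
    out = {(): base_mass}
    names = sorted(PTM_SHIFTS)
    for k in range(1, max_mods + 1):
        for combo in combinations(names, k):
            shift = sum(PTM_SHIFTS[name] for name in combo)
            out[combo] = round(base_mass + shift, 4)
    return out


masses = candidate_masses(3000.0)  # hypothetical unmodified mass
print(len(masses), "candidate species from one sequence")
```

Even with at most two modifications per molecule, one sequence yields 22 candidate species in this toy model, which is consistent in spirit with a diversity estimate in the hundreds across more than 70 sequences. Oxidation and hydroxylation share the same mass shift, so mass alone would not distinguish them.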

#### Biosynthetic Auxiliary Enzymes

Cyclotide transcripts of C. ternatea encode for a signal peptide that immediately precedes the N-terminal residue of the cyclotide domain (Poth et al., 2011a; Gilding et al., 2015; Nguyen et al., 2016b). The current model for C. ternatea cyclotide biosynthesis mimics that of other cyclotide producing species and begins with the signal peptide inducing the docking of the ribosome-transcript complex with the rough endoplasmic reticulum (ER) (Conlan and Anderson, 2011; Göransson et al., 2015). Unique to C. ternatea cyclotide precursors is that the signal peptide cleavage alone releases the N-terminus of the cyclotide domain, thus no other N-terminal processing proteases are required. Following this, it is postulated that folding of the cyclotide domain begins, presumably aided by protein disulfide isomerases (PDIs), as the propeptide is held within the ER. From there the folded propeptide is transported via vesicles to the Golgi, and later to prevacuolar or vacuolar compartments. Somewhere during this transport pathway the propeptide encounters a specific type of AEP that catalyzes the simultaneous cyclization and cleavage of the cyclotide domain from the precursor (Göransson et al., 2015; Jackson et al., 2018). Post-translational modifications are possibly acquired along the maturation pathway but are poorly defined and thus need further investigation (Serra et al., 2016).

#### **Protein disulfide isomerases**

The disulfide knot of cyclotides must be formed from the oxidation of the six cysteine residues in a specific order. Incorrect connectivity may result in the precursor being unable to be cyclized and being flagged as a faulty molecule needing destruction. In all cyclotide-producing taxa, the specific in vivo physical or genetic interactions of PDI family members and cyclotide precursors are not known. In vitro evidence for PDI involvement is known from a PDI cloned from the Rubiaceae plant Oldenlandia affinis. However, under the conditions tested, the isolated PDI was not as efficient as an isopropanol buffer in effecting proper disulfide bond formation (Gruber et al., 2007). A systematic in vivo examination of C. ternatea PDIs discovered in the transcriptome is hindered by the lack of reverse genetic resources in C. ternatea.

#### **Asparaginyl endopeptidases**

Asparaginyl endopeptidases (AEPs), like most proteases, are known primarily for their function in peptide bond hydrolysis (Yamada et al., 2005); thus, a proposed role in peptide bond creation for a selection of AEPs is particularly intriguing. The first direct evidence for this came about through work by the Tam group (Nguyen et al., 2014), who set out to identify the peptide ligase responsible for the maturation of cyclotides in C. ternatea. Through activity-guided protein-fractionation studies, the researchers identified a single C. ternatea AEP isoform (termed butelase-1) that was highly efficient in intramolecular peptide cyclization. Since the discovery of butelase-1 in 2014, several other AEP peptide ligases have been identified from cyclotide-producing plant species, including OaAEP1<sup>b</sup> from O. affinis (Harris et al., 2015), PxAEP3b (Petunia x hybrida) (Jackson et al., 2018), and HeAEP3 (Hybanthus enneaspermus) (Jackson et al., 2018). Through bioinformatic and functional testing, the structural features that differentiate AEP ligases from proteases are beginning to emerge. Specifically, plant AEPs that function as transpeptidase-preferring enzymes in vivo have been shown to possess specific markers in their protein sequence, most notably one termed the Marker for Ligase Activity (MLA) (Jackson et al., 2018).

Subsequent work by Gilding et al. defined the expression levels of butelase-1 (referred to as CtAEP1) and the full length sequence of butelase-2/CtAEP2, CtAEP3, and CtAEP5 via RNAseq (Gilding et al., 2015). In contrast, a total of six butelases were described by Nguyen et al. (2014), with assembled sequences for butelase-4 and -6 not showing any homology to any of the CtAEPs described by Gilding et al. (2015). It might be the case that there is natural AEP isoform variation amongst C. ternatea accessions, or that differences in data assembly conditions, or choice of tissue RNA sampled between the studies of Nguyen et al. (2014) and Gilding et al. (2015) are responsible for this apparent discrepancy. Importantly, of all six AEPs, only butelase-1 has been shown to prefer transpeptidation over proteolysis.

#### NEXT GENERATION APPLICATIONS

In this section we describe recent applications of C. ternatea components in biotechnological, agricultural and pharmaceutical industries.

## Butelase

Butelase-1 has proven to be a very versatile molecular tool for a range of in vitro peptide engineering applications (Nguyen et al., 2016a,c; Bi et al., 2017). When compared to other characterized AEP ligases, butelase-1 displays superior reaction kinetics. Despite this, one obvious limitation for end-user uptake is that a recombinant production system is yet to be established (Nguyen et al., 2014). In lieu of this, a detailed protocol for the purification of butelase-1 from C. ternatea seed pods is available (Nguyen et al., 2016c), but its use is restricted to those with access to the source material and protein purification expertise. It remains unclear whether butelase-1 has evolved superior structural features over other AEP ligases or whether its greater catalytic efficiency is a by-product of purifying source-activated enzyme.

#### Butelase-1 Mediated Intramolecular Peptide/Protein Cyclization

Tools to enable backbone cyclization of peptides have garnered considerable interest from the pharmaceutical industry as a means to provide proteolytically stable peptide therapeutics (Craik et al., 2012). In this regard, butelase-1 has been demonstrated to be a highly versatile enzyme, cyclizing a range of diverse peptides, including cysteine-rich cyclotides, conotoxins (e.g., MrIA) and sunflower trypsin inhibitors (SFTI-1) (Nguyen et al., 2014). Additionally, a wide range of non-cysteine-containing peptides have been cyclized, including human apelin, galanin, neuromedin U and salusin (Nguyen et al., 2015). In all cases, the substrate requirements for cyclization include the introduction of, if not already present, an Asn residue at the peptide ligation point, which must be linked to the C-terminal tailing residues His-Val. These tailing residues, which are subsequently cleaved off and are not incorporated into the final cyclized product, have been shown to be essential for butelase-1 cyclization efficiency (Nguyen et al., 2014). At the substrate's N-terminus, requirements are flexible at the P1' position, with all residues accepted apart from Pro. However, at the P2' position more stringent requirements exist, with Cys, Ile, Leu, and Val all preferred (Nguyen et al., 2014). Together these requirements mean that most peptides require at least some modification of the terminal residues to allow butelase-1 mediated cyclization. When these substrate requirements are met, butelase-1 has remarkable catalytic efficiency, with substrate to enzyme ratios of 100:1 to 1000:1 commonly used and typical cyclization reactions completed within 5 to 30 min (Nguyen et al., 2015).
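The substrate rules summarized above can be expressed as a simple sequence check. The following Python sketch is a minimal illustration under stated assumptions: it follows the position naming used in this section (P1' and P2' as the first and second N-terminal residues), treats the P2' preference as a warning rather than a hard requirement, and uses hypothetical toy sequences and function names that do not come from the cited studies.

```python
PREFERRED_P2_PRIME = {"C", "I", "L", "V"}  # Cys, Ile, Leu, Val


def check_butelase_substrate(seq):
    """Check a linear precursor (one-letter code) against the stated rules.

    Returns a list of issues; an empty list means all rules are met.
    """
    if len(seq) < 5:
        return ["sequence too short to evaluate"]
    issues = []
    if not seq.endswith("NHV"):
        issues.append("C-terminus should end in Asn-His-Val (NHV)")
    if seq[0] == "P":
        issues.append("P1' (first residue) must not be Pro")
    if seq[1] not in PREFERRED_P2_PRIME:
        issues.append("P2' (second residue) is not Cys, Ile, Leu or Val")
    return issues


def cyclized_residues(seq):
    """Residues retained in the cyclic product: the His-Val tail is cleaved off."""
    if not seq.endswith("NHV"):
        raise ValueError("substrate lacks the C-terminal NHV motif")
    return seq[:-2]


# Hypothetical toy peptides, for illustration only.
good = "GLPVAANHV"
bad = "PAGSTKNH"
print(check_butelase_substrate(good))
print(check_butelase_substrate(bad))
print(cyclized_residues(good))
```

In practice a check of this kind would only flag which termini need engineering; it says nothing about folding or about whether the termini are close enough in space to be joined.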

The benefits of backbone cyclization are not limited to small peptides, with the thermal and proteolytic stability of a number of larger proteins also improved by backbone cyclization. Like smaller peptides, these proteins must first be engineered to include optimal flanking residues for butelase-1 activity, with specific consideration given to the proximity of the N- and C-termini. Where the termini are not held close enough together, consideration should be given to including linker sequences of appropriate size. Using butelase-1, three different recombinantly produced proteins have been successfully cyclized: green fluorescent protein (GFP), interleukin-1 receptor antagonist (IL-1Ra) and human growth hormone (somatropin) (Nguyen et al., 2015). In all cases butelase-1 (0.1 µM) and target protein (25 µM) were incubated together, with cyclization essentially complete within 15 min. In the case of IL-1Ra, backbone cyclization was shown to increase the thermostability of the protein without affecting biological activity (Nguyen et al., 2015).
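The enzyme loading quoted for the protein cyclizations can be related to the substrate-to-enzyme ratios given for peptide substrates with a short calculation. The sketch below is only a worked-arithmetic illustration; the function name is hypothetical, and concentrations are passed as strings so that exact fractions are used instead of binary floating point.

```python
from fractions import Fraction


def substrate_to_enzyme_ratio(substrate_uM, enzyme_uM):
    """Molar substrate:enzyme ratio from concentrations given as strings (µM)."""
    return Fraction(substrate_uM) / Fraction(enzyme_uM)


# Concentrations reported in the text: 25 µM target protein, 0.1 µM butelase-1.
ratio = substrate_to_enzyme_ratio("25", "0.1")
# Range quoted in the text for peptide substrates.
in_typical_range = 100 <= ratio <= 1000
print(f"substrate:enzyme = {ratio}:1, within 100-1000:1: {in_typical_range}")
```

The protein reactions therefore used a 250:1 ratio, which sits inside the range reported for small peptide substrates.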

#### Butelase-1 Mediated Intermolecular Peptide Bond Formation

Butelase-1 has additionally shown great potential for the selective labeling of proteins by intermolecular peptide bond formation. Here, butelase-1 recognizes the required NHV motif engineered into a protein of interest and initiates ligation of incoming intermolecular nucleophiles, provided that substrate requirements are met. In this way a protein of interest can be labeled with any number of functional cargoes. Site-specific labeling of proteins has applications for elucidating cellular pathways, defining protein–protein interactions and for the development of innovative medical imaging approaches and therapeutics (Falck and Müller, 2018; Harmand et al., 2018). One additional promising application is the site-specific labeling of surface proteins of live bacteria (Bi et al., 2017). To accomplish this, the authors engineered an NHV motif at the C-terminus of the anchoring protein OmpA of Escherichia coli. Upon incubation of live cells with butelase-1, a range of cargo molecules were successfully linked to the engineered bacterial surface protein OmpA. These included a fluorescein probe, useful for cellular tracking of pathogen response, and a tumor-associated monoglycosylated peptide, which provided a proof of concept for delivering post-translationally modified antigens as live bacterial vaccines.

#### Insecticidal Applications of C. ternatea Peptide Extracts

Conventional pesticides have for decades been of paramount importance in sustaining agricultural productivity under an ever-increasing population burden. However, many traditional pesticides are increasingly becoming disfavored due to off-target toxicities and human health concerns. These concerns, together with the increasing incidence of insects developing resistance mechanisms, necessitate the discovery or engineering of novel pesticides with new modes of action (Perry et al., 2011). Recently an organic ethanolic extract prepared from C. ternatea vegetative tissue has shown promising insecticidal activity against a wide range of crop pests<sup>3</sup>. The extract, termed Sero-X<sup>®</sup>, has thus far been registered in Australia for applications in cotton and macadamia, with further applications pending both in Australia and overseas. Although the exact mode of action of this ethanolic extract remains to be determined, it is likely to be due in part to the high concentrations of C. ternatea cyclotides present (Poth et al., 2011a,b; Gilding et al., 2015). The prototypic C. ternatea cyclotide Cter M is indeed enriched in the Sero-X<sup>®</sup> extract and, when tested in isolation, displays lethality against cotton budworm (H. armigera) (Poth et al., 2011a). As for other cyclotides, such as kalata B1 from O. affinis, the predicted mode of action is through insect cell membrane disruption (Poth et al., 2011a; Craik, 2012), but it remains unclear whether other non-proteinaceous components present in the Sero-X<sup>®</sup> extract play a synergistic role. Importantly, Sero-X<sup>®</sup> displays no toxicity to tested rodents or bee pollinators, and is considered non-hazardous according to the Globally Harmonized System of Classification and Labelling of Chemicals.

#### Food Colorants/Consumer Products

Butterfly pea flowers can range from white to intense blue to shades in between. This coloring largely stems from the anthocyanin content and degree of aromatic acylation (Kazuma et al., 2003a). The deep blue pigment of C. ternatea has been particularly popular in Asia, where flower petals are used to color teas, desserts and clothes. More recently, C. ternatea flower extracts have been used to create vibrant blue gins<sup>4</sup>, which change color depending on the pH, such as occurs on mixing with tonic water or lime. The deep blue color of C. ternatea flowers is a particularly sought-after alternative to synthetic blue food colorants, which have become increasingly disfavored due to health concerns (Nigg et al., 2011). Studies have reported that addition of C. ternatea extracts increased the polyphenolic and antioxidant contents of sponge cakes (Pasukamonset et al., 2018), enhanced the oxidative stability of cooked pork patties (Pasukamonset et al., 2017) and reduced the predicted glycemic index of flour (Chusak et al., 2018a). Microencapsulation using alginate prevented the degradation and enhanced the retention of the antioxidant activities of C. ternatea polyphenolic extracts after gastrointestinal digestion (Pasukamonset et al., 2016). Currently there exists no commercial-scale production of C. ternatea for anthocyanins, and harvesting of plant material at large scale is not likely to be economically feasible. However, recent advances in engineering plant cell suspension cultures with anthocyanin regulatory pathway genes offer an alternative approach (Appelhagen et al., 2018).

# CONCLUSION AND FUTURE OUTLOOK

Here we have attempted to provide a comprehensive and multidisciplinary account of the diverse properties and applications of C. ternatea and its constituent molecules. The plant is readily grown in a range of habitats and there is wide opportunity for it to be used for rotational cropping to aid in soil nitrogen regeneration, as a fodder crop for cattle, or as a source of novel phytochemicals. There are already a host of cosmetic and food colorants on the market, and the first C. ternatea based insecticide (Sero-X<sup>®</sup>) is also approved and being used for insect control on cotton and macadamia nut crops. The butelase-1 enzyme derived from C. ternatea pods is also creating a lot of interest as a biotechnological tool for peptide ligation and cyclization.

We anticipate that the success of products (including enzymes, extracts, and purified phytochemicals) deriving from C. ternatea will encourage more research on this plant and stimulate further discoveries that might lead to second and third generation products. For example, so far only a small fraction of the more than 70 cyclotides in this plant have been tested for pesticidal activity and there may be components in this cocktail of cyclotides that are significantly more potent as pesticides than what is currently known. Further work is needed to understand the biotic and abiotic factors that modulate the production of individual cyclotides in this plant and to understand possible synergies between different cyclotides and between cyclotides and non-cyclotide components.

We also anticipate that there will be more studies in the future on pharmaceutical applications of C. ternatea components. The ability to harvest large amounts of plant material means that one of the limitations encountered in much natural product research and commercialization (i.e., lack of source material) is not a factor for C. ternatea. While the multitude of medicinal applications reported so far from various C. ternatea preparations is impressive, we caution that many of these are one-off studies that have yet to be independently validated by groups other than the original reporting group. It is to be expected that the claims for the various bioactivities of plant extracts need to be tested with rigorous controls to establish the efficacy of the plant components. Furthermore, very few of the cyclotides in C. ternatea have been screened for medicinal applications and we feel this would be a useful exercise for future studies. Likewise, none of the C. ternatea cyclotides have yet been used as molecular grafting frameworks to introduce new desired pharmaceutical activities, as has been done for cyclotides from other plants such as kalata B1 or MCoTI-II. With these suggestions for future work on this fascinating plant, we feel that many more exciting discoveries are on the horizon.

<sup>3</sup>https://innovate-ag.com.au/

<sup>4</sup>https://www.inkgin.com/

#### AUTHOR CONTRIBUTIONS

DC and GO conceived and planned the framework for this article. All authors contributed to the writing and editing.

#### ACKNOWLEDGMENTS

We thank Innovate Ag Pty. Ltd for the financial support through an Australian Research Council (ARC) Linkage grant (LP130100550). DC is an Australian Research Council Australian Laureate Fellow (FL150100146).





**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Oguis, Gilding, Jackson and Craik. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Rice Seeds as Biofactories of Rationally Designed and Cell-Penetrating Antifungal PAF Peptides

*Mireia Bundó<sup>1†</sup>, Xiaoqing Shi<sup>1†</sup>, Mar Vernet<sup>1</sup>, Jose F. Marcos<sup>2</sup>, Belén López-García<sup>1</sup> and María Coca<sup>1</sup>\**

#### *Edited by:*

*Suvi Tuulikki Häkkinen, VTT Technical Research Centre of Finland Ltd, Finland*

#### *Reviewed by:*

*Eva Stoger, University of Natural Resources and Life Sciences Vienna, Austria István Pócsi, University of Debrecen, Hungary*

*\*Correspondence:* 

*María Coca maria.coca@cragenomica.es*

*† These authors have contributed equally to this work*

#### *Specialty section:*

*This article was submitted to Plant Metabolism and Chemodiversity, a section of the journal Frontiers in Plant Science*

*Received: 03 December 2018 Accepted: 16 May 2019 Published: 07 June 2019*

#### *Citation:*

*Bundó M, Shi X, Vernet M, Marcos JF, López-García B and Coca M (2019) Rice Seeds as Biofactories of Rationally Designed and Cell-Penetrating Antifungal PAF Peptides. Front. Plant Sci. 10:731. doi: 10.3389/fpls.2019.00731*

*<sup>1</sup>Centre for Research in Agricultural Genomics (CRAG, CSIC-IRTA-UAB-UB), Barcelona, Spain, <sup>2</sup>Institute of Agrochemistry and Food Technology (IATA, CSIC), Paterna, Spain*

PAFs are short cationic and tryptophan-rich synthetic peptides with cell-penetrating antifungal activity. They show potent and selective killing activity against major fungal pathogens and low toxicity to other eukaryotic and bacterial cells. These properties make them a promising alternative to fulfill the need for novel antifungals with potential applications in crop protection, food preservation, and medical therapies. However, the difficulties of cost-effective manufacturing of PAFs by chemical synthesis or biotechnological production in microorganisms have hampered their development for practical use. This work explores the feasibility of using rice seeds as an economical and safe production system for PAFs. The rationally designed PAF102 peptide with improved antifungal properties was selected for assessing PAF biotechnological production. Two different strategies are evaluated: (1) production as a single peptide targeted to protein bodies and (2) production as an oleosin fusion protein targeted to oil bodies. Both strategies are designed to offer stability to the PAF peptide in the host plant and to facilitate its downstream purification. Our results demonstrate that PAF does not accumulate to detectable levels in rice seeds when produced as a single peptide, whereas it is successfully produced as a fusion protein with Oleosin18, at up to 20 μg of peptide per gram of grain. We show that the expression of the chimeric *Ole18-PAF102* gene driven by the *Ole18* promoter results in the specific accumulation of the fusion protein in the embryo and aleurone layer of the rice seed. Ole18-PAF102 accumulation has no deleterious effects on seed yield, germination capacity, or seedling growth. We also show that the Oleosin18 protein serves as a carrier to target the fusion protein to oil bodies, facilitating PAF102 recovery. Importantly, the recovered PAF102 is active against the fungal phytopathogen *Fusarium proliferatum*. Altogether, our results prove that the oleosin fusion technology allows the production of bioactive PAF peptides to assist the exploitation of these antifungal compounds.

Keywords: PAF, antifungal, pathogens, fungi, rice, seed, oil bodies, protein bodies

# INTRODUCTION

Infections caused by fungi pose a serious threat to human and animal health and to food security and safety (Fisher et al., 2012). Invasive fungal diseases have significantly increased in recent decades and are important causes of mortality, particularly in immunocompromised patients, killing about one and a half million people every year. This value exceeds the death rate for malaria or breast cancer (Brown et al., 2012). Plant disease epidemics caused by fungi and fungal-like oomycetes are an old problem that has been further exacerbated by intensive agricultural practices, globalization, and climate change (Bebber and Gurr, 2015). Today, crop-destroying fungi account globally for yield losses of ~20%, with a further 10% in postharvest losses (Fisher et al., 2018). In addition, food safety is challenged by mycotoxigenic fungi that contaminate food and feed with toxins detrimental to human health. The limited number of licensed antifungals currently available, together with the unprecedented rise of multidrug-resistant pathogenic fungi, makes the development of novel antifungal compounds to combat fungal infections crucial (Perfect, 2016; Fisher et al., 2018). Antimicrobial peptides (AMPs) are being actively explored to fulfill the need for novel antifungals with potential applications in crop protection, food preservation, and medical therapies.

AMPs are peptides and small proteins produced by most living organisms that exhibit lytic or inhibitory activity against microorganisms (Zasloff, 2002; Zhang and Gallo, 2016). However, most AMPs are obtained at low yields from natural sources, and some of them show properties, such as low stability or low specificity, that might compromise their applications. Their peptidic nature enables the rational design of novel molecules with improved properties that can be produced at high yields through biotechnological systems. PAF102 was designed as a novel antifungal peptide with improved properties and optimized to be produced in biofactories (López-García et al., 2015). PAF102 is a modified concatemer of the hexapeptide PAF26 (RKKWFW), which was identified through a combinatorial approach as a Peptide with specific AntiFungal activity (PAF) (López-García et al., 2002, 2015; Muñoz et al., 2013). PAF102 shows potent antifungal activity against economically relevant phytopathogens and very low toxicity to other eukaryotic cells, including human erythrocytes. The antifungal mechanism of PAF102 is similar to that of the parental PAF26, and it involves interaction with the fungal cell envelope, followed by cell penetration and intracellular effects that cause cell death (Muñoz et al., 2013; López-García et al., 2015). This mode of action differs from that of licensed antifungal drugs; thus, PAF peptides might be an alternative to combat fungal pathogens. However, the difficulties of cost-effective manufacturing of PAFs by chemical synthesis, or by conventional microbial-based production systems associated with host toxicity, have hampered their development for practical use.
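The description of PAFs as cationic and tryptophan-rich can be checked directly on the PAF26 sequence given above. The following Python sketch is a deliberately crude composition count: it ignores His, the peptide termini, and pH-dependent ionization, so it is not a substitute for a proper charge or pI prediction, and the function name is a hypothetical one chosen for illustration.

```python
PAF26 = "RKKWFW"  # hexapeptide sequence given in the text


def composition_summary(seq):
    """Count basic (Arg, Lys) and acidic (Asp, Glu) residues and the Trp fraction."""
    basic = sum(seq.count(aa) for aa in "RK")
    acidic = sum(seq.count(aa) for aa in "DE")
    return {
        "length": len(seq),
        "basic_minus_acidic": basic - acidic,
        "trp_fraction": seq.count("W") / len(seq),
    }


summary = composition_summary(PAF26)
print(summary)
```

For PAF26 this gives three basic and no acidic side chains, with Trp making up one third of the residues.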

Plants provide a platform for the production of AMPs that offers advantages in terms of cost-effectiveness and scalability, as they are economical and easy to grow, as well as of safety, because of the low risk of contamination with human and animal pathogens (Twyman et al., 2003). In particular, rice seeds have been reported as efficient bioreactors for AMPs, including natural and rationally designed peptides (Bundó et al., 2014; Montesinos et al., 2016, 2017). AMP production is favored by limiting accumulation to seeds, which avoids the negative impacts on plant performance reported in some cases when AMPs accumulate in vegetative tissues (Coca et al., 2006; Nadal et al., 2012; Company et al., 2014). Several seed-specific promoters are now available to drive strong expression of *AMP* genes in either the rice endosperm or the embryo (Qu and Takaiwa, 2004). Endosperm-specific promoters include those of the seed storage proteins glutelins and globulins, such as the *GluB1*, *GluB4*, and *Glb1* promoters; embryo-specific promoters include those of the oleosin proteins, such as the *Ole18* promoter. Another factor that favors the production of AMPs in seeds is their confinement in storage organelles, such as protein bodies (PBs) or oil bodies (OBs) (Bundó et al., 2014; Montesinos et al., 2016, 2017). Storage organelles offer a stable environment for packing a large amount of AMPs, together with protection of the host cell from AMP exposure. Proteins can be targeted to PBs through signal peptides linked at their N-terminus and/or a KDEL sequence at their C-terminus, as well as through intrinsic physicochemical properties of certain storage proteins (Khan et al., 2012; Takaiwa et al., 2017). OB targeting is achieved using oleosin proteins as carriers (van Rooijen and Moloney, 1995; Montesinos et al., 2016). Oleosins are the most abundant structural proteins of plant seed OBs, whose lipophilic character and secondary structure determine their association with OBs (Abell et al., 1997). Both PBs and OBs have served to stabilize AMPs in rice seeds and to reach high yields (Bundó et al., 2014; Montesinos et al., 2016).

This work explores the feasibility of using rice seeds as a platform for the production of cell-penetrating antifungal PAF peptides, exemplified by PAF102. Two different strategies are evaluated: (1) production as a single peptide targeted to PBs or (2) production as an oleosin fusion protein targeted to OBs. Here, we report that PAF103, a His-tagged and KDEL-extended PAF102, did not accumulate to detectable levels in rice seeds, whereas PAF102 was successfully produced as a fusion to the Ole18 protein in rice seeds. We demonstrate that the Ole18-PAF102 fusion protein accumulated in OBs without affecting seed yield or germination capacity. We also show that biologically active PAF102 can be recovered from rice OBs. Our results demonstrate that the oleosin fusion technology is a good strategy for the production of PAF antifungal peptides.

# MATERIALS AND METHODS

#### Preparation of Plant Expression Vectors

Four different constructs were prepared for the expression of the synthetic *PAF* genes in rice seeds (**Figure 1A**). Three of them were designed for the production of a PAF as an individual peptide, and the last one as a fusion to the rice Oleosin 18 kDa protein (Ole18). The individual peptide was His-tagged at the N-terminus and KDEL-extended at the C-terminus, resulting in a new PAF that was named PAF103 (**Figure 1B**). The *PAF103* gene was synthesized by GenScript based on the codon usage bias in *Oryza sativa* and flanked at both ends by *BamH*I restriction sites (**Supplementary Figure S1**). To drive the expression of *PAF103*, three different endosperm-specific promoters were used, namely *Glutelin B1* (*GluB1*), *Glutelin B4* (*GluB4*), and the 26 kDa *Globulin* (*Glb1*) (Qu and Takaiwa, 2004). Additionally, the sequence encoding the signal peptide of the corresponding seed storage protein was fused to the N-terminus of the *PAF103* gene for internalization into the endoplasmic reticulum (ER) system (**Figure 1B**). The vectors containing the *pGluB1:PAF103:tNos* and *pGluB4:PAF103:tNos* constructs were prepared by replacing the *BamH*I-*BamH*I *CecA-KDEL* DNA fragment with the *BamH*I-*BamH*I *PAF103* DNA fragment in the previously described pC::*pGluB1:CecAKDEL:tNos* and pC::*pGluB4:CecAKDEL:tNos* vectors (Bundó et al., 2014). The pC::*pGlb1:PAF103:tNos* vector was prepared by replacing the *Nar*I-*Nar*I *BP178* fragment with a *Nar*I-*Nar*I *PAF103* fragment in the previously described pC::*pGlb1:BP178:tNos* vector (Montesinos et al., 2017). The *Nar*I-*Nar*I *PAF103* fragment was obtained by PCR amplification from the GenScript clone using the oligonucleotides NarIPAF103\_fwd and PAF103NarI\_rev listed in **Supplementary Table S1**.

An additional vector for the expression of the chimeric gene encoding an Ole18-PAF102 fusion protein was prepared (**Figure 1A**). In this case, gene expression was driven by the *Ole18* gene's own embryo-specific promoter (Montesinos et al., 2016). The fusion protein corresponds to the Ole18 protein linked to the PAF102 peptide through a Tobacco Etch Virus (TEV) NIa protease recognition site (**Figure 1B**), without His-tag or KDEL extension but with an extra glycine residue at the N-terminus that remains after TEV protease digestion. The construct was prepared by PCR amplification of the *Ole18* promoter and the Ole18 protein coding sequence (*pOle18:Ole18*) from the pC::*pOle18:Ole18-CecA:tNos* vector (Montesinos et al., 2016) using the primers in **Supplementary Table S1**, which introduce an *EcoR*I and a *Sal*I site at the 5′ and 3′ ends of the fragment, respectively. The PAF102 coding sequence, extended in frame at the N-terminus with the TEV protease recognition site (PRS) and flanked by *Sal*I and *Pst*I restriction sites, was synthesized by GenScript (**Supplementary Figure S1**). The *Nopaline Synthase* (Nos)-terminator sequence was introduced into the cloning vector containing the *PRS-PAF102* (pUC57::*PRS-PAF102*) as a *Pst*I-*Hind*III fragment, which was amplified by PCR using primers that add these restriction sites (**Supplementary Table S1**). The *pOle18:Ole18* fragment was also introduced into this plasmid as an *EcoR*I-*Sal*I restriction fragment. Finally, the whole construct was mobilized to the binary vector pCAMBIA1300 as an *EcoR*I-*Hind*III fragment to generate the pC::*pOle18:Ole18-PAF102:tNos* vector for rice transformation (**Figure 1A**). All the constructs were then verified by nucleotide sequencing.

#### Production of Transgenic Plants

Transgenic rice plants (*Oryza sativa* cv. Ariete) were produced by *Agrobacterium*-mediated transformation of embryogenic calli as previously described (Sallaud et al., 2003). Transgene insertion was confirmed in the regenerated plants by PCR analysis using leaf genomic DNA as template. The positive plants were selected to obtain homozygous lines in the T2 generation. The homozygous lines were identified by segregation of hygromycin resistance afforded by the *hpt*II marker gene in the T-DNA region of pCAMBIA1300-derived vectors. The transgene copy number was estimated by quantitative PCR (qPCR) using the *Sucrose Phosphate Synthase* (*SPS*) reference gene as previously described (Yang et al., 2005; Bundó et al., 2014). Rice plants transformed with the empty vector (pCAMBIA1300) were also produced as a control for this study. All rice plants were grown at 28 ± 2°C with a 14/10 h light/dark photoperiod.
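As an illustration of how segregation of hygromycin resistance can be scored, the following minimal Python sketch tests whether seedling counts fit the 3:1 resistant:sensitive ratio expected for a single hemizygous insertion, and flags progeny with no sensitive seedlings as candidate homozygous lines. The chi-square test, the function names, and the counts in the usage note are assumptions made for illustration; the statistical procedure actually applied is not specified here.

```python
# Hypothetical helper names; a sketch, not the procedure used in this study.

CRITICAL_VALUE_DF1_P05 = 3.841  # chi-square critical value, 1 d.f., alpha = 0.05


def chi_square_3_to_1(resistant: int, sensitive: int) -> float:
    """Chi-square statistic for a 3:1 resistant:sensitive segregation."""
    total = resistant + sensitive
    if total == 0:
        raise ValueError("no seedlings scored")
    expected_resistant = 0.75 * total
    expected_sensitive = 0.25 * total
    return ((resistant - expected_resistant) ** 2 / expected_resistant
            + (sensitive - expected_sensitive) ** 2 / expected_sensitive)


def fits_single_locus(resistant: int, sensitive: int) -> bool:
    """True when the counts do not deviate significantly from 3:1."""
    return chi_square_3_to_1(resistant, sensitive) < CRITICAL_VALUE_DF1_P05


def looks_homozygous(resistant: int, sensitive: int) -> bool:
    """True when every scored seedling of the progeny is resistant."""
    return resistant > 0 and sensitive == 0
```

For hypothetical counts of 75 resistant and 25 sensitive seedlings, `fits_single_locus(75, 25)` returns `True`, whereas a progeny of 40 resistant and 0 sensitive seedlings satisfies `looks_homozygous(40, 0)`.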

#### Protein Extraction and Immunoblot Analysis

Protein extracts were prepared from dehulled mature seeds (10 seeds, ~200 mg) imbibed in water for 1 h. Seeds were ground and homogenized in a sucrose-containing buffer (10 mM phosphate buffer pH 7.6, 0.6 M sucrose). After filtration through Miracloth, homogenates were centrifuged at low speed (200*g*) to remove cellular debris and starch. Clarified homogenates were then centrifuged at high speed (2,000*g*) to obtain the PB-enriched fractions, as the precipitated dense fractions (Bundó et al., 2014), or the OB-enriched fractions, as the floating fractions (Montesinos et al., 2016). PB-enriched fractions were resuspended directly in SDS-loading buffer, separated on tricine-SDS-PAGE (16.5%), transferred to a nitrocellulose membrane (Amersham Protran 0.2 μm), and immunodetected using commercial monoclonal anti-His tag antibodies (A00186, GenScript).

Immunoblot analysis of OB-associated proteins was done after solubilization in SDS-loading buffer, separation by SDS-PAGE, transfer to nitrocellulose membranes (Amersham Protran 0.4 μm), and immunodetection with antibodies against PAF102 (1:1,000 dilution, this work) and the rice Ole18 [1:2,000 dilution, (Montesinos et al., 2016)]. Mouse polyclonal antibodies against synthetic PAF102 (GenScript) were produced at the Laboratory Animal Facilities (registration number B9900083) of the Center for Research and Development (CID) from the Spanish National Research Council (CSIC), in strict accordance with the bioethical principles established by the Spanish legislation following international guidelines. The protocol was approved by the Committee on Bioethics of Animal Experimentation from CID and by the Department of Agriculture, Livestock, Fisheries, Food and Environment of the Government of Catalonia (permit number DAAM:7461). All efforts were made to minimize suffering of the animals. Mice received four injections of synthetic PAF102 (0.5 mg each) at three-week intervals and were bled 1 week after the last injection to obtain the PAF102 antiserum.

The amount of PAF102 accumulated per seed was estimated on immunoblots by comparing the signal intensities of Ole18-PAF102 with those of known amounts of synthetic PAF102. Signal intensities were quantified using the Quantity Tools Image Lab™ Software (Version 5.2.1) included in the ChemiDoc™ Touch Imaging System (Bio-Rad, USA).
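The comparison against known amounts of synthetic peptide can be sketched as a simple standard curve. The Python example below fits a straight line to the band intensities of the standards, interpolates the peptide amount in a sample lane, and scales it to μg per gram of seed. The linear signal response, the assumption that the antibody signal of the fusion protein is equivalent to that of the free peptide, the function names, and all numbers in the usage note are illustrative assumptions and not values from this study.

```python
# A minimal sketch of standard-curve quantification; names are hypothetical.

def fit_line(amounts_ng, signals):
    """Least-squares line signal = slope * amount + intercept."""
    n = len(amounts_ng)
    if n < 2 or n != len(signals):
        raise ValueError("need at least two paired standards")
    mean_x = sum(amounts_ng) / n
    mean_y = sum(signals) / n
    sxx = sum((x - mean_x) ** 2 for x in amounts_ng)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(amounts_ng, signals))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amount_from_signal(signal, slope, intercept):
    """Peptide amount (ng) in a lane, read off the standard curve."""
    return (signal - intercept) / slope


def ug_per_g_seed(peptide_ng_in_lane, fraction_of_extract_loaded, seed_mass_g):
    """Scale the amount in one lane to the whole extract, per gram of seed."""
    total_ng = peptide_ng_in_lane / fraction_of_extract_loaded
    return total_ng / 1000.0 / seed_mass_g
```

With hypothetical standards of 50, 100, and 200 ng giving signals of 500, 1,000, and 2,000 arbitrary units, a sample band of 1,500 units interpolates to 150 ng; if that lane held 5% of an extract from 0.2 g of seed, the estimate is 15 μg/g of seed.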

#### RT-PCR Analysis of Transgene Expression

Transgene expression was determined by RT-PCR analysis of total RNA isolated from a pool of 10 immature seeds (before seed desiccation, around 20–25 days after flowering). Total RNA was extracted using the method previously described (Chang et al., 1993). DNase-treated RNA (1 μg) was retrotranscribed using the High Capacity cDNA Reverse Transcription kit (Applied Biosystems) using the oligo(dT) primer. PCRs were carried out using specific primers (**Supplementary Table S1**) that annealed to GluB1, GluB4, and Glb1 SP encoding sequences (forward primers) and to the *PAF103* gene sequence (reverse primers). Amplified transcripts encoded by the three different transgenes were compared to the *OsEF1a* housekeeping transcripts (Os03g08060).

#### *In situ* Immunodetection of PAFs in Whole Seeds

PAF103 and PAF102 accumulation in the transgenic rice seeds was analyzed by *in situ* immunodetection using the anti-His tag (1:1,000 dilution) and anti-PAF102 (1:500 dilution) antibodies, respectively, and the fluorescently labeled AlexaFluor488 anti-mouse secondary antibody (Molecular Probes, 1:5,000 dilution) as previously described (Bundó et al., 2014).

#### Fungal Infection Assays on Rice Seeds

Transgenic rice seeds were evaluated for resistance to the seed fungal pathogen *Fusarium proliferatum* as previously described (Bundó et al., 2014). Briefly, 12 surface-sterilized seeds per line and treatment were placed on MS medium without sucrose and then inoculated with 50 μl of sterile water (control) or of *F. proliferatum* spore suspension (10³ spores/ml). Seeds were allowed to germinate for 7 days to determine the percentage of germination under control or infection conditions. Three independent assays were performed.

#### OBs Isolation and PAF102 Purification

OBs were isolated from the OB-enriched fractions by two consecutive cycles of flotation-centrifugation on a sucrose-containing buffer. The integrity of the isolated OBs was tested by selective staining with Nile red (1 ng/ml, Sigma) and confocal fluorescence visualization. PAF102 was recovered from the OBs containing the Ole18-PAF102 protein by digestion with TEV protease (Invitrogen, 1:100). Proteolytic digestion was conducted overnight at 30°C in the TEV protease buffer supplemented with 0.25 M sucrose. TEV protease efficiency was estimated from the disappearance of the Ole18-PAF102 signal in immunoblot analysis, by quantification of signal intensities.
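The efficiency estimate based on signal disappearance can be written as a one-line calculation. The sketch below, in Python, expresses cleavage efficiency as the percentage of the fusion-protein band lost after digestion and summarizes several digestions as mean and standard deviation. The formula, the function names, and the intensities in the usage note are assumptions for illustration; the exact computation used is not detailed here.

```python
from statistics import mean, stdev

# Hypothetical helpers; a sketch of one plausible way to compute efficiency.


def cleavage_efficiency(signal_before: float, signal_after: float) -> float:
    """Percentage of the Ole18-PAF102 band that disappears after digestion."""
    if signal_before <= 0:
        raise ValueError("undigested signal must be positive")
    return 100.0 * (1.0 - signal_after / signal_before)


def summarize_efficiencies(pairs):
    """Mean and sample SD over (before, after) intensity pairs."""
    values = [cleavage_efficiency(before, after) for before, after in pairs]
    return mean(values), stdev(values)
```

For a hypothetical band of 1,000 arbitrary units before digestion and 125 units after, `cleavage_efficiency(1000, 125)` gives 87.5%.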

#### Antifungal Assays

Growth inhibition assays of *F. proliferatum* were performed in 96-well flat-bottom microtiter plates, as previously described (López-García et al., 2015). Briefly, 70 μl of fungal conidia (1.4 × 10³ conidia/ml) in half-strength potato dextrose broth (PDB) containing 0.02% chloramphenicol were mixed in each well with 30 μl of sample in OB resuspension buffer (0.2 M sucrose; 10 mM Tris-HCl pH 7.5; 0.02% Tween-20). Samples were prepared in triplicate. Plates were incubated with agitation for 72 h at 28°C. Fungal growth was monitored every 24 h by measuring the optical density (OD) at 600 nm using a Spectramax M3 reader (Molecular Devices), and mean values and standard deviation (SD) were calculated. Experiments were repeated twice.
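The summary statistics for triplicate wells can be sketched as follows. The Python example computes the mean and sample SD of OD600 readings and, as an optional derived metric, the percent growth inhibition relative to an untreated control. The use of the sample (n − 1) standard deviation, the percent-inhibition formula, the function names, and the readings in the usage note are assumptions for illustration only.

```python
from statistics import mean, stdev

# Hypothetical helpers for summarizing OD600 readings from replicate wells.


def summarize_wells(od_replicates):
    """Mean and sample SD of OD600 readings from replicate wells."""
    if len(od_replicates) < 2:
        raise ValueError("need at least two replicates for an SD")
    return mean(od_replicates), stdev(od_replicates)


def percent_inhibition(treated_mean, control_mean, blank=0.0):
    """Growth inhibition relative to the untreated control, in percent."""
    control_growth = control_mean - blank
    if control_growth <= 0:
        raise ValueError("control shows no growth above blank")
    return 100.0 * (1.0 - (treated_mean - blank) / control_growth)
```

With hypothetical triplicate readings of 0.2, 0.4, and 0.6, `summarize_wells` returns a mean of about 0.4 and an SD of about 0.2; a treated mean of 0.1 against a control mean of 0.5 corresponds to roughly 80% inhibition.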

# RESULTS

#### Generation and Characterization of Transgenic Rice Plants

Three different constructs were prepared for the expression of a *PAF* synthetic gene in rice seed endosperm (**Figure 1A**). Two of them contain the glutelin promoters (*pGluB1* and *pGluB4*) to drive expression in the peripheral region of the endosperm, and the other one contains the 26 kDa globulin promoter (*pGlb1*) to drive expression in the inner starchy endosperm tissue (Qu and Takaiwa, 2004). All three constructs were designed to produce an individual PAF peptide targeted to PBs by including the signal peptides (SP) of the corresponding storage protein and the KDEL signal. The N-terminal signal peptides target proteins to the secretory pathway, are co-translationally cleaved, and are indispensable for PB sorting (Takagi et al., 2005b). However, the C-terminal endoplasmic reticulum (ER) retention sequence (KDEL) remains in the mature proteins, and although it is not strictly required for PB deposition, it is reported to favor accumulation levels (Takagi et al., 2005b). In addition to the KDEL sequence, the PAF102 was His-tagged at the N-terminus to facilitate its purification from rice endosperm, resulting in a new PAF peptide that was named PAF103 (**Figure 1B**). Growth inhibitory activity against the *F. proliferatum* fungal pathogen revealed equivalent antifungal activity for both peptides, namely PAF102 and PAF103, with a minimal inhibitory concentration (MIC) value of 3.5 μM.

One additional construct was prepared to produce PAF102 as a fusion to the Ole18 protein (**Figure 1A**). In this case, the expression of the chimeric fusion gene *Ole18-PAF102* is directed by the embryo-specific promoter of the *Ole18* gene (Qu and Takaiwa, 2004). This strategy is intended to target PAF102 to OBs.

Transformation of embryogenic rice calli was performed *via Agrobacterium tumefaciens*. Using hygromycin resistance to select transformed calli, a similar number of plants was regenerated for each transformation event (around 10 independent lines obtained from independent calli). The presence of the transgenes was confirmed in most of the regenerated plants by PCR analysis using leaf genomic DNA as template. The positive plants were grown under containment greenhouse conditions to obtain homozygous transgenic lines in the T2 generation. Four to five independent homozygous lines per construct were identified based on the segregation of the hygromycin resistance marker. No apparent adverse effects on growth, flowering, or grain yield were observed in the transgenic PAF plants across generations; all of them performed similarly to the plants transformed with the empty vector grown simultaneously under the same conditions. Stability and inheritance of transgenes in T3 plants were then confirmed by PCR analysis of genomic DNA, as done in the T0 plants (**Figure 2**). The copy number of transgene insertions in the different lines was estimated by qPCR analysis in comparison to the *SPS* single copy gene in the rice genome. The results show that the transgenes were present in a single copy in all the selected homozygous lines, in agreement with the segregation in previous generations of the antibiotic resistance marker encoded in the T-DNA (**Supplementary Table S2**).

#### PAF103 Does Not Accumulate in PBs of Rice Seeds

The accumulation of the single PAF103 peptide was evaluated in T2 homozygous seeds carrying the constructs *pGluB1:PAF103*, *pGluB4:PAF103*, or *pGlb1:PAF103*. Seed protein extracts enriched in dense organelles were previously used with success to detect cationic antimicrobial peptides deposited in PBs, such as cecropin A or BP178 (Bundó et al., 2014; Montesinos et al., 2017). Accordingly, we prepared PB-enriched extracts from mature seeds of all the homozygous lines carrying the constructs for *PAF103* expression. Western blot analysis using anti-His tag antibodies for PAF103 detection revealed no differential band in protein extracts from transgenic lines in comparison to wild type, whereas PAF103 was clearly immunodetected in a wild-type extract supplemented with the synthetic PAF103 peptide (**Supplementary Figure S2**). This result suggests that PAF103 does not accumulate in the transgenic seeds. To rule out an extraction problem, we conducted an *in situ* immunodetection assay in whole seeds, which equally failed to detect PAF103 in the transgenic seeds.

A different approach to assess PAF103 accumulation is through the detection of its antifungal activity. Thus, transgenic *PAF103* seeds were evaluated for resistance to *F. proliferatum*. However, *PAF103* seeds showed susceptibility to the fungal pathogen similar to, or even greater than, that of wild-type or empty vector seeds (**Supplementary Figure S3**). All the seeds from the different lines showed reduced germination capacity and reduced seedling growth after inoculation with fungal spores, whereas they germinated and grew normally under control conditions. Therefore, the antifungal activity of PAF103 was not detected in the transgenic rice seeds, providing additional support for the conclusion that the PAF103 peptide does not accumulate in these seeds.

Given that transgenes were integrated in the genome of all the independent transgenic *PAF103* lines but the transgene products were not detected, we evaluated the transgene expression in immature seeds at the developmental stage where *GluB1*, *GluB4*, and *Glb1* promoters are active (Qu and Takaiwa, 2004). We amplified the corresponding mRNAs in the tested independent lines (**Figure 3**) by RT-PCR analysis on total RNA isolated from seeds and using specific primers (**Supplementary Table S1**). These results indicate that although the *pGluB1:PAF103*, *pGluB4:PAF103*, and *pGlb1:PAF103* transgenes are expressed in the rice seeds, the corresponding product PAF103 does not accumulate to detectable levels.

#### PAF102 Accumulates as Fusion to the Ole18 Protein in Rice Seeds

The accumulation of PAF102 when fused to the Ole18 protein was evaluated in the T2 homozygous seeds of transgenic lines carrying the *pOle18:Ole18-PAF102* construct. We first analyzed PAF102 accumulation by *in situ* immunodetection in whole-mount seeds. As shown in **Figure 4A**, PAF102 was immunodetected in the seed embryo and aleurone layer of *pOle18:Ole18-PAF102* seeds but not in wild-type seeds. This distribution pattern corresponds to the expression pattern of the *Ole18* promoter (Qu and Takaiwa, 2004) and correlates with OB accumulation in rice seeds (Montesinos et al., 2016).

In order to confirm that PAF102 was produced as a fusion protein and retains the natural targeting of Ole18, we isolated OBs from seeds of five independent *pOle18:Ole18-PAF102* homozygous lines. After solubilization, OB-associated proteins were separated by SDS-PAGE and immunodetected using anti-PAF102 antibodies. A polypeptide with an apparent molecular mass of 23 kDa (the expected mass of the fusion protein is 23.14 kDa, corresponding to 18 kDa of oleosin + 1.2 kDa of TEV protease recognition site + 3.94 kDa of PAF102) was clearly detected in the OBs of four out of five *pOle18:Ole18-PAF102* lines and was absent in the empty vector and wild-type OBs (**Figure 4B**). This protein also reacted with the anti-Ole18 antibodies, appearing as an additional band that was less intense than the one corresponding to the Ole18 protein. The accumulation of the fusion protein does not seem to alter the protein profile of OBs as visualized by Coomassie blue staining. These results demonstrate that PAF102 accumulates in rice OBs when fused to the Ole18 protein.

The amount of PAF102 produced was estimated in T3 seeds by comparison to known amounts of synthetic PAF102. The highest value was found in line #6, which produced 20 ± 3 μg/g of seed, and the mean value for the four independent lines was 15 ± 6 μg/g of seed.

In addition to being stably produced across plant generations, the fusion protein remains stable in seeds during long-term storage at room temperature. The fusion protein was still detected in seeds after 3 years of storage in the laboratory, with a mean value of 16 ± 6 μg/g of seed.

FIGURE 4 | PAF102 accumulates stably in rice seeds as an oleosin fusion protein. (A) *In situ* immunolocalization of PAF102 in *pOle18:Ole18-PAF102* transgenic seeds (line #6) in comparison to wild-type (WT). Immunoreaction was detected using a fluorescent-labeled secondary antibody. (B) Immunoblot analysis of OB fractions purified from seeds of WT, empty vector (EV), and five independent transgenic lines carrying the *pOle18:Ole18-PAF102* transgene, using anti-PAF102 or anti-oleosin18 antibodies as indicated. Protein profile of OB fractions is shown by Coomassie blue staining of SDS gel. Proteins were purified from recently harvested seeds.

#### Ole18-PAF102 Accumulation Has No Negative Impact on Rice Plant Performance

PAF102 is a cell-penetrating peptide that kills fungal cells intracellularly (López-García et al., 2015). In order to evaluate the potential toxicity of PAF102 fused to Ole18 and accumulated inside the embryonic rice cells, we phenotypically characterized the transgenic rice plants expressing *Ole18-PAF102* under the control of the *Ole18* promoter. These plants showed a normal phenotype during the vegetative phase, similar to the wild-type and empty vector plants (**Figure 5A**). Notably, they did not show a penalty in grain yield (**Figure 5B**), indicating that the accumulation of Ole18-PAF102 does not affect seed production. Although differences in seed weight were observed among lines, the seeds accumulating the highest levels of Ole18-PAF102 showed an average weight similar to that of the control wild-type and empty vector seeds (**Figure 5C**). These data indicate that accumulation of Ole18-PAF102 has no impact on seed filling. Additionally, seeds accumulating the recombinant fusion protein germinated at the same rate and timing as the control seeds, and their seedlings showed similar appearance (**Figures 5D,E**). Thus, the presence of Ole18-PAF102 in the OBs does not seem to affect the viability of rice seeds or seedling growth. Altogether, these results suggest that the expression of *pOle18:Ole18-PAF102* does not alter the fitness of the rice plants.

#### Ole18-PAF102 Accumulation Does Not Protect Rice Seeds Against Fungal Infection

To investigate whether the fusion protein Ole18-PAF102 retains the antifungal activity of the single PAF102 peptide, we evaluated the *pOle18:Ole18-PAF102* seeds for resistance to *F. proliferatum*. These transgenic seeds showed susceptibility to the fungal pathogen similar to that of the wild-type or empty vector seeds (**Figure 6**). All the seeds from the different lines showed reduced germination capacity and reduced seedling growth after inoculation with fungal spores. These data suggest that Ole18-PAF102 does not protect plants *in situ* against the fungal infection.

#### Biologically Active PAF102 Is Recovered From pOle18:Ole18-PAF102 Seeds

We next assessed the recovery of the single PAF102 peptide from rice seed OBs carrying Ole18-PAF102. To this end, we digested the recombinant OBs with the TEV protease, since the Ole18 and PAF102 polypeptides were linked through the protease recognition site. The immunoblot analysis of OB fractions before and after proteolytic digestion is shown in **Figure 7A**. We observed that the fusion protein nearly disappeared after protease digestion of the OBs from two *pOle18:Ole18-PAF102* transgenic lines (#3, #6). Subtle differences were detected among lines and experiments, and TEV protease efficiency was calculated at 87.5 ± 6.5% on average. These data indicate a high efficiency of proteolytic processing of the fusion protein Ole18-PAF102 on intact OBs. Next, we investigated the presence of the PAF102 single peptide in the protease-digested fractions. In the fractions of lines #3 and #6, we immunodetected a polypeptide with a higher electrophoretic mobility than the synthetic PAF102 peptide, which was absent in the WT fractions (**Figure 7B**). Equally, the polypeptide immunodetected in the EV fractions supplemented with synthetic PAF102 peptide showed a different mobility than the synthetic PAF102 peptide alone. These results indicate that, in the presence of plant extracts enriched with OBs, PAF102 exhibits different electrophoretic mobility, consistent with the altered electrophoretic mobility that has been previously reported for other small cationic peptides (Coca et al., 2006; Bundó et al., 2014; Montesinos et al., 2016, 2017). We also observed a couple of immunoreactive bands for the pure synthetic PAF102, indicating a tendency to form multimers (**Figure 7B**). Thus, our data suggest that PAF102 is released from the fusion protein and associates with other compounds in OB fractions or multimerizes.

FIGURE 5 | (A) Wild-type (WT) and transgenic rice plants carrying the empty vector (EV) or the *pOle18:Ole18-PAF102* gene at 30 days after sowing. (B) Average grain yield per plant calculated from four plants per line in three independent assays (*n* = 12). (C) Average weight of 100 seeds per line (*n* = 12). (D) Phenotypical appearance of seedlings at 7 days post imbibition. (E) Percentage of germinated seeds at 2 and 4 days after imbibition. Values correspond to the mean value of three independent assays. Error bars represent standard deviation.

To characterize the released PAF102 peptide, we tested the antifungal activity of the different OB fractions in *in vitro* fungal growth inhibitory assays (**Figures 7C**–**E**). First, we checked whether the synthetic PAF102 was active against *F. proliferatum* in the OB isolation buffer (**Figure 7C**). We observed total fungal growth inhibition at the concentration of 4 μM PAF102.

FIGURE 6 | Ole18-PAF102 accumulation does not protect rice seeds against the fungal pathogen *F. proliferatum.* (A) Phenotypical appearance of wild-type (WT), empty vector (EV), and *pOle18:Ole18-PAF102* transgenic seedlings (lines #1, #3, #6, and #5) at 7 days after inoculation with *F. proliferatum* spore suspensions (10³ spores/ml). Pictures are representative of three independent experiments. (B) Percentage of seed germination upon infection in comparison to control conditions (see Figure 5D). The graph shows mean and standard deviation values of the indicated lines from three independent assays.

This inhibitory concentration agrees with reported values (López-García et al., 2015). Then, we tested intact OBs carrying the Ole18-PAF102 fusion protein from two independent lines in comparison to OBs from empty vector lines, and we did not detect fungal growth inhibitory activity (**Figure 7D**). This is additional evidence that the fusion protein Ole18-PAF102 has no antifungal activity, as suggested by the fungal infection assays of the seeds accumulating the fusion protein (**Figure 6**). Finally, OB fractions digested with TEV protease and containing the released PAF102 showed clear growth inhibitory activity against *F. proliferatum*, whereas the empty vector fractions did not possess antifungal activity (**Figure 7E**). OB fractions from line *pOle18:Ole18-PAF102* #6 showed higher inhibitory capacity than those from line #3, in agreement with the protein accumulation levels. The inhibitory activity shown by fractions from line #6 was similar to that of wild-type fractions supplemented with the synthetic PAF102. According to the activity, the amount of PAF102 in OB fractions was around 3–4 μM (30 μl from a total of 100 μl obtained from 10 seeds), which represents 13.3–17.6 μg/g of seed. These values agree with the estimation for the fusion protein Ole18-PAF102, which was 15 ± 6 μg/g of seed. Therefore, the results indicate that most of the PAF102 was released as an active antifungal peptide after the proteolytic digestion of the Ole18-PAF102 fusion protein.

# DISCUSSION

Our study demonstrates that rice seeds can be used as biofactories of rationally designed antifungal peptides, exemplified by PAF102. The production of this small bioactive peptide was only feasible when fused to the Ole18 protein, and its accumulation was not detected when expressed as a single peptide. Since the identification of PAF26 (López-García et al., 2002), this and other PAF-derived peptides have been recalcitrant to biotechnological production (either in bacteria, yeast, or higher eukaryotes). Therefore, one major point of novelty of the current study is achieving the biotechnological production of PAF peptides through fusion to oleosin. We have defined a model for the mode of action of fungal-specific and PAF-derived peptides in three steps: interaction with fungal cells, internalization, and intracellular killing (Muñoz et al., 2013; López-García et al., 2015). This implies that the specificity of these peptides relies on the interaction with the fungal cell envelope and the subsequent internalization. Once the peptide is inside a cell, it might be active against, and kill, cells other than fungal cells, such as bacterial cells. We speculate that the peptide produced in any cell factory could be toxic when it accumulates intracellularly, unless fused to a carrier such as the oleosin, as demonstrated in this study. The fusion of PAF102 to the Ole18 protein targeted the peptide to the OBs, where it remains stable during long-term storage and accumulates to amounts of up to 20 μg/g of seeds, which is high for such a small peptide (3.9 kDa).

The production of PAFs as single peptides was approached using three different promoters, namely *pGluB1*, *pGluB4*, and *pGlb1*, which drive strong endosperm-specific expression. Our approach was based on targeting the peptide to PBs with the three corresponding seed storage protein signal peptides and the KDEL extension to produce the derivative PAF103. These SPs are known to guide protein sorting into PBs, including antimicrobial peptides (Müntz, 1998; Yang et al., 2003; Ibl and Stoger, 2012; Bundó et al., 2014; Montesinos et al., 2017; Takaiwa et al., 2017). The KDEL extension was introduced because it normally increases protein accumulation levels in PBs (Takagi et al., 2005b; Takaiwa et al., 2017). The two glutelin promoters, namely *pGluB1* and *pGluB4*, have been reported to direct expression to the outer endosperm and have been successfully used to express the *Cecropin A* gene, encoding a small, linear, and cationic antimicrobial peptide (Bundó et al., 2014). The globulin promoter directs expression to the inner endosperm and worked better than *pGluB1* and *pGluB4* for the expression of the *BP178* gene, which also encodes a synthetic small, linear, and cationic antimicrobial peptide (Montesinos et al., 2017). Although we showed that all three promoters directed the expression of the gene in the rice seeds, we could not detect the product using different methods. This failure to detect the peptide supports the conclusion that it did not accumulate in the rice seed. Given that all the rice plants expressing the *PAF* transgenes showed normal growth and development, and no altered seed filling or yield, cytotoxic effects do not seem to be responsible for the lack of PAF accumulation. The most plausible reason is peptide instability in plant tissues. Further experiments are needed to understand why these peptides do not accumulate in rice seeds. Whatever the reason, our results clearly show the difficulty of producing PAFs as single peptides in rice seeds.

A better strategy for the production of this type of peptide is the fusion to the Ole18 protein. The fusion protein guided by the oleosin is embedded in OBs, where peptides are immobilized and inactivated. Our assays clearly show that Ole18-PAF102 does not exhibit *in vivo* or *in vitro* activity against fungi, but as soon as PAF102 is released from the Ole18 it becomes active. The immobilization in OBs confers protection and offers stability to PAF102, allowing its accumulation. In our experiments, the amount of PAF102 reached up to 20 μg/g of seed, which, taking into account the low molecular weight of the peptide (3.9 kDa), corresponds to 5.1 nmoles/g of seed. This yield is a little lower than the one obtained with cecropin A using the same production strategy (8 nmoles/g) (Montesinos et al., 2016), but still within the range of reported yields for small peptides in rice seeds (0.03–10 nmoles/g of seed) (Yasuda et al., 2005; Takagi et al., 2005a,b, 2008, 2010; Suzuki et al., 2011; Wakasa et al., 2011; Wang et al., 2013), or even proteins, such as lysozyme yielding up to 80 μg/g (5.6 nmoles/g of seed) when produced from a single expression cassette or 150 μg/g (10.5 nmoles/g of seed) from two independent expression cassettes (Hennegan et al., 2005). Most of these recombinant proteins were produced in the rice endosperm, which accounts for most of the grain volume (90%), whereas our strategy of protein accumulation in OBs is restricted to the rice embryo and the aleurone layer, which represent only 10% of the rice grain volume. It would be interesting to explore the production of PAF102 in OB-rich seeds such as safflower, sesame, rapeseed, soybean, or sunflower. These oily seeds have been used to produce recombinant proteins using the oleosin fusion technology (Parmenter et al., 1995; Boothe et al., 2010; Nykiforuk et al., 2011). Reported yields range from 0.13% of total protein for insulin in *Arabidopsis thaliana* seeds (Nykiforuk et al., 2006), through 0.27% for hirudin in *Brassica napus* seeds (Parmenter et al., 1995), to 0.55% for human growth factor protein in safflower (Boothe et al., 2010). Taking into account that rice grain is not particularly rich in OBs, and that our PAF yield in rice is 0.025% of seed proteins, we predict that commercially relevant values might be reached in oily seeds. Therefore, our proof-of-concept study indicates that the oleosin fusion technology might be the best strategy to produce PAF peptides, with the prospect of improving yields in oily crops.
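The conversion between mass yield and molar yield used in this comparison follows from the fact that a mass of 1 kDa equals 1 μg/nmol. The short Python sketch below reproduces the quoted values under stated assumptions: the PAF102 mass of 3.94 kDa is taken from the Results, whereas the lysozyme mass of 14.3 kDa is an assumed value, chosen here because it reproduces the quoted molar yields and not because it is given in the text. The function name is hypothetical.

```python
# A minimal sketch of the mass-to-molar yield conversion.

PAF102_KDA = 3.94    # from the expected mass of PAF102 given in the Results
LYSOZYME_KDA = 14.3  # assumption: not stated in the text


def nmol_per_g(yield_ug_per_g: float, mass_kda: float) -> float:
    """Convert a yield in ug/g of seed to nmol/g of seed.

    1 kDa = 1000 g/mol = 1 ug/nmol, so dividing ug/g by kDa gives nmol/g.
    """
    if mass_kda <= 0:
        raise ValueError("molecular mass must be positive")
    return yield_ug_per_g / mass_kda


paf102_molar = nmol_per_g(20, PAF102_KDA)              # about 5.1 nmol/g
lysozyme_single = nmol_per_g(80, LYSOZYME_KDA)         # about 5.6 nmol/g
lysozyme_double = nmol_per_g(150, LYSOZYME_KDA)        # about 10.5 nmol/g
```

Expressing yields on a molar basis is what allows a 3.9 kDa peptide to be compared fairly with larger recombinant proteins, since equal mass yields correspond to very different numbers of molecules.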

In addition to offering stability, OB accumulation facilitates the purification of PAFs from plant material by simple flotation in dense sucrose solutions. However, the PAF102 immobilized in the OBs was not active and had to be released from the Ole18 to regain activity. Equally, the antimicrobial peptide CecA did not exhibit activity while immobilized on the OBs (Montesinos et al., 2016), whereas other proteins have been reported to be active while associated with OBs, such as the ß-glucuronidase (van Rooijen and Moloney, 1995), a xylanase (Liu et al., 1997; Hung et al., 2008), the D-hydantoinase (Chiang et al., 2006), or the D-psicose-3-epimerase (Tseng et al., 2014), among others. The lack of activity shown by the antimicrobial peptide PAF102 while immobilized in OBs might be related to its mode of action. Being attached to OBs might prevent the internalization of PAF102 into fungal cells, a process that is required for its antifungal action (Muñoz et al., 2013; López-García et al., 2015). Consistent with this, antifungal activity was recovered upon release from OBs by TEV protease digestion, and the released peptide exhibited activity against *F. proliferatum* equivalent to that of the synthetic peptide. Therefore, our results show that biologically active PAFs can be produced *in planta*.

Although rice seeds are starchy rather than oily, and might not be the best host for the production of oleosin fusion proteins, they offer unique opportunities as bioreactors since the rice gene transfer technology is well developed, cropping conditions are easy and well-established worldwide, and high grain yield can be obtained (Stoger et al., 2005; Takaiwa et al., 2017). Moreover, the accumulation in seeds provides long-term stability during storage at room temperature, up to 3 years in the case of PAF102. Seeds can be stockpiled without the need to synchronize production with product demand. Additionally, the OBs are restricted to the embryo cells and aleurone layer in the rice grain. Thus, the Ole18-PAF fusion protein produced under the control of the *Ole18* promoter was only found in these specific tissues. Along with the embryo, the seed coats, including aleurone and pericarp, are separated during rice milling to obtain the white refined grain and remain as the rice bran by-product. Consequently, downstream purification is facilitated by using the PAF102-enriched rice bran as the starting plant material. The use of rice bran for the production of PAFs could add value to this by-product and support its commercial exploitation.

#### REFERENCES

Abell, B. M., Holbrook, L. A., Abenes, M., Murphy, D. J., Hills, M. J., and Moloney, M. M. (1997). Role of the proline knot motif in oleosin endoplasmic reticulum topology and oil body targeting. *Plant Cell* 9, 1481–1493. doi: 10.1105/tpc.9.8.1481

#### ETHICS STATEMENT

Mouse polyclonal antibodies were produced at the Laboratory Animal Facilities (registration number B9900083) of the Center for Research and Development (CID) from Spanish National Research Council (CSIC), in strict accordance with the bioethical principles established by the Spanish legislation following international guidelines. The protocol was approved by the Committee on Bioethics of Animal Experimentation from CID and by the Department of Agriculture, Livestock, Fisheries, Food and Environment of the Government of Catalonia (permit number DAAM:7461). All efforts were made to minimize suffering of the animals.

#### AUTHOR CONTRIBUTIONS

JM, BL-G, and MC conceived and designed the study. BL-G and MB prepared the gene constructs to be introduced in rice. MB carried out all rice transformation experiments and the molecular characterization of transgenic plants. MB, MV, and MC characterized phenotypically the generated transgenic rice plants. XS, MV, and MC characterized PAF production in rice seeds. MC coordinated the study and prepared the manuscript. All the authors read and approved the final manuscript.

#### FUNDING

This work was supported by SEPSAPE grants (Plant KBBE programme) EUI2008-03769 and EUI2008-03619 and by the grant BIO2015-68790-C2-2-R; through the "Severo Ochoa Programme for Centres of Excellence in R&D" (SEV-2015-0533) from Spanish Ministerio de Ciencia, Innovación y Universidades (co-financed FEDER funds); and by the CERCA Programme/Generalitat de Catalunya.

#### ACKNOWLEDGMENTS

We thank Izar Achaerondio for help with parts of this work and Blanca San Segundo for scientific advice.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/article/10.3389/fpls.2019.00731/full#supplementary-material


Yang, D., Guo, F., Liu, B., Huang, N., and Watkins, S. C. (2003). Expression and localization of human lysozyme in the endosperm of transgenic rice. *Planta* 216, 597–603. doi: 10.1007/s00425-002-0919-x

Yasuda, H., Tada, Y., Hayashi, Y., Jomori, T., and Takaiwa, F. (2005). Expression of the small peptide GLP-1 in transgenic plants. *Transgenic Res.* 14, 677–684. doi: 10.1007/s11248-005-6631-4

Zasloff, M. (2002). Antimicrobial peptides of multicellular organisms. *Nature* 415, 389–395. doi: 10.1038/415389a

Zhang, L., and Gallo, R. L. (2016). Antimicrobial peptides. *Curr. Biol.* 26, R14–R19. doi: 10.1016/j.cub.2015.11.017

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Bundó, Shi, Vernet, Marcos, López-García and Coca. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Critical Analysis of the Commercial Potential of Plants for the Production of Recombinant Proteins

*Stefan Schillberg<sup>1,2</sup>\*, Nicole Raven<sup>1</sup>, Holger Spiegel<sup>1</sup>, Stefan Rasche<sup>1,3</sup> and Matthias Buntru<sup>1</sup>*

*<sup>1</sup> Fraunhofer Institute for Molecular Biology and Applied Ecology IME, Aachen, Germany, <sup>2</sup> Institute for Phytopathology, Justus-Liebig-University Giessen, Giessen, Germany, <sup>3</sup> Aachen-Maastricht Institute for Biobased Materials, Geleen, Netherlands*

#### *Edited by:*

*Jussi Joonas Joensuu, VTT Technical Research Centre of Finland Ltd, Finland*

#### *Reviewed by:*

*Arjen Schots, Wageningen University and Research, Netherlands Richard Strasser, University of Natural Resources and Life Sciences Vienna, Austria*

*\*Correspondence: Stefan Schillberg stefan.schillberg@ime.fraunhofer.de*

#### *Specialty section:*

*This article was submitted to Plant Biotechnology, a section of the journal Frontiers in Plant Science*

*Received: 18 February 2019 Accepted: 16 May 2019 Published: 11 June 2019*

#### *Citation:*

*Schillberg S, Raven N, Spiegel H, Rasche S and Buntru M (2019) Critical Analysis of the Commercial Potential of Plants for the Production of Recombinant Proteins. Front. Plant Sci. 10:720. doi: 10.3389/fpls.2019.00720*

Over the last three decades, the expression of recombinant proteins in plants and plant cells has been promoted as an alternative cost-effective production platform. However, the market is still dominated by prokaryotic and mammalian expression systems, the former offering high production capacity at a low cost, and the latter favored for the production of complex biopharmaceutical products. Although plant systems are now gaining widespread acceptance as a platform for the larger-scale production of recombinant proteins, there is still resistance to commercial uptake. This partly reflects the relatively low yields achieved in plants, as well as inconsistent product quality and difficulties with larger-scale downstream processing. Furthermore, there are only a few cases in which plants have demonstrated economic advantages compared to established and approved commercial processes, so industry is reluctant to switch to plant-based production. Nevertheless, some plant-derived proteins for research or cosmetic/pharmaceutical applications have reached the market, showing that plants can excel as a competitive production platform in some niche areas. Here, we discuss the strengths of plant expression systems for specific applications, but mainly address the bottlenecks that must be overcome before plants can compete with conventional systems, enabling the future commercial utilization of plants for the production of valuable proteins.

Keywords: cell-free biosynthesis, CHO cells, molecular farming, plant-made pharmaceuticals, *Pseudomonas fluorescens*

#### INTRODUCTION

The function of a protein is determined by the number and sequence of amino acids, which controls the three-dimensional structure of the resulting folded polypeptide. Proteins are therefore molecules of great complexity and near infinite diversity, making them suitable for many different applications. More than 300 protein-based medicines have been approved in the USA and Europe, and proteins account for almost a third of all pharmaceuticals in development (Walsh, 2018). Proteins are also widely used in industry, including enzymes used to manufacture textiles and chemicals and to process food and feed. Many other proteins are used as diagnostics or research reagents. The demand for recombinant proteins is therefore rising steadily, with a market valued at US\$1.654 billion in 2017 predicted to reach US\$2.850 billion by 2022 (Markets and Markets, 2017). Therapeutic proteins (e.g., antibodies, vaccines, enzymes, cytokines, and growth factors) account for almost half of this market, followed by industrial proteins (e.g., technical enzymes) and research reagents (e.g., antibodies for protein detection and purification) (Markets and Markets, 2017). Market growth has been supported by advances in recombinant protein production technologies, including the engineering of expression hosts, the optimization of upstream cultivation (e.g., bioreactor design, nutritional, and physical parameters), and the development of more efficient protein extraction and purification methods. Most recombinant proteins are currently produced in prokaryotic cells (mainly the bacterium *Escherichia coli*) and a small number of well-characterized mammalian cell lines, such as Chinese hamster ovary (CHO) cells. Other systems are used in commercial processes but are less common, including insect cells, yeast, algae, and cell-free expression platforms (Markets and Markets, 2017).
There are also several platforms based on plants and plant cells, but these have not been included in the latest market studies, indicating they have not yet commanded a significant share of commercial protein production capacity. Even so, plants as an alternative expression platform offer unique advantages, particularly when target proteins are difficult to produce in conventional systems, require specific qualitative properties such as particular glycan profiles, and/or must be produced on a larger scale in response to urgent demand. Improvements in expression levels and the economics of downstream processing will promote the utilization of plants for commercial protein production.

#### Conventional Expression Systems – Mammalian and Prokaryotic Cells

Industry favors recombinant protein expression systems that have a long and successful track record, specifically with three goals in mind: high quality, high yields, and low costs. In addition, such systems should meet the demands of an industrial process with respect to robustness and economic sustainability, and must comply with regulatory requirements. This is particularly relevant for pharmaceutical proteins produced according to good manufacturing practice (GMP), a set of guidelines ensuring that biopharmaceuticals are sufficient in terms of quality and batch-to-batch consistency in order to prevent harm to patients. The industry has therefore focused its resources on a small number of cell-based systems, in particular CHO cells and *E. coli*, which are now considered the gold standards for industrial protein manufacturing.

Many complex proteins, including most therapeutic antibodies, are routinely produced in CHO cells because they have the capacity to carry out authentic post-translational modifications, including glycosylation. Recombinant proteins produced in CHO cells are secreted into the culture medium to facilitate recovery and purification. Various strategies have been pursued to maximize the productivity of CHO cell cultures, including: (1) the engineering of production lines and expression vectors, (2) amplification of the expression cassette, (3) optimization of the cell culture medium, including the switch from early formulations containing serum to chemically defined and near protein-free formulations that simplify protein purification even further, (4) increasing the cell density during cultivation, and (5) the introduction of fermentation strategies that balance the nutrient supply while maintaining optimal cultivation conditions (Hausmann et al., 2018; Ritacco et al., 2018). These developments have led to remarkable increases in yields. For example, early processes for the manufacture of monoclonal antibodies achieved titers of hundreds of milligrams per liter, but this has increased to routine titers of 5–10 g/L and in some cases up to 20 g/L (Pujar et al., 2017), reducing the cost of goods to as little as €20/g (Kelley, 2007). Combined with GMP-compliant cell lines and processes and well-established approval procedures, it is hard to imagine that the CHO platform will be displaced by any other expression system for the manufacturing of complex proteins in the near future.

Although mammalian cells are favored for the production of complex proteins, prokaryotic cells are much easier to handle and are much less expensive in terms of media requirements. Accordingly, where the product is a simpler protein, *E. coli* is often the ideal choice of production host. Indeed, the first recombinant therapeutic protein (human insulin) has been commercially produced in *E. coli* since 1982 (Baeshen et al., 2014). Many other commercial recombinant protein products, including cytokines for cancer treatment and technical enzymes for industrial applications, have been produced in *E. coli*, but its status as the gold standard prokaryotic host is mainly for historical reasons and a range of other prokaryotes may be more suitable (Sanchez-Garcia et al., 2016; Singh et al., 2016). In our laboratory, we use *Pseudomonas fluorescens* for the larger-scale production of recombinant proteins, which accumulate in the cytosol by default or can be secreted to the culture medium (Retallack et al., 2012). For example, we chose cytosolic accumulation for the production of a 19-kDa phenylalanine-free protein that can be used for the dietetic management of patients suffering from phenylketonuria – an inborn error of metabolism that results in decreased metabolism of the amino acid phenylalanine (Hoffmann et al., 2018). The phenylalanine-free protein can easily be extracted from the cells by high-pressure homogenization and isolated *via* a single affinity purification step (**Figure 1**). Simple medium-scale cultivation in 2.5-L shake flasks with a culture volume of 0.5–1.0 L achieved yields of 2.5 g/L. Fed-batch fermentation in bioreactors with a working volume of 5–350 L increased productivity to 20 g/L, enabling the production of 3.5 kg of the target protein for animal tests within a few weeks and demonstrating the feasibility of a scaled-up industrial process that provides ton quantities as a supplement for phenylalanine-free food production.
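
The practical effect of the titer increase can be illustrated with a simple volume calculation. The sketch below is an illustration rather than a description of the actual campaign: the function name is hypothetical, purification losses are ignored, and the only inputs are the titers and the 3.5-kg target quoted above.

```python
def culture_volume_needed_l(target_mass_g: float, titer_g_per_l: float) -> float:
    """Total culture volume (L) needed to reach a target product mass.

    Simplifying assumption: all product present in the culture is
    recovered, i.e. extraction and purification losses are ignored.
    """
    if titer_g_per_l <= 0:
        raise ValueError("titer must be positive")
    return target_mass_g / titer_g_per_l


TARGET_G = 3500.0  # 3.5 kg of target protein (from the text)

shake_flask_l = culture_volume_needed_l(TARGET_G, 2.5)   # shake-flask titer
fed_batch_l = culture_volume_needed_l(TARGET_G, 20.0)    # fed-batch titer

print(f"Shake flasks: {shake_flask_l:.0f} L of culture")
print(f"Fed-batch:    {fed_batch_l:.0f} L of culture")
```

Under these assumptions, the eight-fold higher titer reduces the required culture volume by the same factor, which brings the total within the working-volume range of the bioreactors mentioned above.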

Despite the availability of high-performance protein expression hosts, there is a constant demand for improved or completely new systems to reduce manufacturing costs by increasing productivity, quality, and/or yields. During the late 1980s and early 1990s, plants and plant suspension cell cultures were proposed as alternative production systems (Hiatt et al., 1989; Fischer et al., 1999). In particular, the scalability of plant-based systems combined with the low cost of plant cultivation was predicted as a major driver to reduce manufacturing costs. However, this promise has yet to be fully realized, mainly reflecting the low yields of plants and the high costs of product recovery and purification.

FIGURE 1 | Production of a 19-kDa phenylalanine-free protein in *P. fluorescens*. After extraction and the removal of cell debris, the product was purified from the clarified extract by single-stage immobilized metal-ion affinity chromatography. Due to the high protein concentration, representative samples of the load and elution fraction were analyzed by SDS-PAGE at dilution ratios of 1:80, 1:160, and 1:320. The strong band at ~32 kDa represents the phenylalanine-free protein – the larger size of the protein probably reflects a combination of its high surface charge and generally high stability, which prevents full de-folding during sample preparation. This example shows the high efficiency of the purification step: only traces of other proteins are found in addition to the target protein in the final elution fraction, corresponding to a purity of >95%.

#### Plant-Based Production Systems and Plant-Derived Protein Products

Since the 1990s, many researchers have aspired to produce recombinant proteins in plants. Typically, they favored plants that were already used for other research purposes because the techniques required for gene transfer were readily available. This led to the development of an extremely diverse array of production systems, including whole plants, various tissue and cell systems (hairy roots and cell suspension cultures), and numerous expression approaches (stably transformed transgenic and transplastomic plants, transient expression systems, inducible expression, and different protein targeting strategies; Twyman et al., 2003, 2005; Schillberg et al., 2013; Spiegel et al., 2018). A suitable platform is therefore likely to be available for any conceivable product, but the absence of a standard platform scatters and slows down efforts to optimize productivity and makes it more difficult to define industrial production standards.

In terms of product candidates, research has focused mainly on biopharmaceuticals with a higher added-value compared to diagnostic and technical proteins. In this context, three main protein product classes have emerged: antibodies, vaccine candidates, and replacement human proteins such as blood products (human serum albumin), replacement proteins for common and rare diseases (gastric lipase for cystic fibrosis, insulin for diabetes, glucocerebrosidase for Gaucher's disease), or growth factors and cytokines (Spiegel et al., 2018). Recombinant antibodies, antibody fragments, and antibody fusion proteins have become the most common products expressed in plants (Nölke et al., 2003; Vasilev et al., 2016) because they are both economically important as pharmaceuticals (Walsh, 2018) and also relatively stable and easy to characterize. This means they accumulate to high levels (>100 mg/kg fresh plant weight or > 100 mg/L culture medium), are easy to purify even from complex plant matrices, and their functionality can be verified using simple binding assays. However, antibody titers in plants still lag far behind the yields currently achieved in CHO cells, making it uncertain that plants will ever become suitable as a routine commercial platform for antibody products. As discussed below, however, there are certain niche markets where plants offer capabilities that cannot be matched by CHO cells or any other platform.

Many studies involving the production of recombinant proteins in plants have not ventured into the hard realities of commercial development and have focused instead on early-stage objectives such as verifying expression, optimizing production and purification to a certain extent, and the completion of initial functionality assays. Few studies have included translational research demonstrating commercial competitiveness, partly due to the financial and organizational challenges that must be overcome before plant-derived biopharmaceuticals can be tested in clinical trials. It is almost impossible to secure financial and business support if the market potential and intellectual property rights are unclear, as is the case for most protein products made in plants. But without a solid business case industry will not switch from its established microbial and mammalian production systems to plants because the risk would not be justified. Nevertheless, a few plant-derived biopharmaceutical product candidates have entered clinical trials, helping to define GMP-compliant processes now approved by the regulatory authorities (Fischer et al., 2012, 2013; Sack et al., 2015). A handful have reached the market, the first of which was recombinant glucocerebrosidase (prGCD), generic name taliglucerase alfa, marketed as Elelyso, which is manufactured in carrot cells by Protalix Biotherapeutics (Rup et al., 2017; Zimran et al., 2018).

Given the long timelines and huge investment needed to provide proof of concept for the business potential of plant-derived biopharmaceuticals, it may be better to pick the low-hanging fruit. This means products that allow quicker access to the market due to the less-stringent regulatory requirements, as is the case for diagnostic, technical, and cosmetic products. Key examples include the diagnostic reagent avidin, which was first commercially produced in maize 20 years ago (Hood et al., 1997) and is still sold by Sigma-Aldrich (catalog no. A8706), and human epidermal growth factor produced in barley as a cosmetics additive, distributed by Sif Cosmetics (Iceland). However, two major challenges that must be addressed before plants can become more generally competitive with other expression systems are the low product yields and the cost of downstream processing.

## CHALLENGES FACING RECOMBINANT PROTEIN PRODUCTION IN PLANTS

In 1995, a secretory antibody was produced in tobacco plants with a titer of 500 μg/g fresh plant material (Ma et al., 1995). Although this has been exceeded by model proteins such as green fluorescent protein, which was transiently expressed in *Nicotiana benthamiana* leaves with a yield of 4 mg/g fresh weight (Marillonnet et al., 2005; Yamamoto et al., 2018), or the *Bacillus thuringiensis* (*Bt*) Cry2Aa2 protein, which accumulated as crystals in tobacco chloroplasts with a yield of ~5 mg/g fresh weight (De Cosa et al., 2001), other products including antibodies rarely accumulate to levels exceeding 100 μg/g fresh weight. This is despite extensive research to optimize protein expression and stability in plants by addressing internal factors (e.g., expression cassettes, protein targeting strategies, and the co-expression of protease inhibitors) and external factors (e.g., nutritional and physical cultivation parameters affecting plant growth and fitness) (Twyman et al., 2013). In addition, the challenge of protein purification from complex plant matrices reduces final yields while contributing to the high overall manufacturing costs.

#### The Yield Challenge

Recombinant protein yield is defined by the intrinsic productivity of the host, the biomass of the expression host in a given volume or area, and the potential for scale-up. Most approaches for yield improvement aim to increase the cell-specific productivity (qP) by genetic engineering or optimizing the culture conditions. In optimized CHO bioprocesses, the qP can reach 50–90 pg per cell per day, whereas human secretory plasma cells are capable of secreting IgM at a rate of 200–400 pg per cell per day (Hansen et al., 2017). It is rare to find qP values quoted for plant-based systems. For a tobacco cell suspension culture with a maximum yield of 100 mg/L for a full-size antibody, the equivalent qP value is 8.0 pg per cell per day (Havenith et al., 2014). Although this is an order of magnitude lower than the maximum qP values of elite CHO cell lines, it is nevertheless promising because the further optimization of protein productivity in plants appears to be possible by controlling genetic and epigenetic factors, as well as cultivation parameters (Twyman et al., 2013).
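
The size of the productivity gap can be made explicit from the figures quoted above. The following sketch is illustrative only: the helper name is hypothetical, and the inputs are the CHO range of 50–90 pg per cell per day and the tobacco value of 8.0 pg per cell per day given in the text.

```python
def fold_difference(reference: float, value: float) -> float:
    """How many times larger `reference` is than `value`."""
    if value <= 0:
        raise ValueError("value must be positive")
    return reference / value


CHO_QP_RANGE = (50.0, 90.0)  # pg per cell per day, optimized CHO bioprocesses
TOBACCO_QP = 8.0             # pg per cell per day, tobacco suspension culture

low, high = (fold_difference(qp, TOBACCO_QP) for qp in CHO_QP_RANGE)
print(f"CHO qP is {low:.2f}- to {high:.2f}-fold higher than the tobacco value")
```

The upper end of this range corresponds to the order-of-magnitude difference mentioned above, whereas the lower end suggests a somewhat smaller gap for less highly optimized CHO processes.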

A major limitation of plant cells is their size. Compared to plant cells, bacterial and mammalian cells are rather small and reach a packed cell volume (PCV) of less than 5% in conventional batch cultures. Therefore, a common strategy to increase productivity is to increase the cell density in industrial production processes, leading to PCVs of almost 50%. In contrast, plant cells have a significantly larger volume, mainly due to the presence of a dominant vacuolar compartment. The large size of plant cells means that productivity cannot be enhanced in suspension cultures by increasing the cell number, because at the end of the cultivation period, the PCV is already 60–80% of the culture volume. Plant cell size can be reduced by increasing the osmolality of the culture medium to shrink the vacuole, resulting in higher cell numbers at the same PCV, but even then plant cells remain much larger than both mammalian cells and bacteria (Vasilev et al., 2013).
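
The relationship between cell size and cell number at a fixed PCV can be sketched numerically. The example below is a rough illustration under stated assumptions: cells are treated as spheres, the PCV is treated as pure cell volume with no interstitial liquid, and both diameters are hypothetical values chosen for illustration rather than measurements from the studies cited above.

```python
import math

UM3_PER_ML = 1e12  # 1 mL = 1 cm^3 = (10,000 um)^3


def cells_per_ml(pcv_fraction: float, cell_diameter_um: float) -> float:
    """Approximate cell number per mL of culture at a given PCV.

    Assumes spherical cells and ignores the liquid trapped between
    packed cells, so the result is an upper-bound estimate.
    """
    if not 0 < pcv_fraction <= 1:
        raise ValueError("PCV must be a fraction between 0 and 1")
    cell_volume_um3 = math.pi / 6.0 * cell_diameter_um ** 3
    return pcv_fraction * UM3_PER_ML / cell_volume_um3


# Hypothetical diameters, for illustration only
SMALL_CELL_UM = 15.0  # assumed diameter of a small (e.g., mammalian) cell
LARGE_CELL_UM = 45.0  # assumed diameter of a large, vacuolated plant cell

small = cells_per_ml(0.5, SMALL_CELL_UM)
large = cells_per_ml(0.5, LARGE_CELL_UM)
print(f"Cell number ratio at the same PCV: {small / large:.0f}")
```

Because cell volume scales with the cube of the diameter, a three-fold difference in diameter translates into a 27-fold difference in cell number at the same PCV, which is why shrinking the vacuole helps but cannot close the gap.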

Protein yield can also be enhanced by increasing the production volume. This involves a simple process of scaling up when dealing with plant cell suspension cultures, but the costs also increase, thus not helping to improve commercial feasibility (Raven et al., 2015). The situation is different for whole plants, which have the capacity to produce much more biomass than conventional fermenters and at lower costs, even when cultivation is restricted to greenhouses. Even so, increasing yields by boosting overall biomass production rather than intrinsic productivity transfers demand to the protein extraction and purification steps, and this must be considered with regard to the overall production costs as described in "The Purification Challenge."

Ultimately, plants offer a high capacity for protein synthesis but the production unit, represented by a single cell, is too large to be competitive with the smaller cells of conventional protein production systems, which therefore achieve greater productivity on a smaller footprint. One solution is to remove all the unnecessary components from plant cells, e.g., the vacuole, to concentrate their production capacity and simultaneously eliminate factors that decrease protein yields, such as endogenous plant proteases (Schiermeyer et al., 2005; Mandal et al., 2016). In a step toward this goal, the plant protein synthesis machinery has been separated from unnecessary and undesirable components by preparing plant cell-free lysates. The most widely used lysates are prepared from wheat germ embryos and contain everything necessary for transcription and translation, but extensive washing during extract preparation removes translational inhibitors so that the resulting *in vitro* transcription-translation reactions achieve yields of 100 μg/ml in a single batch process (Biotechrabbit, 2019a). In a dialysis bag continuously fed with substrates and with small inhibitory byproducts continually removed, the yields can reach 1,000 μg/ml (Biotechrabbit, 2019b). However, the preparation of wheat germ extracts is time consuming and expensive, and the potential for scale-up is limited to the milliliter range (Takai et al., 2010).

Recently, we described a new cell-free lysate based on tobacco BY-2 cells. The BY-2 lysate (BYL) achieves yields of up to 270 μg/ml when producing the fluorescent protein eYFP and involves a coupled transcription-translation reaction in a simple 18-h batch process (Buntru et al., 2014, 2015; Havenith et al., 2017). The productivity of the BYL system has been increased to 3,000 μg eYFP per ml by optimizing lysate preparation and the reaction components, and by extending the transcription-translation process to 24–48 h by including active mitochondria that deliver energy for protein biosynthesis (unpublished data). Although model proteins like eYFP are known to accumulate to very high levels, the maximum yields of the optimized BYL are 15-fold higher than any other eukaryotic batch-based cell-free system expressing similar proteins (**Figure 2**), demonstrating the enormous capacity of tobacco cell suspension cultures for protein biosynthesis. The BYL system has been commercialized by the company LenioBio and is marketed under the brand name ALiCE<sup>1</sup>. Lysate volumes of up to 150 ml can be prepared within a few hours by isolating protoplasts, followed by the removal of the vacuole (which contains most of the nucleases and proteases that reduce protein yields) by density centrifugation, and the final mechanical disruption of the evacuolated protoplasts (Buntru et al., 2015). Interestingly, the BYL system contains microsomes, vesicles generated by the disruption of the endoplasmic reticulum during lysate preparation. Therefore, proteins can be targeted to the microsomes by including N-terminal signal peptides, enabling the formation of disulfide bonds and the efficient folding and assembly of complex and multimeric functional proteins such as enzymes, full-size antibodies, and even membrane proteins. In addition, targeting to the microsomes enables *N*-linked protein glycosylation. However, elucidation of the detailed glycan pattern is still pending.
Cell-free platforms are predominantly used at smaller scales (~50 μl) for screening and protein optimization, but the preparation and scale-up of the BYL is simple and inexpensive, and cell-free reactions have already been completed at the 6-ml scale. It therefore appears feasible that a further scale-up to 1–10 L will enable the production of gram quantities of recombinant proteins, especially those which are difficult to produce in cell-based systems due to their toxicity, instability or incompatibility with intracellular enzymes such as kinases (Huck et al., 2017).
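
The projected gram quantities follow from a simple scale calculation. The sketch below is illustrative only: the function name is hypothetical, the yields are the eYFP values quoted above, and it assumes that the volumetric yield stays constant on scale-up, which has not been demonstrated beyond the 6-ml scale and may not hold for proteins other than eYFP.

```python
def batch_output_g(yield_ug_per_ml: float, reaction_volume_l: float) -> float:
    """Protein mass (g) from one batch reaction.

    ug/ml is numerically equal to mg/L, so dividing by 1,000 gives g/L.
    Assumes the volumetric yield is independent of reaction volume.
    """
    if yield_ug_per_ml < 0 or reaction_volume_l < 0:
        raise ValueError("yield and volume must be non-negative")
    return yield_ug_per_ml * reaction_volume_l / 1000.0


OPTIMIZED_BYL = 3000.0  # ug eYFP per ml, optimized BYL (from the text)
ORIGINAL_BYL = 270.0    # ug eYFP per ml, original 18-h batch (from the text)

for volume_l in (0.006, 1.0, 10.0):  # 6-ml scale and the projected 1-10 L
    grams = batch_output_g(OPTIMIZED_BYL, volume_l)
    print(f"{volume_l:>6} L reaction: {grams:.3f} g eYFP")
```

Under these assumptions, a 1–10 L reaction at the optimized yield would deliver roughly 3–30 g of a well-expressed protein such as eYFP, compared with about 0.27 g per liter at the original yield.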

#### The Purification Challenge

Industrial processes for the extraction and purification of recombinant proteins produced by microbial and mammalian cells are well established, although downstream processing is still the main driver of overall costs (Singh and Herzer, 2018). Protein recovery is particularly straightforward when products are secreted to the culture medium by cells growing in suspension, which is generally the case for CHO cells. Therefore, the use of synthetic, protein-free medium for the cultivation of CHO cells avoids contaminating the product with endogenous host cell proteins almost completely (**Figure 3C**). In contrast, recombinant proteins produced in whole plants have to be extracted from the plant material, requiring the elimination of large quantities of insoluble debris and soluble plant host cell proteins during downstream processing (**Figure 3A**). Even when recombinant proteins are secreted by plant cell suspension cultures, the medium also contains several secreted host cell proteins that complicate product purification (**Figure 3B**).

FIGURE 3 | SDS-PAGE analysis of total soluble proteins in *N. benthamiana* leaf extracts (A), tobacco BY-2 culture medium (B), and CHO culture medium (C) without (left) and with (right) secreted monoclonal antibody (100 μg/ml). The positions of the antibody light chain (LC) and heavy chain (HC) are indicated. M: PageRuler Pre-stained Protein Marker.

<sup>1</sup> www.leniobio.com/about-alice

To determine the manufacturing costs for a plant-based process, we produced the human full-size antibody M12 (Kirchhoff et al., 2012) in tobacco plants and purified the recombinant protein from the plant matrix. We grew 1,440 transgenic, homozygous T4 *Nicotiana tabacum* cv. Petit Havana SR1 plants in the greenhouse, and the pre-purification antibody yield was 400 μg/g fresh leaf tissue (**Figure 4**). The presence of a KDEL signal peptide on the C-terminus of the heavy chain caused the recombinant antibody to accumulate in the endoplasmic reticulum. We harvested 200 kg of leaf material 8 weeks after sowing, and total soluble proteins were extracted using a custom-designed large-scale processing unit. To remove insoluble debris, the plant extract was passed over a sequential filtration cascade consisting of an initial bag filtration followed by three depth filtration steps with exclusion sizes of 8, 1, and 0.3 μm, respectively. Additionally, the clarified plant extract was passed through a 0.2-μm filter module before filling into disposable bags used for storage to avoid bacterial contamination. The M12 antibody was purified from the clarified plant extract using four process steps, i.e., Protein A chromatography, CaptoAdhere chromatography, ultrafiltration, and final diafiltration. The purity of the antibody in the final eluate was estimated to be >90%. The eluate from the final filtration step contained 0.5–5.0 units/ml of endotoxin and some residual Protein A (5–20 ng/mg IgG). The total yield of the purified product was 77 g, corresponding to a rather high recovery of 88% (**Figure 4**). The total process cost was €87,550 including labor, consumables, and infrastructure depreciation for plant cultivation and downstream processing, and all necessary analytics, which was equivalent to €1,137 per gram of purified antibody.

antibody. FLW, fresh leaf weight; DSP, downstream processing.

Schillberg et al. Competitiveness of Plant-Based Production Systems

Plant-based systems are often described as cost-effective due to the low cost of upstream cultivation. However, as discussed for the M12 antibody above, cultivation accounts for only 16% of total process costs, whereas downstream processing represents the lion's share of these costs due to the effort required to extract the protein from the intracellular environment and remove not only insoluble components of the plant matrix but also the many soluble host cell proteins that are released along with the product during homogenization. Recovering the same antibody from the culture medium of tobacco BY-2 cells involved much less effort, but downstream processing nevertheless accounted for 77% of total process costs (Raven et al., 2015). The productivity of the BY-2 cells was 20-fold lower than that of whole tobacco plants, and the cost of goods for the purified antibody was therefore 10-fold higher than for the same product extracted from whole plants.

The low space yield and high cost of downstream processing are major weaknesses limiting the commercial utilization of plant-based production systems. In contrast, CHO cells achieve high antibody titers and the downstream process is straightforward, reducing the cost of goods to generally around US\$200/g (Lim et al., 2010) and in exceptional cases to less than US\$25/g (Kelley, 2007). As discussed above, many groups are pursuing various strategies to boost protein yields in plants, albeit with only limited success (Twyman et al., 2013). Another promising approach to make plant-based production more attractive is to reduce the costs of downstream processing, e.g., by heating plant extracts to achieve the rapid and efficient precipitation of host cell proteins (Beiss et al., 2015), or by developing new affinity ligands to improve purification efficiency (Ruhl et al., 2018). Plant-based production could also become more competitive by incorporating value from biomass side streams, especially for the recovery of bioactive plant proteins and small molecules, or by using the residual biomass to generate biogas (Buyel, 2019).

#### OPPORTUNITIES FOR PLANT-BASED PRODUCTION

Although plants cannot compete with microbes or mammalian cells for most protein manufacturing processes, they become much more attractive in market niches relying on one or more of the following unique features:

▪ *Improved protein functionality*. Many biopharmaceuticals contain *N*-linked glycans, but plant glycans differ slightly from human glycans, especially the core *β*(1,2)-xylose and *α*(1,3)-fucose residues that are found in plants but not in endogenous mammalian glycoproteins. Therefore, several studies have focused on the humanization of glycan chains in plants by knocking out plant glycosyltransferases and introducing their human counterparts to avoid any adverse reactions when plant-derived biopharmaceuticals are injected into patients (Montero-Morales and Steinkellner, 2018). In contrast, vaccines and certain biopharmaceuticals for cancer immunotherapy may benefit from plant glycans because their immunogenicity stimulates the activity of antigen-presenting cells, particularly *via* lectins or mannose/fucose receptors on the surface of dendritic cells (Rosales-Mendoza et al., 2016). Furthermore, some therapeutic proteins carrying plant-derived glycans function in a superior manner to their native counterparts. One example is Elelyso, the recombinant form of human glucocerebrosidase mentioned above. This is produced in carrot cells and is targeted to the vacuole because the vacuole-specific glycans improve the uptake of the protein by human macrophages (Rup et al., 2017). Another example is the production of plant allergens requiring the proper presentation of plant glycans to enable the detection of IgE antibodies against plant cross-reactive carbohydrate determinants.


#### CONCLUSIONS AND OUTLOOK

Despite their low productivity, large production footprint and high downstream processing costs compared to traditional platforms, plants and plant cells possess some unique selling points which make them attractive for specific product lines such as proteins requiring plant-specific glycans, veterinary therapeutics, emergency vaccines, and animal-free proteins. However, the translation of many of these products from research to market is slow, limiting the visibility and commercial exploitation of plant-based platforms for recombinant protein production. Commercial translation is more straightforward and faster for non-therapeutic proteins because of the lower regulatory burden. These products are therefore more suitable in the first instance to demonstrate the economic sustainability of plant-based production systems. In contrast, therapeutic proteins promise higher profit margins but their commercialization requires comprehensive and expensive preclinical and clinical studies with backing from industrial partners providing expertise in drug development. Many studies concerning the production of therapeutic proteins in plants therefore never go beyond expression, purification, and cell-based analysis. Progress along the value chain requires additional work, including toxicity studies, the identification of biomarkers for patient stratification and therapeutic monitoring, clinical trials, and acceptance by the health insurance and regulatory agencies. Further commercial considerations include the intellectual property/freedom to operate portfolio, market share, potential competitors, and time to market. The launch of more plant-derived biopharmaceuticals on the market will require significant investment and closer cooperation with the pharmaceutical industry, regulatory authorities, and clinicians. Importantly, this only makes sense if the product offers unique advantages in terms of quality, efficacy, production scale/timing and/or cost when it is produced in plants rather than CHO cells or microbes.

#### DATA AVAILABILITY

All datasets generated for this study are included in the manuscript and/or the supplementary files.

#### AUTHOR CONTRIBUTIONS

SS wrote the manuscript. NR, HS, SR, and MB revised the manuscript.

#### FUNDING

Some of the unpublished results presented here were generated as part of the EU-funded project CoMoFarm (227420) and projects Phe-free 2 (FKZ 031A587B) and Cell-Free Biosynthesis (FKZ 0315942) funded by the Federal Ministry of Education and Research (BMBF).

#### ACKNOWLEDGMENTS

We acknowledge Natalia Jablonka, Nadja Vöpel (both Fraunhofer IME, Aachen) and Dr. Yvonne Mücke (metaX Institut für Diätetik, Friedberg, Germany) for help with production of the phenylalanine-free protein in *P. fluorescens*, Simon Vogel (Fraunhofer IME, Aachen) for help with the cell-free expression work, and the Fraunhofer IME team for the plant-based manufacturing of the M12 antibody. We also thank Dr. Richard M. Twyman (Twyman Research Management Ltd., Scarborough, UK) for revising the manuscript. The authors would like to thank the members of the Pharma-Factory (774078) and Newcotiana (760331) consortia, both funded by the EU, for stimulating discussions on the challenges and business potential of plants for the production of recombinant proteins.



**Conflict of Interest Statement:** SS is a member of the Scientific Advisory Board of LenioBio GmbH, which distributes the BYL platform developed by Fraunhofer IME and Dow AgroSciences.

The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Schillberg, Raven, Spiegel, Rasche and Buntru. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Production of Biopharmaceuticals in *Nicotiana benthamiana*—Axillary Stem Growth as a Key Determinant of Total Protein Yield

*Marie-Claire Goulet1, Linda Gaudreau1, Marielle Gagné1, Anne-Marie Maltais1, Ann-Catherine Laliberté1, Gilbert Éthier1, Nicole Bechtold2, Michèle Martel2, Marc-André D'Aoust2, André Gosselin1, Steeve Pepin1 and Dominique Michaud1\**

*1 Centre de recherche et d'innovation sur les végétaux, Faculté des Sciences de l'agriculture et de l'alimentation, Université Laval, Québec, QC, Canada, 2 Medicago Inc., Québec, QC, Canada*

#### *Edited by:*

*Kirsi-Marja Oksman-Caldentey, VTT Technical Research Centre of Finland Ltd, Finland*

#### *Reviewed by:*

*Teresa Capell, Universitat de Lleida, Spain; Emmanuel Aubrey Margolin, University of Cape Town, South Africa*

*\*Correspondence:* 

*Dominique Michaud dominique.michaud@fsaa.ulaval.ca*

#### *Specialty section:*

*This article was submitted to Plant Metabolism and Chemodiversity, a section of the journal Frontiers in Plant Science*

*Received: 20 February 2019; Accepted: 16 May 2019; Published: 11 June 2019*

#### *Citation:*

*Goulet M-C, Gaudreau L, Gagné M, Maltais A-M, Laliberté A-C, Éthier G, Bechtold N, Martel M, D'Aoust M-A, Gosselin A, Pepin S and Michaud D (2019) Production of Biopharmaceuticals in Nicotiana benthamiana—Axillary Stem Growth as a Key Determinant of Total Protein Yield. Front. Plant Sci. 10:735. doi: 10.3389/fpls.2019.00735*

Data are scarce about the influence of basic cultural conditions on growth patterns and overall performance of plants used as heterologous production hosts for protein pharmaceuticals. Higher plants are complex organisms with young, mature, and senescing organs that show distinct metabolic backgrounds and differ in their ability to sustain foreign protein expression and accumulation. Here, we used the transient protein expression host *Nicotiana benthamiana* as a model to map the accumulation profile of influenza virus hemagglutinin H1, a clinically promising vaccine antigen, at the whole plant scale. Greenhouse-grown plants subjected to different light regimes, subjected to apical bud pruning, or treated with the axillary growth-promoting cytokinin 6-benzylaminopurine were vacuum-infiltrated with agrobacteria harboring a DNA sequence for H1 and allowed to express the viral antigen for 7 days in a growth chamber under similar environmental conditions. Our data highlight the importance of young leaves for H1 yield per plant, unlike older leaves, which account for a significant part of the plant biomass but contribute little to total antigen titer. Our data also highlight the key contribution of axillary stem leaves, which contribute more than 50% of total yield under certain conditions despite representing only one-third of the total biomass. These findings underline the relevance of considering both main stem leaves and axillary stem leaves when modeling heterologous protein production in *N. benthamiana.* They also demonstrate the potential of exogenously applied growth-promoting hormones to modulate host plant architecture for improvement of protein yields.

Keywords: plant molecular farming, *Nicotiana benthamiana*, influenza virus hemagglutinin H1, light regime, apical pruning, hormone treatment, 6-benzylaminopurine

#### INTRODUCTION

Several plants are used as heterologous expression hosts to produce recombinant proteins of medical interest, notably including the wild relative of tobacco *Nicotiana benthamiana* (Bally et al., 2018). This small plant from Australia presents a number of traits, such as a fast growth rate and a natural ability to express heterologous gene sequences, that make it particularly well suited to the production of biopharmaceuticals (Lomonossoff and D'Aoust, 2016). Efficient procedures have been devised for the transient expression of recombinant proteins in *N. benthamiana* that often involve the vacuum infiltration of leaf tissue with agrobacteria harboring a DNA transgene for the protein of interest delivered by either a viral replicon or a binary vector system (Leuzinger et al., 2013; Norkunas et al., 2018). A variety of therapeutic and diagnostic proteins have been produced in agroinfiltrated *N. benthamiana* plants recently, including mammalian antibodies (Qiu et al., 2014; Jutras et al., 2016; Li et al., 2016; Lee et al., 2018; Marusic et al., 2018; Kommineni et al., 2019; Kopertekh et al., 2019), viral antigens (Jutras et al., 2015; Tusé et al., 2015; Regnard et al., 2017; Mbewana et al., 2018; Roychowdhury et al., 2018; Tottey et al., 2018; Vanmarsenille et al., 2018; Zhumabek et al., 2018; Laughlin et al., 2019), and other proteins of potential clinical value (Rattanapisit et al., 2017, 2019; Fu et al., 2018; Ramirez-Alanis et al., 2018; Silberstein et al., 2018).

In practice, recombinant protein yield in plant (e.g., *N. benthamiana*) transient expression settings depends on the net amount of protein per gram of leaf tissue; the leaf biomass per plant prior to agroinfiltration; and the number of plants per growing area before plant tissue downstream processing (Fujiuchi et al., 2016; Shang et al., 2018). Several studies have reported the development of molecular tools or strategies to improve recombinant protein yield and quality in *N. benthamiana* leaves, notably to maximize the structural resemblance between plant-made proteins and their original counterparts or to protect those proteins that show limited stability in plant cell environments (Faye et al., 2005; Gomord et al., 2010). Recent studies have for instance described the expression of an accessory oligosaccharyltransferase to maximize *N*-glycan occupancy on maturing glycoproteins (Castilho et al., 2018), the *in situ* modulation of glycan-processing enzymes to optimize protein glycosylation patterns (Kallolimath et al., 2016; Li et al., 2016; Jansing et al., 2018), or the expression of protease inhibitors to prevent unintended hydrolysis by resident proteases (Goulet et al., 2012; Robert et al., 2016; Grosse-Holz et al., 2018; Jutras et al., 2019). Other studies have described the expression of a viral proton channel to stabilize labile proteins in the cell secretory pathway (Jutras et al., 2015, 2018), the expression of a human convertase to promote the post-translational proteolytic processing of clinically useful proteins *in vivo* (Wilbers et al., 2016; Mamedov et al., 2019), or the exogenous induction of the jasmonic acid defense pathway to reduce endogenous protein content in leaf tissue prior to recombinant protein purification (Robert et al., 2015). In parallel, studies have documented the effects of cultural practices on *N. benthamiana* growth and leaf biomass production before agroinfiltration (Fujiuchi et al., 2014; Shang et al., 2018), the influence of environmental parameters in growth chambers following agroinfiltration (Matsuda et al., 2017, 2018), or the impact of plant density on overall protein yield in specific culture settings (Fujiuchi et al., 2017; Shang et al., 2018).

Our goal in this study was to document possible relationships between cultural practices, host plant growth pattern and recombinant protein yield in *N. benthamiana* leaves. Higher plants are complex organisms with young, mature, and senescing organs that show distinct metabolic backgrounds and differ in their ability to sustain protein biosynthesis and accumulation. In particular, low protein content in aging leaves due to reduced synthesis and increased degradation for nitrogen recycling toward growing organs has a strong impact on soluble protein distribution in the plant (Avila-Ospina et al., 2014; Havé et al., 2017). Accordingly, mammalian antibody yields in transgenic or agroinfiltrated tobacco plants were found to be low in old (bottom) leaves compared to younger leaves (Stevens et al., 2000; Buyel and Fischer, 2012). Likewise, antibody accumulation patterns in agroinfiltrated *N. benthamiana* leaves are age-dependent and closely match the distribution pattern of endogenous proteins in young and older leaves of the main stem (Robert et al., 2013; Jutras et al., 2016). A question at this stage is whether commonly adopted cultural practices in greenhouse settings may influence the overall yield of a recombinant protein *via* measurable effects on the host plant leaf pattern. A related question is whether any such effects of cultural practices can be harnessed to generate protein yield gains on a whole plant basis. We here addressed these questions using supplemental lighting, apical bud pruning, and axillary growth-promoting hormone treatments as model cultural practices likely to affect host plant growth. *N. benthamiana* plants agroinfiltrated to express the vaccine antigen influenza virus A hemagglutinin H1 were used as a model protein factory of economic relevance (D'Aoust et al., 2010).

#### MATERIALS AND METHODS

#### Plant Growth Conditions

Experiments took place at Laval University in Québec City, QC, Canada (46°46′ N, 71°16′ W). *N. benthamiana* plants were grown from seeds kindly provided by Medicago Inc. (Québec City, QC). The seeds were soaked in water for 24 h at 20°C and then placed in peat moss substrate for germination in a PGR15 growth chamber (Conviron, Winnipeg MB, Canada). Seedlings were selected after 2 weeks based on uniformity, transplanted into 350-ml plastic pots filled with peat moss substrate, and left to grow in the greenhouse for an additional 3 weeks at a culture density of 33 plants·m−2 under different cultural conditions before leaf agroinfiltration (see below). Day and night temperatures in the greenhouse were maintained at 29 and 27°C, respectively. Supplemental lighting was provided by 400-W high-pressure sodium lamps installed above the plant canopy (P.L. Light Systems, Beamsville ON, Canada). The plants were irrigated as needed with the Plant-Prod 12-2-14 Optimum complete nutrient solution supplemented with the Plant-Prod Chelated Micronutrient Mix (Plant Products, Laval QC, Canada). Electrical conductivity in the nutrient solution was maintained at 1.6 dS·m−1 for 1 week after transplantation, and then increased to 2.6 and 3.6 dS·m−1 for the second and third weeks, respectively.

#### Light Regime Treatments

Supplemental lighting trials for biomass production involved two photosynthetic photon flux densities (PPFDs) and two photoperiods, for a total of four treatments. Treatment 1 involved a 16-h day/8-h night photoperiod with a PPFD of 80 μmol·m−2·s−1; Treatment 2, a 16-h day/8-h night photoperiod with a PPFD of 160 μmol·m−2·s−1; Treatment 3, a 24-h day photoperiod with a PPFD of 80 μmol·m−2·s−1; and Treatment 4, a 24-h day photoperiod with a PPFD of 160 μmol·m−2·s−1. The plants were left to grow in the greenhouse for 3 weeks under one of these light regimes before use for leaf agroinfiltration. Light integrals for the 3-week growth period were estimated at ~200, 275, 240, and 365 mol·m−2 for Treatments 1, 2, 3, and 4, respectively.
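As a rough cross-check on these figures, the contribution of the lamps alone can be derived from PPFD, photoperiod, and the length of the growth period. The sketch below is an illustration, not the authors' calculation: it assumes that the quoted PPFDs refer to the supplemental lamps, that the lamps deliver the nominal PPFD for the whole photoperiod, and that the growth period is exactly 21 days. All names are hypothetical.

```python
SECONDS_PER_HOUR = 3600
MICROMOL_PER_MOL = 1_000_000

# (photoperiod in h/day, PPFD in umol m-2 s-1) for Treatments 1-4, from the text.
TREATMENTS = {1: (16, 80), 2: (16, 160), 3: (24, 80), 4: (24, 160)}


def supplemental_light_integral(ppfd_umol_m2_s, hours_per_day, days=21):
    """Photons delivered by the lamps over the growth period, in mol per m2.

    Assumes the lamps deliver the nominal PPFD for the whole photoperiod.
    """
    seconds_lit = hours_per_day * SECONDS_PER_HOUR * days
    return ppfd_umol_m2_s * seconds_lit / MICROMOL_PER_MOL


if __name__ == "__main__":
    for number, (hours, ppfd) in TREATMENTS.items():
        integral = supplemental_light_integral(ppfd, hours)
        print(f"Treatment {number}: {integral:.1f} mol per m2 from lamps")
```

Under these assumptions the lamps account for roughly 97, 194, 145, and 290 mol·m−2 for Treatments 1 to 4. Each value is lower than the corresponding estimate reported above, which would be consistent with the reported integrals also including natural daylight in the greenhouse, although the text does not state this.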

#### Pruning Treatments

Pruning treatments to promote axillary stem (or "secondary stem") growth involved removal of the main stem ("primary stem") apical bud 5, 7, or 12 days after seedling transplantation into plastic pots. The plants were then left to grow for an additional 16, 14, or 9 days, respectively, in the greenhouse under the same conditions, for a total growth period of 3 weeks before leaf agroinfiltration. Light conditions for this trial were as above for Treatment 4.

#### Hormone Treatments

Growth hormone treatments to promote axillary stem growth involved exogenous application of the synthetic cytokinin 6-benzylaminopurine (6-BAP) (Bio Basic, Markham ON, Canada). Seedling transplants left to grow in the greenhouse were sprayed 7 or 12 days after transplantation with the cytokinin diluted in water at working doses of 100, 500, or 1,000 ppm. The treated plants were kept in the greenhouse for an additional 13 or 8 days, respectively, for a total growth period of 20 days before leaf agroinfiltration. Light conditions for this trial were as above for Treatment 4.

#### Leaf Agroinfiltration and H1 Heterologous Expression

Three plants from each treatment were collected randomly at the end of the growth period to determine the leaf fresh weight (LFW) of primary and secondary stem leaves at agroinfiltration. Five other plants were collected for vacuum infiltration at Laval University (Jutras et al., 2018) or at the Medicago research facility (Québec City QC) with agrobacteria harboring a Medicago proprietary vector with a DNA coding sequence for the H1 antigen of influenza virus, strain A/California/07/09, driven by the Cauliflower mosaic virus 35S constitutive promoter (Jutras et al., 2015). Agroinfiltrated plants were incubated in PGR15 growth chambers (Conviron) for 7 days under ambient CO2 concentration. Conditions in the growth chambers were as follows: air temperature set at 20°C, a relative humidity of 70%, a plant culture density of 55 plants·m−2, tap water irrigation provided as needed, and a PPFD of 150 μmol·m−2·s−1 provided 16 h a day by fluorescent tubes installed above and within the plant canopy. Primary and secondary stem leaves were harvested separately from each plant 7 days after agroinfiltration, weighed to measure LFW at harvest, and frozen at −80°C until use for H1 antigen determinations.

#### H1 Antigen Hemagglutination Assay

Leaf tissue for H1 activity determination was disrupted with 2.8-mm zirconium ceramic oxide beads in an Omni Bead Ruptor 24 homogenizer (Omni International, Kennesaw GA, USA). Leaf proteins were extracted in two volumes (i.e., 1.5 g leaf tissue per 3 ml buffer) of ice-cold 50 mM Tris–HCl, pH 8.0, containing 500 mM NaCl, 1 mM phenylmethylsulfonyl fluoride, and 2 mM sodium metabisulfite. The leaf homogenate was centrifuged at 4°C for 10 min at 20,000× *g* to recover soluble proteins in the supernatant. Protein content was determined according to Bradford (1976), with bovine serum albumin (Sigma-Aldrich) as a protein standard. H1 hemagglutination activity per LFW unit was determined using a Medicago-standardized hemagglutination assay (D'Aoust et al., 2008) based on the method of Nayak and Reichl (2004). Serial two-fold dilutions of the test samples (100 μl) were made in V-bottomed 96-well microtiter plates containing 100 μl of phosphate-buffered saline, to leave 100 μl of diluted sample per well. One hundred μl of a 0.25% (w/v) turkey red blood cell suspension (Bio Link Inc., Syracuse NY, USA) was added to each well and the plates were incubated for 2 h at 20°C. The reciprocal of the highest dilution showing complete hemagglutination was recorded as HA activity. H1 yield per plant (or per "leaf production unit"; see below) was calculated by multiplying HA activity per LFW by the LFW per plant (or per leaf production unit) at harvest.
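The endpoint reading and the yield calculation described above can be sketched as follows. This is a minimal illustration rather than the Medicago-standardized assay: it assumes that the first well corresponds to a 1:2 dilution, that the volume of red blood cell suspension is not counted in the dilution factor, and that the endpoint is the last well of an unbroken run of complete hemagglutination. The conversion of a titer into activity units per gram LFW is not specified in the text and is deliberately left out. Function names and example values are hypothetical.

```python
from typing import Sequence


def ha_titer(complete_agglutination: Sequence[bool], first_dilution: int = 2) -> int:
    """Reciprocal of the highest two-fold dilution with complete hemagglutination.

    `complete_agglutination` lists the wells in order of increasing dilution.
    Returns 0 when even the first well shows no complete hemagglutination.
    """
    titer = 0
    dilution = first_dilution
    for agglutinated in complete_agglutination:
        if not agglutinated:
            break                 # endpoint reached, ignore later wells
        titer = dilution
        dilution *= 2             # serial two-fold dilution
    return titer


def h1_yield(ha_activity_per_g_lfw: float, lfw_g: float) -> float:
    """H1 yield = HA activity per g LFW multiplied by LFW at harvest (g)."""
    return ha_activity_per_g_lfw * lfw_g


if __name__ == "__main__":
    wells = [True, True, True, True, True, False, False, False]  # made-up plate row
    print("HA titer:", ha_titer(wells))
    # Hypothetical unit: 0.25 M units per g LFW and 10 g of leaves at harvest.
    print("H1 yield (units):", h1_yield(0.25e6, 10.0))
```

The same `h1_yield` call applies to a whole plant or to a single leaf production unit, depending on which LFW value is supplied.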

#### Experimental Designs and Statistical Analyses

Lighting trials followed a split-plot experimental design with seven replications in time (each including two sub-replicates) with at least 15 plants per replication unit, two photoperiods (16 vs. 24 h) as main plots and two PPFDs (80 vs. 160 μmol·m−2·s−1) as sub-plots, for a total of four treatments. The pruning trial, with four treatments, followed a factorial experimental design with two replications per trial repeated two times, and five plants per replication unit. The growth hormone trial, with four treatments, followed a factorial experimental design with three replications and five plants per replication unit. Data in each trial were processed by an analysis of variance (ANOVA), using the GLIMMIX procedure of SAS, v. 9.4 (SAS Institute Inc., Cary NC, USA). Protected Fisher LSD (lighting trial) or Tukey's (pruning/hormone trials) multiple comparison tests were used for mean separation following statistically significant ANOVAs, using an alpha value threshold of 5%.
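The layout of the lighting trial can be enumerated in a few lines. The sketch below only restates the design given above (two photoperiods as main plots, two PPFDs as sub-plots, seven replications in time with two sub-replicates each); the names are illustrative. The resulting count of 56 matches the number of replication values reported behind each pooled mean in Table 1, but linking the two is our reading, since the text does not spell out that arithmetic.

```python
from itertools import product

PHOTOPERIODS_H = (16, 24)        # main plots, hours of light per day
PPFDS_UMOL_M2_S = (80, 160)      # sub-plots
REPLICATIONS_IN_TIME = 7
SUB_REPLICATES = 2


def lighting_treatments():
    """Map treatment number to (photoperiod, PPFD), numbered as in the text."""
    combos = product(PHOTOPERIODS_H, PPFDS_UMOL_M2_S)
    return {number: combo for number, combo in enumerate(combos, start=1)}


def replication_values_pooled():
    """Number of replication values when all four treatments are pooled."""
    return len(lighting_treatments()) * REPLICATIONS_IN_TIME * SUB_REPLICATES


if __name__ == "__main__":
    for number, (hours, ppfd) in lighting_treatments().items():
        print(f"Treatment {number}: {hours} h/day at {ppfd} umol m-2 s-1")
    print("Pooled replication values:", replication_values_pooled())
```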

#### RESULTS AND DISCUSSION

The main goal of this study was to assess the effects of common cultural practices such as supplemental lighting, apical pruning, or growth hormone treatment on the overall yield of a clinically useful vaccine antigen transiently expressed in *N. benthamiana*. Given the well-known effects of cultural practices on host plant development patterns, a first step toward this goal was to evaluate the contribution of primary and secondary stem leaves to total H1 yield per plant, also considering leaf position relative to the primary stem apex given the link previously reported between leaf age and recombinant protein production in *N. benthamiana* (Robert et al., 2013). To this end, we divided the plant into six parts, or leaf production units, each regrouping leaves of comparable physiological age (**Figure 1**). Primary (P) and secondary (S) stem leaves were considered separately to determine the relative contribution of axillary growth to total protein yield. Both groups of leaves were subdivided into three categories (1, 2, and 3) corresponding to, or emerged from, top (young) (P1, S1), middle (mature) (P2, S2), or bottom (oldest) (P3, S3) leaves down from the main stem apex, to take the impact of leaf aging into account.

plant is composed of six "leaf production units," each regrouping leaves of similar physiological age. P units regroup top (young) leaves (P1), middle (mature) leaves (P2), and bottom (older) leaves (P3) of the main (primary) stem. S units regroup axillary (secondary) stems and leaves emerged from P1 leaves (S1), P2 leaves (S2), and P3 leaves (S3). Main stem apex (Apex) includes the main stem apical bud and small leaves that appeared after agroinfiltration during the protein expression period.

#### The H1 Antigen Is Not Distributed Evenly in the Plant

We first produced a reference table for the relative contribution of each leaf production unit to total biomass and H1 yield on a whole plant basis (**Table 1**, **Figure 2**). Average, consolidated values were calculated for harvested biomass and H1 activity per g LFW using the whole set of data produced for Treatments 1–4 during the lighting trial. In line with Robert et al. (2013) reporting high-yield expression of a recombinant mammalian antibody in young leaves of the main stem, P1 and P2 were by far the most productive units for H1, both showing a specific hemagglutinin activity greater than 0.25 M Units/g LFW (**Figure 2A**) and accounting, together, for 56% of total H1 activity units in the plant for less than 40% of the leaf biomass (**Table 1**). In sharp contrast, P3 showed a specific H1 activity lower than 0.05 M Units/g LFW and accounted for only 12% of total H1 activity despite representing almost one-third of the leaf biomass. As for P units, a leaf position-related decline of H1 activity was observed in the S units on a LFW basis (**Figure 2A**), associated with an H1 yield/leaf biomass contribution ratio lower than 1 for S3 compared to a ratio well above this value for the S1 and S2 units (**Table 1**). As expected, given the less advanced physiological age of S3 leaves compared to P3 leaves from which they emerged, a specific H1 activity of ~0.12 M Units/g LFW was measured for the S3 unit, more than twice the activity measured for P3 (**Figure 2A**). This, along with a harvested biomass for S3 much greater than the biomass harvested for S1 and S2, prevented the onset of a leaf age-related decline of total H1 yield in S production units such as that observed in the P units (**Figures 2B,C**).

TABLE 1 | Relative contribution of P and S leaf production units to total biomass and H1 yield per plant following heterologous expression.¥

*¥ Data are expressed relative to total leaf biomass or H1 antigen yield per plant (100%) after heterologous expression. Each value is the mean of 56 independent (replication) values.*

produced for Treatments 1–4 during the PPFD/photoperiod lighting trial. Each bar is the mean of 56 independent (replication) values ± se.

#### Axillary Stem Leaves Act As H1 Yield Contributors and Source Organs for Young, High-Protein Expressing Leaves on the Main Stem

We conducted a complementary experiment to determine whether P3 leaves, although producing a small amount of H1 given their large biomass, would contribute to the overall H1 yield per plant by behaving as source organs for younger leaves actively expressing the recombinant antigen (**Table 2**). The possible contribution of axillary leaves as source organs was assessed in parallel to reveal any reliance of young, fast-growing P1 and P2 leaves on external metabolic resources from the S production units during H1 expression. Seedling transplants were left to grow for 3 weeks in the greenhouse and leaf-infiltrated with agrobacteria harboring the H1 gene construct. Right after infiltration, the oldest six leaves on the main stem (corresponding to P3), all axillary stems (bearing S unit leaves), or both groups of leaves were gently removed, keeping intact the main stem, the P1/P2 units, the P3 unit, and/or the S units before H1 expression in the growth chamber. Despite a total leaf biomass reduced by 28%, P3-excised plants produced as much H1 as non-excised control plants under the same cultural conditions (**Table 2**). As expected, given the significant contribution of S units to total H1 yield per plant, axillary stem-excised plants showed a total H1 yield loss of 22% for a leaf biomass reduced by 20%, in close match with the yield/biomass contribution ratio of 1.1 calculated above for the S units (see **Table 1**). By comparison, plants devoid of both the S and P3 leaves showed a total yield loss of 38% for a total biomass reduced by 47%, about two times a consolidated yield loss of 20% calculated for the S and P3 units excised separately. These data confirmed the reliance of P1 and P2 production units on metabolic resources provided by the S and P3 units during H1 expression. They also underlined the key role of axillary stem leaves in *N. benthamiana* used as a protein factory, both as contributors to recombinant protein yield and as source organs to sustain the yield contribution of young, highly-expressing leaf production units on the main stem.

TABLE 2 | Impact of P3 and/or S1–S3 units removal at infiltration on total biomass and H1 yield per plant following heterologous expression.¥

*¥ Data are expressed relative to non-excised control plants (0% loss). Each value is the mean of three independent (replication) values ± se.*
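The comparisons made in this paragraph reduce to two simple ratios, sketched below with the percentages quoted in the text. The function names are illustrative, and the 20% "consolidated" loss is taken as reported rather than recomputed, because the individual Table 2 values are not reproduced here.

```python
def loss_ratio(yield_loss_pct: float, biomass_loss_pct: float) -> float:
    """H1 yield lost per unit of leaf biomass removed (both in % of control)."""
    return yield_loss_pct / biomass_loss_pct


def excess_loss_factor(combined_loss_pct: float, separate_losses_pct: float) -> float:
    """Yield loss after combined excision relative to the summed separate losses."""
    return combined_loss_pct / separate_losses_pct


if __name__ == "__main__":
    # Axillary stems removed: 22% yield loss for 20% biomass loss.
    print(f"S units removed: {loss_ratio(22, 20):.2f}")
    # S and P3 removed together: 38% yield loss for 47% biomass loss.
    print(f"S and P3 removed: {loss_ratio(38, 47):.2f}")
    # Combined loss (38%) against the consolidated separate loss (20%).
    print(f"Combined vs. separate: {excess_loss_factor(38, 20):.1f}x")
```

A factor close to 2 in the last line is what the text describes as "about two times": removing both groups of source leaves costs more H1 than the sum of the losses observed when each group is removed alone.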

#### High PPFD and an Extended Photoperiod Enhance Total H1 Yield Per Plant *via* a Positive Effect on Axillary Stem Growth

We took a closer look at our lighting trial data to characterize the impact of light regime on leaf biomass production and H1 yield per leaf production unit (**Figure 3, Supplementary Table S1**). Light intensity and day length directly determine the amount of light available to plants for photosynthesis and supplemental lighting generally has a positive impact on biomass production in greenhouse settings (Dorais and Gosselin, 2002). Accordingly, the overall yield of mouse antibody MGR48 in transgenic tobacco plants was promoted under high-light conditions, mostly *via* a positive effect on leaf biomass production (Stevens et al., 2000). Similarly, increasing PPFD from 80 to 160 μmol·m−2·s−1 here had a limited impact on H1 activity per LFW in both P and S leaves (**Figure 3A**) but a strong positive impact on leaf biomass and total H1 yield in both groups of leaves (**Figures 3B,C**). For instance, leaf biomass and total H1 yield were improved by 25–50% in P leaves provided 16 h/day with the high PPFD regime of 160 μmol·m−2·s−1 (Treatment 2; see Light regime treatments, above). Likewise, leaf biomass and total H1 yield were improved by more than 60% in S leaves provided 24 h/day with the highest PPFD (Treatment 4). Extending daylight photoperiod from 16 to 24 h also had a positive impact on leaf biomass and H1 yield per plant, for instance improving H1 yield by ~50% in P units at low PPFD (Treatment 3) or by ~100% in S units at high PPFD (Treatment 4) (**Figure 3C**).

A high PPFD, or an extended photoperiod, were sufficient to reach maximal biomass or H1 yield gains for the P units under our experimental conditions (**Figures 3B,C**). For instance, comparable biomass or H1 yield gains were obtained for P leaves provided 16 h/day with the highest PPFD (Treatment 2), 24 h/day with the lowest PPFD (Treatment 3), or 24 h/day with the highest PPFD (Treatment 4). Likewise, a total H1 yield/biomass contribution ratio of 0.9 was calculated for P leaves grown under Treatment 4, compared to a ratio of 1.0 for plants grown under the other three light regimes (**Supplementary Table S1**). By comparison, additive effects of PPFD and day photoperiod were observed for the S units of plants grown under Treatment 4 (**Figures 3B,C**). For instance, leaf biomass and H1 yield per plant were higher in S leaves under this treatment than in S leaves grown under Treatment 2 or Treatment 3. Similarly, the H1 yield/biomass ratio in S leaves increased with light intensity or day photoperiod, from 0.9 for Treatment 1 to 1.0 or 1.1 for Treatments 2 or 4, respectively (**Supplementary Table S1**). These data pointed overall to the limited potential of P units for additional yield gains under commonly used lighting regimes in greenhouse settings. By contrast, they suggested the potential of axillary growth-promoting cultural practices for additional protein yield gains on a whole plant basis.

#### Tip Pruning and 6-BAP Treatment Both Promote Axillary Stem Growth But Differentially Impact Total H1 Yield Per Plant

Apex pruning and foliar application of synthetic cytokinins are common cultural practices to control plant architecture for improved yield or trait quality in greenhouse settings. These two cultural practices exert a strong downregulating effect on shoot apical dominance over axillary buds, to induce bud outgrowth and axillary stem growth *via* complex signaling pathways dependent on auxins, cytokinins, strigolactones, gibberellins, sugars, and an array of protein receptors and gene regulators (Ferguson and Beveridge, 2009; Domagalska and Leyser, 2011; Rameau et al., 2015). Here, we tested the potential of mechanical pruning and 6-BAP treatment to promote H1 yield in *N. benthamiana via* a positive effect on axillary stem growth (**Figures 4, 5**). Apex pruning 7 or 12 days after seedling transplantation had little impact on total biomass at harvest but showed strongly divergent effects on growth of the P and S leaf production units, to generate leaf biomass composed of S leaves at 50–80% compared to less than 35% for untreated control plants (**Figure 4A**). By comparison, 6-BAP treatment had little impact on P leaves but a positive impact on axillary growth, to give a fresh biomass increased by 15% at harvest, mostly explained by a 35–40% increase of S leaf biomass (**Figure 5A**). Despite positive effects of both treatments on axillary growth, pruning and 6-BAP had strongly divergent impacts on total H1 yield per plant (**Figures 4B, 5B**). Whereas apex pruning treatments showed null or negative effects on total H1 yield, 6-BAP treatments increased this variable from ~6 M H1 units/plant in untreated plants to more than 10 M units/plant in treated plants, for a relative yield increase of 65–75% in plants given a 6-BAP working dose of 100 or 500 ppm 7 days post-transplantation. Additional studies are now needed to decipher the physiological effects of both cultural practices on H1 yield changes at the plant scale. The negative impact of apex pruning on H1 yield/plant was likely associated to some extent with the removal of young, high-expressing leaves on the main stem, but the large yield reduction observed in P units given the biomass produced suggests additional effects *in planta*. Likewise, the positive impact of 6-BAP on H1 yield/plant was associated with an increased biomass of the S units, but an H1 yield increase of 65–75% measured in plants treated with the synthetic hormone suggests additional causes given the total biomass increase of only 15% on a whole plant basis.

FIGURE 4 | Apex pruning positively influences axillary growth but has no positive impact on H1 yield per plant. Main stem apices were removed 5, 7, or 12 days after seedling transplantation, and the pruned plants then left to grow in greenhouse for an additional 16, 14, or 9 days, respectively, before vacuum infiltration and transfer in growth chamber. Harvested biomass was recorded (A), and total H1 yield determined (B), for P and S leaf production units following heterologous protein expression. Each bar is the mean of four independent (replication) values ± se. Ctrl, control, no-pruning treatment. P, leaf production units P1–P3; S, leaf production units S1–S3.
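To make the 6-BAP comparison concrete, the sketch below estimates how much of the yield gain is left unexplained by biomass alone. It assumes, purely for illustration, that H1 yield would scale in proportion to total leaf biomass if nothing else changed. The function name is hypothetical and the inputs are the rounded figures quoted in the text.

```python
def gain_per_unit_biomass(yield_gain_pct: float, biomass_gain_pct: float) -> float:
    """Relative change (%) in H1 yield per unit of leaf biomass."""
    yield_factor = 1 + yield_gain_pct / 100
    biomass_factor = 1 + biomass_gain_pct / 100
    return (yield_factor / biomass_factor - 1) * 100


# Rounded figures from the text: 15% more total biomass, 65-75% more H1 yield.
BIOMASS_GAIN_PCT = 15.0
residual_gains = [
    round(gain_per_unit_biomass(yield_gain, BIOMASS_GAIN_PCT))
    for yield_gain in (65.0, 75.0)
]
print(residual_gains)
```

Under that proportionality assumption, yield per unit of biomass would still rise by roughly 40–50%, which is in line with the suggestion that the biomass increase alone does not account for the 6-BAP effect.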


# CONCLUSION

Cultural conditions have a significant impact on recombinant protein yields in *N. benthamiana* (Fujiuchi et al., 2014, 2017; Matsuda et al., 2017, 2018; Shang et al., 2018). Our goal in this study was to document possible links between the effects of some current cultural practices on plant development and the production yield of influenza virus hemagglutinin H1, a promising vaccine antigen, in leaves of *N. benthamiana* used as a host for protein expression. Plants grown under different supplemental lighting regimes, submitted to tip pruning, or treated with the axillary growth-promoting hormone 6-BAP were agroinfiltrated to express the recombinant antigen for a week in growth chamber. In accordance with previous accounts on the distribution of recombinant mammalian antibodies in tobacco leaves (Stevens et al., 2000; Buyel and Fischer, 2012), we showed the importance of young leaves for H1 yield per plant, unlike older leaves, which accounted for a significant part of the plant biomass but contributed little to total antigen titer. We also documented the key contribution of axillary stem leaves, which under certain conditions contributed more than 50% of the antigen yield despite accounting for less than a third of the total biomass. These findings underline the relevance of considering both main stem and axillary stem leaves for a valid monitoring of heterologous protein expression in *N. benthamiana*. They also support the practical potential of 6-BAP in greenhouse settings to modulate host plant architecture for improved protein yields. Additional studies will be welcome in coming years to test the general effectiveness of this approach with other recombinant proteins and to decipher the physiological effects of cytokinins in agroinfiltrated leaves during recombinant protein expression. Studies will also be welcome to further assess the contribution of the oldest leaves on the main stem, paying attention in particular to their impact on plant growth before leaf agroinfiltration.

# DATA AVAILABILITY

No datasets were generated or analyzed for this study.

# AUTHOR CONTRIBUTIONS

M-CG contributed to the conception of the project, to the experimental design, to the experiments, to data acquisition and processing, and to the first draft of the manuscript. LG contributed to the experimental design, to the experiments, to data acquisition, and to the coordination of the project. MG, A-MM, and A-CL contributed to the experimental design and to the experiments. GÉ and NB contributed to the experimental design. MM, M-AD, AG, and SP contributed to the conception of the study and to the experimental design. DM contributed to the conception of the project, to the experimental design, to data processing, and to the writing of the manuscript.

#### FUNDING

This work was funded by a CRD grant from the Natural Sciences and Engineering Research Council of Canada to AG, SP, and DM (NSERC-CRD 442139–12), with the financial support of Medicago Inc. (Québec City, QC, Canada).

#### ACKNOWLEDGMENTS

Technical help by Medicago staff and financial support by Medicago are acknowledged and much appreciated.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/article/10.3389/fpls.2019.00735/full#supplementary-material


**Conflict of Interest Statement:** M-AD, NB, and MM are currently employed by Medicago. Several coauthors on this manuscript are named inventors on a patent describing the effects and uses of growth-promoting hormones to improve recombinant protein yields in plants.

The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Goulet, Gaudreau, Gagné, Maltais, Laliberté, Éthier, Bechtold, Martel, D'Aoust, Gosselin, Pepin and Michaud. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Plant and Microalgae Derived Peptides Are Advantageously Employed as Bioactive Compounds in Cosmetics

Fabio Apone<sup>1,2</sup>, Ani Barbulova<sup>1</sup>\* and Maria Gabriella Colucci<sup>1,2</sup>

<sup>1</sup> Arterra Bioscience srl, Naples, Italy, <sup>2</sup> Vitalab srl, Naples, Italy

#### Edited by:

Jussi Joonas Joensuu, VTT Technical Research Centre of Finland Ltd., Finland

#### Reviewed by:

Pedro Carrasco, University of Valencia, Spain

Jules Beekwilder, Wageningen University & Research, Netherlands

\*Correspondence: Ani Barbulova, ani@arterrabio.it

#### Specialty section:

This article was submitted to Plant Metabolism and Chemodiversity, a section of the journal Frontiers in Plant Science

Received: 15 January 2019 Accepted: 24 May 2019 Published: 12 June 2019

#### Citation:

Apone F, Barbulova A and Colucci MG (2019) Plant and Microalgae Derived Peptides Are Advantageously Employed as Bioactive Compounds in Cosmetics. Front. Plant Sci. 10:756. doi: 10.3389/fpls.2019.00756

Bioactive peptides (BP) are specific protein fragments that are physiologically important for most living organisms. It is proven that in humans they are involved in a wide range of therapeutic activities, such as antihypertensive, antioxidant, anti-tumoral, anti-proliferative, hypocholesterolemic, and anti-inflammatory activities. In plants, BP are involved in the defense response, as well as in cellular signaling and the regulation of development. Most of the peptides used as ingredients in health-promoting foods, dietary supplements, pharmaceutical, and cosmeceutical preparations are obtained by chemical synthesis or by partial digestion of animal proteins. This makes them not fully accepted by consumers because of the risks associated with solvent contamination or the use of animal-derived substances. On the other hand, plant and microalgae derived peptides are known to be selective, effective, safe, and well tolerated once consumed, and thus have great potential for use in functional foods, drugs, and cosmetic products. In fact, interest in plant and microalgae derived BP is rapidly increasing, and in this review we highlight and discuss the current knowledge about their study and application in the cosmetic field.

Keywords: bioactive, peptide, plant, microalgae, cosmetics

# BIOACTIVE PEPTIDES IN ANIMALS AND PLANTS

Peptides are short amino acid chains, usually ranging from 2 to 20 units, with a molecular weight under 3 kDa. As most of them possess a wide range of biological activities in living organisms, they are generally defined as bioactive peptides (BP). Due to their specific amino acid sequence and 3D conformation, BP are capable of interacting with a huge number of biological macromolecules and biochemical compounds, modulating several functions and conditions in living organisms (Moller et al., 2008). Their mode of action resembles that of hormones and drugs: they are capable of activating signal transduction mechanisms in cells, leading to an up-regulation or down-regulation of the expression of important regulatory genes. It is proven that in humans endogenous BP are involved in a wide range of therapeutic activities, such as anti-microbial, anti-hypertensive, antioxidant, anti-tumoral, anti-proliferative, hypocholesterolemic and anti-inflammatory activities (Sánchez and Vázquez, 2017). Besides their primary functions in triggering transduction mechanisms, they can also exert a nutritive function, providing cells with amino acid units as building blocks for new proteins, thus contributing to the physiological protein turnover (Hartmann and Meisel, 2007).

Analogously, in plants the role of endogenous BP has been extensively studied and linked to defense response mechanisms, as well as cellular signaling during plant development stages (Schaller, 2001). A class of cationic, antimicrobial peptides (AMPs), was found to be involved in the plant innate immune response, which was one of the main mechanisms of defense against pathogens. AMPs exhibited bactericidal and fungicidal activity, thanks to their amphiphilic structure and positive charge, that facilitated the interaction and the integration into the negatively charged microbial membranes (Suarez et al., 2005). Other types of plant endogenous peptides include regulatory BP, such as systemin and enod40, whose roles in the plant wound response, mitotic activity and differentiation have been extensively reviewed (Schaller, 2001).

Signaling peptides, such as plant natriuretic peptides (PNP), phytosulfokines (PSK), and rapid alkalinization factor (RAF), were purified from different plant species, and their involvement in homeostatic functions has been established (Salas et al., 2015). Another group of plant signaling peptides, members of the CLE (CLAVATA3/ESR-related) family, synthesized in the plant apical or root meristem region, were studied for their role in growth and differentiation, as well as in the maintenance of the balance between differentiation and stemness (Ito et al., 2006; Salas et al., 2015).

Although many BP exist as free forms in both animals and plants, the vast majority of known BP are encrypted in the structure of a parent protein and are produced and released only after chemical or enzymatic hydrolysis. Thus, several classes of proteins of plant and animal origin hide active portions with specific amino acid sequences in their structure, making them potential sources of encrypted BP (Carrasco-Castilla et al., 2012; Bhat et al., 2015). The characterization of the BP released after protein hydrolysis has highlighted their wide range of biological activities related to nutraceutical and medical applications (Maestri et al., 2016). BP-rich hydrolysates obtained from different legume plants, such as chickpea, soybean, pea, lentil, and mung bean, have been studied and characterized for their antimicrobial and antihypertensive activities (Ariza-Ortega et al., 2014; Maestri et al., 2016). Either as purified forms or as mixtures, BP have been proposed for the treatment and prevention of various medical conditions due to their cholesterol-lowering, antiprotozoal, antiviral, antithrombotic, antioxidant, antihypertensive, and antimicrobial activities (Lemes et al., 2016), and have been extensively characterized as nutraceuticals (Moldes et al., 2017) and functional foods (Haque et al., 2008). BP are becoming popular in the skin and hair care field too, as they possess a wide range of biological activities in skin cells and are capable of triggering signal transduction mechanisms, leading to the activation of important gene regulators. Moreover, unlike large proteins, BP are preferred by formulators thanks to their ability to penetrate the skin more easily and to reach the deeper skin layers when associated with efficient delivery systems (Badenhorst et al., 2014).

# BIOACTIVE PEPTIDES IN THE COSMETIC INDUSTRY

## BP From Synthetic Processes and Animal Protein Digestion

The BP that are employed in cosmetic applications come from different sources. Two of the most widely used methods to obtain peptides are chemical synthesis and the partial digestion of animal proteins. Chemical synthesis uses amino acid mixtures as starting material, which makes it possible to obtain peptides with different amino acid sequences and combinations. The advantage lies mostly in the almost infinite number of sequences that can be intentionally assembled, giving rise to a huge range of desired functional structures.

One example is the copper-complexed tripeptide GHK, which enhances the synthesis of extracellular matrix (ECM) proteins in the skin, such as collagen I, thanks to its capacity to penetrate through the stratum corneum and reach the dermis layer (Pickart and Schagen, 2015). A second example is the peptide GEKG, known as tetrapeptide-21, which was studied in vitro and in clinical tests for its capacity to reduce facial wrinkles (Farwick et al., 2011). The tetrapeptide increased collagen production by human fibroblast cultures and induced the expression of type I procollagen mRNA in vivo. Histochemical analysis performed on ex vivo skin explants confirmed that GEKG induced the production of procollagen, hyaluronic acid, and fibronectin. Galloyl-RGD, a tripeptide conjugated to gallic acid, was instead characterized for its capacity to inhibit free radical formation in HaCaT keratinocytes at a concentration of 50 ppm, and to suppress L-3,4-dihydroxyphenylalanine (L-DOPA) formation and oxidation when used at higher doses (Dae et al., 2014). A palmitoyl tetrapeptide, which represents a fragment of an immunoglobulin G, was proven to decrease IL-6 secretion in a basal setting, and served as an anti-inflammatory compound after exposure to UVB irradiation. In vivo reflectance confocal microscopy studies indicated that a blend of the palmitoyl tetrapeptide and another palmitoyl oligopeptide significantly enhanced the synthesis of ECM factors compared to placebo (Watson et al., 2009). In addition, some neurotransmitter inhibitor peptides, such as acetyl hexapeptide-3, which is capable of penetrating the skin and inducing muscle relaxation, have been employed in cosmetic formulas thanks to their effect against wrinkle and fine line formation (Blanes-Mira et al., 2002).

Besides synthetic peptides, other types of peptides commonly used in skin care are those obtained by partial digestion of animal proteins, such as collagen. These peptides, generally referred to as matrikins (Aldag et al., 2016), have been extensively studied for their capacity to induce the production of ECM components in the skin. KTTKS, a peptide derived from the proteolytic hydrolysis of collagen, promoted the production of fibronectin and collagen types I and III in skin fibroblasts, through the activation of the TGF-β pathway (Tsai et al., 2007). A palmitoylated and more stable form of KTTKS (pal-KTTKS) has been developed and shown to be more effective than its precursor in collagen induction, thanks to its capacity to better penetrate the skin stratum corneum. The biomimetic tetrapeptide PEGP, due to the presence of the amino acids glycine and proline, was identified as a potent candidate to delay early-onset skin aging caused by the loss of ECM mass. When tested both in vitro and in vivo, it induced collagen I, elastin and fibronectin, resulting in a significant sharpening of the facial contours and a reduction of wrinkle depth (Maczkiewitz et al., 2018). A further example of BP derived from animal protein digestion is the mixture of peptides obtained from wool keratin. Topical formulations containing keratin-BP (MW < 1000 Da) were described as possessing moisturizing, repair-promoting and potentially radio-protective properties in the skin (Barba et al., 2008).

As an alternative to topical administration, a BP preparation has been studied for its benefits on the skin when taken orally. Dietary supplementation with a specific bioactive collagen peptide (BCP), derived from the hydrolysis of porcine type I collagen, significantly reduced wrinkle volume after 4 and 8 weeks of intake. Additionally, after 8 weeks the content of procollagen type I and elastin was higher by 65 and 18%, respectively, in the BCP-treated volunteers compared to the placebo-treated volunteers (Proksch et al., 2014).

## Disadvantages and Concerns of Synthetic and Animal Derived Peptides

The described examples of BP for cosmetic use, although still requested by the market, present certain drawbacks and are not fully accepted by all consumers, mostly due to the risks associated with the presence of contaminating chemicals used in the process (for chemical synthesis) or the use of animal derived substances (for digested products). The use of chemical reactions to synthesize peptides implies the use of toxic solvents and reagents in the cycle and, although the last step of the process involves the elimination of most chemicals, the final product can still carry contaminating solvent residues or impurities (Guzman et al., 2007; D'Hondt et al., 2014). Even solid-phase peptide synthesis, the most widely used manufacturing procedure for drug peptides today, can generate adducts, impurities and unwanted peptide counter ions, such as trifluoroacetate, which can hardly be eliminated from the final peptide product (D'Hondt et al., 2014).

On the other hand, the use of animal proteins in enzymatic digestions implies very strict veterinary controls of the animal source in order to exclude infections by any kind of pathogenic virus, or the presence of dysfunctional protein aggregates that can cause very serious diseases in humans as well, as in the case of Bovine Spongiform Encephalopathy (BSE) in cattle. Despite the several checkpoints during the whole production process, certain risks associated with the presence of harmful animal derived substances cannot be completely eliminated (Yokoyama and Mohri, 2008). Besides the safety concern, which certainly represents the most relevant priority for cosmetic brands and consumers, the overall tendency of the market is toward eco-certified plant derived ingredients. Plant-extracted compounds are increasingly requested by the market, both because of the trends associated with the idea of well tolerated and less allergenic ingredients, and because the number of strict vegan consumers is progressively increasing worldwide. For Muslims, for example, additional rules for cosmetics imply further considerations related to religion, which restrict or prohibit any contact with products that are inconsistent with Islamic law. Accordingly, Halal certification is required for many cosmetic ingredients to attest the conformity of the product during its whole production cycle, with special attention to animal origin and processing (Rigano, 2017). Both the concerns and the preferences toward plant derived material have prompted skin care ingredient producers to invest substantially in alternative ways to obtain BP that are as effective in terms of biological activity as their animal-derived or synthetic counterparts, and fully guaranteed for safety and sustainability.

## Plant and Microalgae-Derived Peptides for Cosmetic Applications

All plant and microalgae-derived peptides used in Cosmetics today, in particular in the skin care market, are formulated as mixtures of many different protein fragments, obtained from hydrolysis of bigger proteins, using either the chemical or enzymatic methods (**Table 1**). By choosing a specific protein source, and by modulating the extent and the conditions of the hydrolytic process, it was possible to obtain BP of different nature, length and composition, adapted to fulfill desired biological functions. All the examples that we have reported and discussed hereafter refer to raw peptide preparations, often containing other plant cell metabolites, whose activity has been characterized, either in vitro or in vivo, for their capacity to activate significant biological functions in skin cells, acquiring promising potentialities to be developed as cosmetic active ingredients.

An Avena sativa (oat) peptide-rich preparation, obtained from oat bran by enzymatic hydrolysis, was tested in vitro on H2O2-stressed human dermal fibroblasts, and the results demonstrated that the preparation was effective in reducing oxidative stress-induced cell injury, through an enhanced activity of superoxide dismutase (SOD) enzymes and a reduction of malondialdehyde (MDA) levels. Although further exploration of its skin care potential in vivo might be necessary, the oat BP preparation was proposed as a functional ingredient in anti-aging skin creams to help prevent damage induced by oxidative stress and UV (Feng et al., 2013).

In recent studies, the wound healing potential of a Glycine max (soy) lysate, containing BP with amino acid sequences similar to those of ECM proteins present in the human skin, was explored (Tokudome et al., 2012). These ECM-mimetic peptides produced in soy improved wound healing by increasing dermal ECM synthesis, stimulating reepithelialization, promoting cell adhesion and supporting tissue regeneration (Chien et al., 2013). With the main goal of increasing the number of applicative outputs, other authors developed a plant-based nanofibrous cellulose acetate scaffold, supporting a layer of soy BP extract (Ahn et al., 2018). The scaffold membranes, combined with the soy BP hydrolysate, successfully mimicked the physiological properties of the native skin and promoted fibroblast proliferation, migration and infiltration in vitro. In vivo, the scaffolds accelerated reepithelialization and epidermal thickening, and reduced scar formation and collagen mis-assembly during the wound repair stages. These findings provided a valuable example of how plant-derived BP could be employed in a next generation of regenerative dressings to push the envelope of nanofiber technology in the skin care market.

Protein hydrolysates from Triticum vulgare (wheat) have also been produced and proposed as skin and hair-conditioning ingredients in personal care products (Burnett et al., 2018). The BP contained in the hydrolysate, obtained by acidic, enzymatic or other chemical digestion procedures, have been shown to be very effective in vitro in accelerating skin healing and in reducing the release of several inflammatory mediators, such as nitric oxide (NO), interleukin 6 (IL-6), tumor necrosis factor alpha (TNF-alpha), and prostaglandin E2 (PGE-2) (Sanguigno et al., 2018). With the aim of preventing skin sensitization reactions in allergic subjects due to wheat gluten-derived peptides (Adachi et al., 2012), other authors showed that, when the hydrolysis was carried out for longer times, the wheat BP did not exceed 30 amino acids in length and, in contrast to those with average MWs > 10 kDa, did not elicit any hypersensitivity reactions in sensitized individuals, while still keeping their biological activity (Burnett et al., 2018).

A Solanum tuberosum (potato) protein hydrolysate provided an additional example of how BP, generated by a hydrolytic process, activated response mechanisms in skin cells and could be advantageously applied to skin care cosmetics. The potato hydrolysate stimulated the lipid metabolism in skin keratinocytes when added to the culture medium (Popa et al., 2006), and induced a stronger increase of the long-chain fatty acid-containing glucosylceramides in aged keratinocytes than in younger ones (Popa et al., 2010). Moreover, besides the effect on glucosylceramide synthases, it was found that the amounts of neutral phospholipids and gangliosides were significantly increased by the BP-containing potato hydrolysate, suggesting an activation of the PPARs (peroxisome proliferator-activated receptors), which are known to trigger the expression of many lipid biosynthesis genes in keratinocytes (Man et al., 2006).

Similar to plants, plant tissues grown as suspension cultures and obtained through green biotechnological approaches represent an even more valuable source of BP. Besides being free of pathogens, pollutants and agrochemical residues, which could contaminate most plant-derived extracts, plant tissue cultures rarely contain toxic compounds or potential allergens (Apone et al., 2017). Moreover, plant cells grown as suspension cultures are particularly rich in cell wall proteins, whose production is enhanced during the culture formation and the developmental process that occurs when pieces of plant tissue are put in liquid media to form cultures (Apone et al., 2007, 2017). The most abundant proteins present in the cell wall are those belonging to the main classes of hydroxyproline-rich glycoproteins (HRGPs), proline-rich proteins (PRPs), glycine-rich proteins (GRPs), and arabinogalactan proteins (AGPs), with a high content of glycine and proline (Ringli et al., 2001), which are also the most abundant amino acids in human collagen. A mixture of BP, together with sugars derived from the hydrolysis of the glycoprotein saccharides, was obtained from the digestion of Nicotiana sylvestris cell wall glycoproteins and extensively investigated in vitro for its capacity to inhibit oxidative damage in skin cells, as well as to induce DNA repair factors, such as the Growth Arrest and DNA-Damage-inducible protein 45 alpha (GADD45α), and the longevity markers sirtuins (Apone et al., 2010). In addition, the cell wall derived BP stimulated collagen I and collagen III synthesis in human fibroblasts, and significantly inhibited the production of metalloproteinases 1, 3, and 9, suggesting promising skin anti-aging properties for the cosmetic market. Even though the role of the sugars in skin care has not been fully clarified, several lines of evidence indicate that the sugar fraction contained in cosmetic preparations has beneficial effects on hydration and an anti-inflammatory effect on dermal cells (Tolg et al., 2014).

An analogous preparation obtained from Brassica rapa root cultures was studied for its capacity to reduce the accumulation of melanin in skin cells both in vitro and ex vivo, by inhibiting the activity and the expression of tyrosinase, the main enzyme responsible for melanin synthesis in melanocytes (Sena et al., 2015, 2017). The BP negatively affected the melanogenesis signal transduction mechanism by downregulating the expression of MITF, a key transcription factor involved in the melanin pathway. The depigmenting activity produced by the preparation was also evaluated on skin explants, confirming a significant reduction of melanin content both in the sub-basal and in the basal epidermis layers (Sena et al., 2017).

In a more recent study, a peptide/sugar preparation obtained from the cell walls of Lotus japonicus cell cultures was characterized for the presence of proline, hydroxyproline and leucine as the main amino acid components. It was proposed as an anti-aging ingredient for skin care, thanks to its capacity to induce in vitro the expression and the production of the cell-rejuvenating growth factor GDF11, with a consequent restoration of the production level of some ECM components, such as collagens and periostin (Tito et al., 2016, 2019).

The chemistry and biological activities of BP obtained from marine algae have been investigated, although they have not been as well characterized as the peptides from other sources (Fan et al., 2014). A number of BP have also been obtained by hydrolysis of microalgae proteins, but few of them have been proposed for skin care applications. Several microalgae species have a high protein content compared to terrestrial plants, up to 47% of the dry weight, and many of them evolved unique protein sequences and structures thanks to their adaptations to more complex habitats, such as those related to extreme conditions (Bimonte et al., 2016; Berthon et al., 2017).

Among the large variety of microalgae species, the genera Chlorella and Arthrospira (commonly known as Spirulina) certainly represent the most established and well-known sources of compounds in the skin care market. The freshwater unicellular alga Chlorella was already known for its different biological effects on human cells, such as boosting immune functions (Suarez et al., 2010), accelerating dioxin elimination (Morita et al., 1999) and preventing stress-induced ulcers (Tanaka et al., 1997). Recently it was established that the Chlorella-derived peptide (CDP) fraction, with a molecular weight distribution of 430–1350 Da, diminished UVB-induced matrix metalloproteinase-1 (MMP-1) and cysteine-rich 61 (CYR61) gene expression and monocyte chemoattractant protein-1 (MCP-1) production in UVB-irradiated fibroblasts (Chen et al., 2011). The authors also reported that the UVB-suppressed pro-collagen and TbRII (Transforming growth factor, TGF-β, receptor) mRNAs were restored by CDP treatment. Moreover, CDP inhibited UVB-induced MMP-1 expression in skin fibroblasts by reducing the expression of the activator protein-1 (AP-1), CYR61 and MCP-1 (Chen et al., 2011). Further studies by the same authors also revealed that CDP protected against UVC-induced cytotoxicity by inhibiting caspase-3 activity and reducing the expression of the phosphorylated Fas-associated death domain (FADD) and the poly (ADP-ribose) polymerase-1 (PARP-1) (Shih and Cherng, 2012).

Analogously, the microalga Arthrospira platensis (Spirulina), belonging to the group of blue-green Cyanobacteria, is a consistent source of protein, which accounts for approximately 60% of its entire weight (Shabana et al., 2017). A BP-enriched fraction obtained from Spirulina was originally studied for its pharmacological activity on the angiotensin I-converting enzyme (ACE) and its anti-proliferative effect in lung cancer cells (Lu et al., 2010; Czerwonkaa et al., 2018). More recently, the skin care related activity of two different protein hydrolysates from Spirulina, an extract obtained by microbial fermentation (SEF) and an extract obtained by enzymatic digestion (SED), was studied and compared to that of a total undigested extract (SE) (De Lucia et al., 2018). SDS–PAGE analysis of the extracts was performed, and C-phycocyanin was used as an indicator of the degree of protein hydrolysis and of the generation of BP. It was demonstrated that the SED extract, containing protein-derived BP, was the most effective in promoting skin hydration and in counteracting osmotic stress damage in skin cells in vitro, by increasing the gene expression of factors specifically involved in the maintenance of water balance in keratinocytes, such as aquaporin 3, hyaluronic acid synthase 3 and filaggrin. Moreover, the extract was able to abolish the ROS induction caused by oxidative stress agents, and to counteract osmotic stress damage by modulating the Smit gene (Na+/myoinositol cotransporter), which was responsible for an increased uptake of osmolytes (De Lucia et al., 2018).

# Limitations in the Use of Plant and Microalgae-Derived Peptides

As reported in **Table 1**, the most used plant and microalgae peptides in Cosmetics are obtained by protein hydrolysis and consist of mixtures of peptides of different sizes. Although these hydrolysates may have a broad range of biological activities thanks to their complex nature, their use in Cosmetics may be limited when a more specific biological function is preferred. A specific peptide with a known amino acid sequence and length can be more suitable when the interaction with a single, defined target is desired, thus avoiding interactions with other unspecific biological components. This is particularly true in the Pharmaceutical field, where the employment of peptides as drugs is in perfect agreement with the aim of modulating a target molecule without disturbing other biochemical functions and of avoiding undesired side effects. Analogously, the study of a single peptide activity could be preferred in some cosmetic applications as well, but the isolation of a single peptide fraction from plant and microalgae hydrolysates may become very challenging. Moreover, the most frequently used technologies to isolate peptides from more complex mixtures still include chromatography and pressure-driven filtration processes, which would increase the final cost of the products, making them no longer sustainable for the cosmetic industry (Fan et al., 2014).

An additional limitation could be associated with the risk of potential allergens, since most of the peptide preparations are produced as unpurified mixtures of several components (Jack et al., 2013). Plant and microalgae hydrolysates may contain allergens and potentially toxic contaminating compounds, whose presence must be checked and documented. Thus, to ensure conformity to the quality standards and to address the safety concerns, the hydrolysates must be subjected to chemical analysis once they are developed as new active ingredients for cosmetic applications. In other cases, additional studies on how to prevent and limit possible allergic reactions to the identified chemicals may even become necessary (Goossens, 2011). For these reasons, peptide preparations deriving from plant tissue cultures have today become preferable in Cosmetics to common plant extracts, because they derive from biological sources grown in the laboratory under controlled and axenic conditions, where the risk of the presence of potential allergens and environmental pollutants is almost completely eliminated.

#### CONCLUSION AND FUTURE REMARKS

As described through the reported examples, BP deriving from plant and microalgae sources possess a broad spectrum of biological activities, and their use as ingredients in skin care and cosmetic products is finding more and more opportunities.

Moreover, once derived from plants or microalgae, BP can be used in combinations and mixtures with other metabolites in cosmetic formulas, and can exert their biological effects with a very low risk of inducing allergic reactions or undesired side effects.

Despite the significant progress in the isolation and characterization of BP from natural sources, as well as in the assessment of their biological activities, some aspects, such as their large-scale production and the maintenance of their activity in formulas, still need further study in order to extend the range of applications and fulfill the requirements of the continuously growing cosmetic market.

As a future development, plants and microalgae can provide even more innovative opportunities to develop new products for the cosmetic market, thanks to their enormous versatility: for example, new species may represent a valuable source of unexplored BP with novel chemical features and unexpected biological properties. In addition, many plant and microalgae species can be employed as natural biofactories of recombinant proteins and peptides with specific sequences and structures, whose production by chemical synthesis may present some limitations. Indeed, thanks to the recent progress in Molecular Biology, the expression of proteins and peptides in plants and microalgae by genetic transformation has become common practice. Although the employment of genetically modified organisms to produce recombinant compounds has been adopted in Medicine for years, in Cosmetics the feedback of consumers on this topic has not been totally positive so far, probably because of their limited understanding of and unfamiliarity with the issue, and the lack of clarity of the current legislation is certainly not helping either. Nevertheless, proteins and BP produced in plant and microalgae systems can be considered even safer than those produced through conventional methods, and thus they have all the right credentials to be adopted as ingredients in different human health care products.

# AUTHOR CONTRIBUTIONS

All authors conceived and designed the manuscript, and revised, read, and approved the final version of the manuscript.

## REFERENCES

a review of the literature. Clin. Cosmet. Investig. Dermatol. 9, 411–419. doi: 10.2147/ccid.s116158

of in the Agronomical Field. U.S. Patent No WO2007104489. Naples: Arterra Bioscience srl.

and Cosmetic Compositions Containing those Extracts. U.S. Patent No WO2016173867. Naples: Vitalab srl.


**Conflict of Interest Statement:** AB was employed by the company Arterra Bioscience srl.

The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Apone, Barbulova and Colucci. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Assessing Carnivorous Plants for the Production of Recombinant Proteins

Sissi Miguel<sup>1</sup> \* † , Estelle Nisse<sup>1</sup>† , Flore Biteau<sup>2</sup> , Sandy Rottloff<sup>2</sup> , Benoit Mignard<sup>1</sup> , Eric Gontier<sup>3</sup> , Alain Hehn<sup>2</sup> and Frédéric Bourgaud<sup>1</sup>

<sup>1</sup> Plant Advanced Technologies SA, Vandoeuvre-lès-Nancy, France, <sup>2</sup> Laboratoire Agronomie et Environnement, INRA, Université de Lorraine, Vandoeuvre-lès-Nancy, France, <sup>3</sup> Laboratoire Biopi, Université de Picardie Jules Verne, Amiens, France

The recovery of recombinant proteins from plant tissues is an expensive and time-consuming process involving plant harvesting, tissue extraction, and subsequent protein purification. The downstream process costs can represent up to 80% of the total cost of production. Secretion-based systems of carnivorous plants might help circumvent this problem. Drosera and Nepenthes can produce and excrete out of their tissues a digestive fluid containing up to 200 mg.L<sup>−1</sup> of natural proteins. Based on the properties of these natural bioreactors, we have evaluated the possibility of using carnivorous plants for the production of recombinant proteins. In this context, we have set up original protocols of stable and transient genetic transformation for both Drosera and Nepenthes sp. The two major drawbacks, namely the proteases naturally present in the secretions and the polysaccharidic network composing the Drosera glue, were overcome by modulating the pH of the plant secretions. At alkaline pH, digestive enzymes are inactive and, in the case of Drosera, the interactions between the polysaccharidic network and proteins are weakened, allowing the release of the recombinant proteins. For D. capensis, a concentration of 25 µg of GFP/ml of secretion (2% of the total soluble proteins from the glue) was obtained for stable transformants. For N. alata, a concentration of 0.5 ng of GFP/ml of secretion (0.5% of total soluble proteins from secretions) was reached, corresponding to 12 ng in one pitcher after 14 days for transiently transformed plants. This plant-based expression system shows the potential of biomimetic approaches leading to an original production of recombinant proteins, although the yields obtained here were low and did not qualify these plants for an industrial platform project.

Keywords: carnivorous plant, recombinant protein, secretion-based platform, Drosera, Nepenthes, agroinfiltration, virus-based plant transformation, protease

#### Edited by:

Suvi Tuulikki Häkkinen, VTT Technical Research Centre of Finland Ltd., Finland

#### Reviewed by:

Supaart Sirikantaramas, Chulalongkorn University, Thailand

Andrej Pavlovič, Palacký University Olomouc, Czechia

\*Correspondence:

Sissi Miguel sissi.miguel@plantadvanced.com

†These authors have contributed equally to this work

#### Specialty section:

This article was submitted to Plant Metabolism and Chemodiversity, a section of the journal Frontiers in Plant Science

Received: 28 November 2018 Accepted: 31 May 2019 Published: 19 June 2019

#### Citation:

Miguel S, Nisse E, Biteau F, Rottloff S, Mignard B, Gontier E, Hehn A and Bourgaud F (2019) Assessing Carnivorous Plants for the Production of Recombinant Proteins. Front. Plant Sci. 10:793. doi: 10.3389/fpls.2019.00793

# INTRODUCTION

Development of efficient and cost-competitive expression systems for the production of recombinant proteins constitutes a scientific but also an important economic challenge in the field of biotechnology. Besides bacterial and mammalian cells, plants are considered interesting models for several reasons. Like mammalian cells, plant cells can perform posttranslational modifications, including the glycosylation required for the optimal biological activity of many eukaryotic proteins. Plants are also particularly interesting because of the absence of potential pathogens such as mammalian viruses or oncogenic nucleic acids that could subsequently be transmitted to humans. Scientists and industry have taken advantage of these positive points, and several plant recombinant protein-based products are currently undergoing clinical trials. For example, Elelyso, produced in disposable bioreactors by an engineered carrot root cell line (ProCellEx®) and used for the treatment of Gaucher's disease, was the first recombinant glycoprotein produced by a plant system and has been commercialized since 2012 (Mizukami et al., 2018). This plant cell system has also been used to produce Pegunigalsidase α (treatment of Fabry disease), Alidornase α (treatment of cystic fibrosis) and OPRX-106 (inflammatory bowel diseases), which have reached different clinical trial phases<sup>1</sup>. Another company, Greenovation GmbH, has developed a moss-based platform in which Moss-aGal (treatment of Fabry disease) is under clinical trial phase I investigation, whereas Moss-FH (human complement Factor H) and Moss-GAA (treatment of Pompe disease) are in preclinical development<sup>2</sup>.

The production and purification of pharmaceutical proteins goes through unavoidable steps such as tissue disruption, clarification of the crude extract, and purification of the product according to good manufacturing practices (GMP) (Fischer et al., 2012). These downstream processes represent up to 80% of the protein production costs and are, therefore, a key factor for the commercial use of production systems (Drossard, 2005). In comparison to microbial and mammalian cells, extraction from plants generates large amounts of cell debris as well as many unwanted small plant contaminants, including secondary metabolites, chlorophyll and endogenous proteins. To bypass these time-consuming extraction and clarification steps, plant secretion-based systems have been considered. Several new-generation platforms have been reported, such as plant cell suspensions secreting recombinant proteins into the growth medium (Nicotiana tabacum BY-2 cells, ProCellEx® system, hairy roots, ...) (Misaki et al., 2001; Häkkinen et al., 2014; Xu and Zhang, 2014; Tekoah et al., 2015; Cardon et al., 2018) or rhizosecretion of recombinant proteins from plants cultivated in hydroponic conditions (Madeira et al., 2016a,b).

Carnivorous plants are able to attract, trap, retain, kill, and digest prey (Juniper et al., 1989). They are found all around the world, growing on nutrient-poor soils. They have evolved an original trait to circumvent the shortage of mineral nitrogen resources: their leaves form traps for catching prey, which becomes an original source of nutrients. The prey is subsequently digested, allowing substantial recovery of nitrogen-rich molecules. Drosera and Nepenthes are two genera of carnivorous plants able to produce and excrete out of their tissues a significant amount of digestive fluid. Drosera leaves are covered on their upper face by stalked glands secreting a sticky and viscous digestive mucilage. Nepenthes leaves are differentiated into pitchers, the lower internal part being covered by glands secreting a digestive liquid (Juniper et al., 1989). Their digestive fluids contain proteins (especially digestive enzymes), a polysaccharide (responsible for the viscoelastic properties), secondary metabolites (mainly antimicrobial compounds), and mineral salts (Juniper et al., 1989; Michalko et al., 2013; Kokubun, 2017; Krausko et al., 2017; reviewed by Miguel et al., 2018). Two major drawbacks have been identified in the use of these carnivorous plants as hosts for recombinant protein production: (1) the polysaccharidic network and (2) the proteases naturally present in the secretions.

<sup>1</sup>http://www.protalix.com

<sup>2</sup>http://www.greenovation.com

The polysaccharide network is reported in the Drosera and Nepenthes genera at different concentrations and is produced by stalked and sessile glands in Drosera and by pitted glands in Nepenthes. Several studies give an approximate description of the chemical composition of this glue in the Drosera genus. The mucilage contains mostly organic substances. Nearly 65% of them correspond to a polysaccharide (Kokubun, 2017) with a molecular weight between 2 × 10<sup>6</sup> and 5 × 10<sup>6</sup> Da (Rost and Schauer, 1977; Erni et al., 2008), composed of L-arabinose, D-xylose, D-galactose, D-mannose, and D-glucuronic acid in the molar ratio of 3.6:1.0:4.9:8.4:8.2 (Gowda et al., 1983). Another molecule found in the mucilage has been described as myo-inositol, a non-polysaccharide organic component. It has been detected between the polysaccharide strands and might act as a cross-linker via the formation of a hydrogen-bond network between the hydroxyl groups (Kokubun, 2017).
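As a small worked illustration, the molar ratio quoted above from Gowda et al. (1983) can be converted into mole percentages. The sketch below only normalizes the five reported ratio values; the function and variable names are illustrative and are not taken from the cited work.

```python
# Molar ratio of the Drosera mucilage polysaccharide as quoted in the text
# (Gowda et al., 1983). Names below are illustrative only.
MOLAR_RATIO = {
    "L-arabinose": 3.6,
    "D-xylose": 1.0,
    "D-galactose": 4.9,
    "D-mannose": 8.4,
    "D-glucuronic acid": 8.2,
}


def mole_percent(ratio):
    """Normalize a molar ratio so that the values sum to 100 mol%."""
    total = sum(ratio.values())
    return {sugar: 100.0 * parts / total for sugar, parts in ratio.items()}


if __name__ == "__main__":
    for sugar, pct in mole_percent(MOLAR_RATIO).items():
        print(f"{sugar}: {pct:.1f} mol%")
```

On this basis, D-mannose and D-glucuronic acid together account for almost two thirds of the sugar units.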

Concerning the native proteases present in the secretions, several studies based on proteomic and genomic investigations have determined the composition of the enzymatic pool of the Nepenthes digestive liquid. A complex mix of proteases was described in the secretions, including aspartic proteases, cysteine proteases, serine carboxypeptidases and prolyl-endopeptidases (Athauda et al., 2004; Takahashi et al., 2005; Stephenson and Hogan, 2006; Hatano and Hamada, 2012; Kadek et al., 2014a; Lee et al., 2016; Rottloff et al., 2016). Only a few of them have been studied in detail; these display an acidic pH-dependent activity (Nepenthesin 1 and 2, Neprosins) (Athauda et al., 2004; Kadek et al., 2014b; Rey et al., 2016; Schräder et al., 2017).

To overcome the bottleneck linked to downstream process (DSP) costs, we have aimed to exploit this natural ability of carnivorous plants to secrete proteins and to assess the possibility to produce recombinant proteins from Drosera and Nepenthes plants. To achieve these goals, we have set up both a stable and a virus-based transient expression system for the production of recombinant proteins in the digestive fluid of these plants. We have also developed technical solutions to limit the impact of digestive proteases and polysaccharide matrix in the recovery of the recombinant proteins.

#### MATERIALS AND METHODS

# Plant Material and Virus

#### Drosera Plants (D. capensis)

Drosera capensis seeds were provided by Karnivore (Colmar, France) and stored at 4°C. Seeds were sterilized by total immersion in a diluted commercial bleach solution containing 0.25% sodium hypochlorite for 5 min, and washed three times with sterile water. After a drying step on sterile paper, the seeds were sown on solid basal medium (capBM) composed of ½ Murashige and Skoog (MS) medium, 2-fold MS vitamin mixture, 2% (w/v) sucrose, 0.05% (w/v) casein hydrolysate, 50 mg/L citric acid, 100 mg/L ascorbic acid, 1 g/L polyvinylpyrrolidone and 0.7% (w/v) HP696 agar (Kalys, Bernin, France), with the pH adjusted to 5.8. The seeds were incubated under a 16 h/8 h day/night photoperiod provided by natural white fluorescent lamps (160 µmol.m<sup>−2</sup>.s<sup>−1</sup>) at a temperature of 23°C. Seedlings were then transferred to fresh medium of the same composition.
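For readers preparing this medium at a different scale, the following minimal sketch converts the capBM additives listed above into amounts to weigh for an arbitrary volume. It covers only the components given as concentrations in the text; the MS salts and vitamin mixture are left out because their amounts depend on the commercial formulation used. All names are illustrative.

```python
# capBM additives as stated in the text, converted to g per litre.
# % (w/v) means grams per 100 mL, so 2% (w/v) corresponds to 20 g/L.
CAPBM_G_PER_L = {
    "sucrose": 2.0 * 10,               # 2% (w/v)
    "casein hydrolysate": 0.05 * 10,   # 0.05% (w/v)
    "citric acid": 0.050,              # 50 mg/L
    "ascorbic acid": 0.100,            # 100 mg/L
    "polyvinylpyrrolidone": 1.0,       # 1 g/L
    "agar": 0.7 * 10,                  # 0.7% (w/v)
}


def amounts_for_volume(recipe_g_per_l, volume_ml):
    """Return the grams of each component needed for volume_ml of medium."""
    if volume_ml <= 0:
        raise ValueError("volume_ml must be positive")
    return {name: g_per_l * volume_ml / 1000.0
            for name, g_per_l in recipe_g_per_l.items()}


if __name__ == "__main__":
    for name, grams in amounts_for_volume(CAPBM_G_PER_L, 500).items():
        print(f"{name}: {grams:.3f} g per 500 mL")
```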

#### Nicotiana benthamiana

fpls-10-00793 June 17, 2019 Time: 17:31 # 3

Nicotiana benthamiana seeds were sown in compost and cultivated in a controlled-environment room under a 16 h/8 h day/night photoperiod with artificial light (70 µmol.m<sup>−2</sup>.s<sup>−1</sup>) at 26°C with 70% relative air humidity. Three-week-old plantlets were transplanted into individual pots and were used after 3–4 weeks for agroinfiltration experiments.

#### Nepenthes Plants (N. alata)

Nepenthes alata plants were purchased from Araflora<sup>3</sup> and cultivated in heated greenhouses with natural light at 23°C and a relative air humidity of 75–85%.

#### Wild-Type TMV (wt-TMV)

Tobacco Mosaic Virus was multiplied by rub inoculation of N. benthamiana leaves, and tissues were collected 6 days post-inoculation. The TMV strain was provided by Prof. Gilmer (IBMP Strasbourg, France).

## Genetic Constructions and Agrobacterium Preparation

To secrete GFP outside plant tissue, we used a GFP version with a signal peptide and without an endoplasmic reticulum (ER) retention signal, as in the case of native digestive enzymes. Consequently, the gfp gene without the ER retention signal was amplified from the binary vector pBin-m-gfp5-ER provided by Pr. Haselhoff (Division of Cell Biology, MRC Laboratory of Molecular Biology, Cambridge, CB2 2QH, United Kingdom). The primers were designed as follows: gfp\_Fw: 5′-GGATCCAAGGAGATATAACAATGAAGACTAATCTTTTTCTC-3′; gfp\_Rev: 5′-TTATTTGTATAGTTCATCCATGCCATGTGTAATCCCAGC-3′. The amplification was done with Platinum Taq DNA Polymerase High Fidelity (Invitrogen™, Thermo Fisher Scientific) and the PCR product was cloned into the pCR8®/GW/TOPO® vector (Invitrogen™, Thermo Fisher Scientific) as specified by the supplier. The coding sequence was subsequently transferred by recombination using LR Clonase II™ (Invitrogen™, Thermo Fisher Scientific) into the binary vectors pGWB2-GW (AB289765.1) (Nakagawa et al., 2007) and pMW388-GW (JX971627.1) (Kagale et al., 2012) to obtain pGWB2-gfp for stable genetic transformation of Drosera and pMW388-gfp (**Supplementary Figure 1**) for Nepenthes transient expression by recombinant virus inoculation. The recombinant plasmids were introduced into Agrobacterium tumefaciens C58C1Rif<sup>R</sup> using the freeze-thaw method (Chen et al., 1994) and selected on solid YEB (10 g/L beef extract, 5 g/L yeast extract, 10 g/L peptone, 15 g/L sucrose, 0.5 g/L MgSO<sub>4</sub>, 20 g/L at pH 7.2) supplemented with 100 mg/L rifampicin, 100 mg/L carbenicillin and 30 mg/L kanamycin at 28°C for 2–3 days.

<sup>3</sup>http://www.araflora.com
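The two cloning primers given above can be sanity-checked with a few lines of code. The sketch below is only an illustrative check, not part of the authors' workflow: it verifies that the forward primer carries an ATG start codon and that the reverse primer, once reverse-complemented, ends with a stop codon. The helper names are hypothetical.

```python
# Cloning primers as printed in the text (5' to 3').
GFP_FW = "GGATCCAAGGAGATATAACAATGAAGACTAATCTTTTTCTC"
GFP_REV = "TTATTTGTATAGTTCATCCATGCCATGTGTAATCCCAGC"

STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def has_start_codon(forward_primer):
    """True if the forward primer contains an ATG."""
    return "ATG" in forward_primer


def ends_with_stop(reverse_primer):
    """True if the coding strand read from the reverse primer ends in a stop codon."""
    return reverse_complement(reverse_primer)[-3:] in STOP_CODONS


if __name__ == "__main__":
    print("forward primer has ATG:", has_start_codon(GFP_FW))
    print("reverse primer encodes a stop:", ends_with_stop(GFP_REV))
```

The forward primer also begins with GGATCC, the recognition sequence of BamHI; whether this site was used in the cloning strategy is not stated in the text.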

# Stable Genetic Transformation of D. capensis

#### Agrobacterium Suspension Preparation

The A. tumefaciens C58C1Rif<sup>R</sup> strains transformed with the binary vectors pGWB2-gfp and pMW388-gfp were cultured in liquid YEB medium containing 100 mg/L rifampicin, 100 mg/L carbenicillin and 30 mg/L kanamycin at 28°C for 2 days at 180 rpm. Four hours before plant transformation, 100 µM acetosyringone was added to the bacterial cultures to activate the vir genes. The bacteria were pelleted by centrifugation for 15 min at 5000 × g, washed twice with fresh YEB medium to remove antibiotics and re-suspended at a cell density of OD<sub>600</sub> 0.8 ± 0.1 in liquid capBM medium.
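Adjusting a resuspended pellet to the target optical density is a simple C1V1 = C2V2 calculation. The sketch below assumes that OD<sub>600</sub> scales linearly with cell density over the range used, which is only approximately true and should be checked on the spectrophotometer at hand; the function name and the example reading are hypothetical.

```python
def dilution_volumes(od_measured, od_target, final_volume_ml):
    """Return (suspension_ml, diluent_ml) giving od_target in final_volume_ml.

    Uses C1 * V1 = C2 * V2 and assumes OD600 is proportional to cell density.
    """
    if od_measured <= 0 or od_target <= 0 or final_volume_ml <= 0:
        raise ValueError("all arguments must be positive")
    if od_measured < od_target:
        raise ValueError("suspension is already more dilute than the target")
    suspension_ml = od_target * final_volume_ml / od_measured
    return suspension_ml, final_volume_ml - suspension_ml


if __name__ == "__main__":
    # Hypothetical reading of OD600 = 2.0, adjusted to the 0.8 used for Drosera.
    suspension, medium = dilution_volumes(2.0, 0.8, 50.0)
    print(f"mix {suspension:.1f} mL suspension with {medium:.1f} mL capBM")
```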

#### Agrobacterium Mediated-Transformation of Drosera

The protocol for the genetic transformation of D. capensis developed for this work is based on the method of Hirsikorpi et al. (2002). Leaf explants of 6-month-old in vitro plants were wounded with a sterile needle and completely immersed in the Agrobacterium suspension for 10 min. Explants were transferred to solid capBM medium and cultivated at 23°C under a 16 h/8 h day/night photoperiod for the co-culture step. After 3 days, explants were washed for 10 min with liquid capBM medium supplemented with 200 mg/L cefotaxime. They were then transferred to solid capBM medium with 60 mg/L kanamycin and 200 mg/L cefotaxime supplemented with 5 µg/L 6-benzylaminopurine.

#### Generation of Transgenic Plants

After 2 months, regenerated plantlets were separated from initial explants and transplanted on solid BM medium supplemented by 5 µg/L 6-benzylaminopurine, 200 mg/L cefotaxime and 300 mg/L kanamycin for 4 months. When plants were 4–5 cm high, they were rooted in solid BM medium supplemented by 200 mg/L cefotaxime, 300 mg/L kanamycin and 250 µg/L indole-3-butyric acid for 2 months.

#### Ex vitro Acclimation

Rooted plants were acclimated in a greenhouse under natural light at a temperature of 23°C and a relative air humidity of 75–85%. The substrate was composed of 1/3 sphagnum and 2/3 blond peat.

#### Obtention of T1 Progeny

Because of the risk of obtaining chimeric transgenic plants due to their multicellular origin, the production of recombinant proteins was based on the T1 generation, obtained 4 months after acclimation. Seeds of potentially chimeric transgenic plants were collected, sterilized and cultivated on solid BM medium supplemented with 100 mg/L kanamycin for 4 months. Then, the selected plants were rooted and acclimated as described above.

## Nepenthes Transient Expression by Recombinant Virus Inoculation

#### Multiplication of Recombinant TMV in N. benthamiana by Agroinfiltration

The Agrobacterium suspension was prepared as described above. The bacterial pellet was resuspended in infiltration buffer (10 mM MES, pH 5.6) and the OD<sub>600</sub> was adjusted to 0.5. The Agrobacterium suspension was injected using a syringe into the abaxial side of fully expanded N. benthamiana leaves. Infiltrated plants were cultivated under a 16 h/8 h day/night photoperiod at 26°C with 70% relative air humidity for 14 days. Leaves were then harvested at 5–7 days post-infiltration and stored at −80°C.

#### Wild-Type and Recombinant Virus Inoculation in Nepenthes Leaves

N. benthamiana leaves infected by wt-TMV and N. benthamiana leaves agroinfiltrated for recombinant virus multiplication were ground in NaPi buffer (0.5 M Na<sub>2</sub>HPO<sub>4</sub>) at pH 8 in a 1:4 ratio using a mortar and pestle. The crushed tissues were supplemented with 1% (w/v) Celite 545 AW (Sigma-Aldrich) and used for rub inoculation of the adaxial face of mature N. alata leaves attached to a just-opened pitcher. Before inoculation, the native secretions of each pitcher were removed, and the pitcher was cleaned with water and refilled with 30–40 ml of phosphate buffer (137 mM NaCl, 2.7 mM KCl, 8.1 mM Na<sub>2</sub>HPO<sub>4</sub>, 1.5 mM KH<sub>2</sub>PO<sub>4</sub> at pH 7.4). Pitchers were bagged to maintain high humidity and to limit contamination. Fifteen minutes after inoculation, the leaves were rinsed with tap water to remove the residual sap and Celite. Leaves were wrapped, and plants were returned to the standard growth conditions.
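The phosphate buffer used to refill the pitchers is specified in millimolar concentrations. As a convenience, the sketch below converts these into grams per litre. It assumes anhydrous salts with standard molar masses; if hydrated forms are used (for example Na<sub>2</sub>HPO<sub>4</sub> dihydrate), the molar masses must be changed accordingly. The names are illustrative.

```python
# Molar masses in g/mol for the ANHYDROUS salts (assumption; adjust for hydrates).
MOLAR_MASS = {
    "NaCl": 58.44,
    "KCl": 74.55,
    "Na2HPO4": 141.96,
    "KH2PO4": 136.09,
}

# Concentrations in mM as stated in the text.
PHOSPHATE_BUFFER_MM = {
    "NaCl": 137.0,
    "KCl": 2.7,
    "Na2HPO4": 8.1,
    "KH2PO4": 1.5,
}


def grams_per_litre(conc_mm, molar_mass):
    """Convert a millimolar concentration into grams per litre."""
    return conc_mm / 1000.0 * molar_mass


def recipe(buffer_mm, volume_l=1.0):
    """Return the grams of each salt needed for volume_l litres of buffer."""
    return {salt: grams_per_litre(mm, MOLAR_MASS[salt]) * volume_l
            for salt, mm in buffer_mm.items()}


if __name__ == "__main__":
    for salt, grams in recipe(PHOSPHATE_BUFFER_MM).items():
        print(f"{salt}: {grams:.3f} g/L")
```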

# Molecular Analysis

#### PCR Based Molecular Characterization of the Drosera Transgenic Plants

Genomic DNA of Drosera was extracted from leaves according to Biteau et al. (2012). Genes of interest integrated into the genomic DNA were detected by PCR using a Master Mix from Thermo Fisher Scientific. The primers designed to amplify the housekeeping gene 18S (AY096118.1) were Dcap18S1\_Fw: 5′-CGTGCAACAAACCCCGAC-3′ and Dcap18S1\_Rev: 5′-TGCGCGCCTGCTGCCTT-3′. The primers designed to amplify the gfp gene were gfp1\_Fw: 5′-ATCCTCGGCCGAATTCAGTAAAGG-3′ and gfp1\_Rev: 5′-AGTTCATCCATGCCATGTGTAATCCC-3′.

#### RNA Extraction and RT-PCR From Drosera and Nepenthes Tissues

Total RNA was extracted from Drosera and Nepenthes tissues using the Spectrum™ Plant Total RNA kit (Sigma-Aldrich) according to the manufacturer's instructions. To efficiently remove genomic DNA, RNA was treated with the Turbo™ DNA Free Kit (Ambion, Thermo Fisher Scientific). The SuperScript III One-step RT-PCR Kit (Invitrogen™, Thermo Fisher Scientific) was used to detect transcripts of interest from 50 ng of total RNA. For Drosera, the primers designed to amplify 18S transcripts were Dcap18S2\_Fw: 5′-GGGTTCGCCCCGGTTGCTCTGATGATT-3′ and Dcap18S2\_Rev: 5′-GGGCCGAGACGATAGGTGCACAC-3′, the primers designed to amplify nptII transcripts were nptII\_Fw: 5′-GGATTGCACGCAGGTTCTCCGGCCG-3′ and nptII\_Rev: 5′-TGGCCAGCCACGATAGCCGCGCTG-3′, and the primers designed to amplify gfp transcripts were gfp2\_Fw: 5′-GGATCCAAGGAGATATAACAATGAAGACTAATCTTTTTCTC-3′ and gfp2\_Rev: 5′-GTCGTGCCGCTTCATATGATCTGGGTATC-3′. For Nepenthes, the primers designed to amplify movement protein transcripts of TMV were TMV-MP\_Fw: 5′-ATGGCTCTAGTTGTTAAAGGAAAAGTGAATATC-3′ and TMV-MP\_Rev: 5′-CACATTTCTAATATTAACTAAAACTTGCCAG-3′, the primers designed to amplify the capsid protein transcript of TMV were TMV-CP\_Fw: 5′-ATGTCTTACAGTATCACTACTCCATCTCAG-3′ and TMV-CP\_Rev: 5′-TCAAGTTGCAGGACCAGAGGTCC-3′, and the primers designed to amplify the gfp transcript were gfp3\_Fw: 5′-ATGAGTAAAGGAGAAGAACTTTTC-3′ and gfp3\_Rev: 5′-GTCGTGCCGCTTCATATGATCTGGGTATC-3′.
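With this many primers, a small table of lengths and GC contents helps when checking an order sheet. The sketch below is purely illustrative and covers only the four gfp RT-PCR primers listed above; it also groups primers that share an identical sequence, which shows that the same reverse primer is used for Drosera (gfp2\_Rev) and Nepenthes (gfp3\_Rev). Function names are hypothetical.

```python
# gfp RT-PCR primers as printed in the text (5' to 3').
PRIMERS = {
    "gfp2_Fw": "GGATCCAAGGAGATATAACAATGAAGACTAATCTTTTTCTC",
    "gfp2_Rev": "GTCGTGCCGCTTCATATGATCTGGGTATC",
    "gfp3_Fw": "ATGAGTAAAGGAGAAGAACTTTTC",
    "gfp3_Rev": "GTCGTGCCGCTTCATATGATCTGGGTATC",
}


def gc_percent(seq):
    """Percentage of G and C bases in an upper-case DNA sequence."""
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def shared_sequences(primers):
    """Return sorted groups of primer names that have identical sequences."""
    by_seq = {}
    for name, seq in primers.items():
        by_seq.setdefault(seq, []).append(name)
    return sorted(sorted(names) for names in by_seq.values() if len(names) > 1)


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name}: {len(seq)} nt, {gc_percent(seq):.0f}% GC")
    print("identical primers:", shared_sequences(PRIMERS))
```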

# Analysis of Protease Activity and Recombinant GFP in the Secreted Liquid

#### Treatment of Secretions

#### **Collection of secretions and recovery of recombinant proteins from Drosera glue**

Because Drosera plants produce a limited amount of viscous liquid, we collected two series of secretions from five non-chimeric plants. These 8-month-old plants were at the same developmental stage and had been obtained from independent transformation events. The harvest was carried out by immersing D. capensis leaves in a buffer (50 mM Tris-HCl, pH 7.5) for 5 min under gentle agitation, and the leaves were rinsed with distilled water after the first harvest. In total, a volume of 200 ml of buffer was recovered and subsequently filtered through a Whatman® filter with a porosity of 2.5 µm. To capture the recombinant proteins from the Drosera mucilage, 1 ml of anion exchange chromatography resin (HyperCel STAR AX, 20197-026, PALL) was added to the 200 ml of sample. The mix was agitated overnight at 4°C, and the resin was recovered and washed with 5 ml of washing buffer (50 mM Tris-HCl pH 7.5, 5 mM NaCl). Recombinant proteins were eluted in three steps with 50 mM Tris-HCl pH 7.5 containing increasing concentrations of NaCl (0.2 M, 0.5 M, and 1 M).

#### **Concentration of Nepenthes secretions by tangential flow filtration and sample preparation**

Two batches of Nepenthes secretions were prepared from four pitchers attached to four leaves inoculated independently with recombinant TMV. The two batches were pooled, filter-sterilized (0.2 µm) and concentrated about 20-fold by tangential flow filtration (Cogent Microscale, Millipore) using a 5K Minimate capsule with Omega membrane (PALL). Recombinant proteins were precipitated overnight at −20°C with 4 volumes of acetone. The samples were then centrifuged at 12,000 × g for 30 min at 4°C. The pellets were dried for 2 h at 37°C.
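Because the secretions are concentrated and precipitated before analysis, any concentration measured on the resuspended pellet has to be related back to the original secretion volume. The sketch below shows this mass-balance step and the acetone volume for the precipitation. It assumes complete protein recovery through filtration and precipitation, which is an idealization; all numbers in the example are hypothetical and are not results from this study.

```python
def acetone_volume_ml(sample_ml, acetone_volumes=4.0):
    """Volume of acetone to add for a precipitation with N volumes of acetone."""
    if sample_ml <= 0 or acetone_volumes <= 0:
        raise ValueError("arguments must be positive")
    return sample_ml * acetone_volumes


def original_concentration(measured_conc, resuspension_ml, original_ml):
    """Back-calculate the concentration in the starting secretion.

    measured_conc is the concentration in the resuspended pellet; the result
    has the same mass unit per mL. Assumes 100% recovery (idealized).
    """
    if resuspension_ml <= 0 or original_ml <= 0:
        raise ValueError("volumes must be positive")
    total_mass = measured_conc * resuspension_ml
    return total_mass / original_ml


if __name__ == "__main__":
    # Hypothetical example: 30 mL of secretion concentrated 20-fold to 1.5 mL,
    # pellet resuspended in 0.1 mL and measured at 300 ng/mL.
    print("acetone to add:", acetone_volume_ml(1.5), "mL")
    print("secretion concentration:",
          original_concentration(300.0, 0.1, 30.0), "ng/mL")
```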

#### Analyses of Carnivorous Plant Secretions

#### **Zymograms with Nepenthes secretion**


Pellets were solubilized in 30 µL of 1× Tris/Glycine buffer (25 mM Tris-HCl, 192 mM glycine, pH 8.3) and 10 µL of native buffer (62.5 mM Tris-HCl pH 6.8, 2% w/v SDS, 0.001% bromophenol blue). The samples were separated by SDS-PAGE using a 10% polyacrylamide gel containing 0.1% casein. The gel was washed twice for 15 min with renaturation buffer (2.5% Triton X-100 at pH 3 or 8) to remove SDS and was incubated overnight at 37°C in an activation buffer. To test the impact of pH on Nepenthes protease activity, a set of activation buffers was prepared: for pH 2, 9.8 mM citric acid and 0.24 mM disodium phosphate; for pH 3, 8 mM citric acid and 4 mM disodium phosphate; for pH 4, 1.6 mM sodium acetate and 8.3 mM acetic acid; for pH 5, 6.8 mM sodium acetate and 3.2 mM acetic acid; for pH 6, 3.7 mM citric acid and 12.5 mM disodium phosphate; for pH 7, 1.9 mM citric acid and 16.2 mM disodium phosphate; for pH 8, 12 mM boric acid and 4.5 mM HCl; for pH 9, 16.7 mM boric acid and 1.6 mM HCl.
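The two acetate buffers listed above for pH 4 and pH 5 can be cross-checked with the Henderson-Hasselbalch equation. The sketch below uses the textbook pKa of acetic acid at 25°C (4.76) and ignores temperature and ionic-strength effects, so it is only an approximate plausibility check rather than a substitute for measuring the pH; the names are illustrative.

```python
import math

PKA_ACETIC_ACID = 4.76  # textbook value at 25 degrees C (assumption for this sketch)

# (acetate mM, acetic acid mM) for the nominal pH 4 and pH 5 buffers in the text.
ACETATE_BUFFERS = {
    4: (1.6, 8.3),
    5: (6.8, 3.2),
}


def henderson_hasselbalch(pka, base_mm, acid_mm):
    """Estimate pH from the conjugate base / acid concentration ratio."""
    if base_mm <= 0 or acid_mm <= 0:
        raise ValueError("concentrations must be positive")
    return pka + math.log10(base_mm / acid_mm)


if __name__ == "__main__":
    for nominal_ph, (base, acid) in ACETATE_BUFFERS.items():
        estimate = henderson_hasselbalch(PKA_ACETIC_ACID, base, acid)
        print(f"nominal pH {nominal_ph}: estimated pH {estimate:.2f}")
```

Both estimates fall within 0.1 pH unit of the nominal values, consistent with the stated recipes.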

#### **SDS-PAGE and western-blot analyses of Nepenthes secretions**

Pellets were solubilized in 30 µL of 1× Tris/glycine buffer (25 mM Tris-HCl, 192 mM glycine, pH 8.3) and 15 µL of denaturing buffer (0.313 M Tris-HCl pH 6.8 at 25°C, 10% SDS, 0.5% bromophenol blue, 50% glycerol, 2 M dithiothreitol). Proteins were heated at 95°C for 10 min. The samples were loaded on Mini-PROTEAN® TGX™ precast gels (10% polyacrylamide, 50 µL, 10-well; Bio-Rad). The separated proteins were blotted onto a polyvinylidene fluoride membrane (PVDF membrane, 0.45 µm, Amersham™ Hybond™ P, GE Healthcare). GFP was detected with a rabbit polyclonal anti-GFP primary antibody (NB600-310, Novus Biological) at 1:5000 dilution in PBS (137 mM NaCl, 2.7 mM KCl, 8.1 mM Na2HPO4, 1.5 mM KH2PO4, pH 7.4) and a secondary anti-rabbit antibody conjugated to alkaline phosphatase (Cat# A0418-1ML, Anti-Rabbit IgG (whole molecule)–Alkaline Phosphatase antibody produced in goat, Sigma-Aldrich) diluted 1:6000 in PBS. Detection was performed with NBT/BCIP (Promega, United States) as substrate.

#### **GFP quantification on Nepenthes secretions**

An enzyme-linked immunosorbent assay (ELISA) was used to quantify the amount of GFP present in the secretions. Pellets were solubilized in 100 µL of PBS and coated overnight at 4°C on microplates (Greiner BioOne, Cat# 655051). A GFP standard (Recombinant Aequorea victoria GFP, AR09180PU-N, Acris) was used to generate a standard curve over a range of 0.16 to 10 µg/ml. A rabbit polyclonal anti-GFP primary antibody (Cat# NB600-310, Novus Biological) was used at 1:2000 dilution. A secondary anti-rabbit antibody conjugated to HRP [Peroxidase AffiniPure Goat Anti-Rabbit IgG (H+L), 111-035-003, Jackson Immuno Research] was used at 1:6000 dilution for detection, with TMB-ELISA (Thermo Fisher Scientific) as substrate. The plate was developed for 15 min and the reaction was stopped with 2 M H2SO4.
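The standard range quoted above (0.16 to 10 µg/ml) is consistent with a seven-point two-fold dilution series, although the text does not state the dilution scheme. The Python sketch below builds such a series and reads an unknown off the curve by piecewise log-linear interpolation. The absorbance values and both function names are hypothetical and serve only as illustration; they are not data from this study, and a four-parameter logistic fit would be a common alternative.

```python
import math


def twofold_series(top, n_points):
    """Concentrations of a two-fold serial dilution starting at `top`."""
    return [top / 2**i for i in range(n_points)]


def interpolate_concentration(absorbance, standards):
    """Estimate a concentration by piecewise log-linear interpolation.

    `standards` is a list of (concentration, absorbance) pairs whose
    absorbance increases strictly with concentration. Returns None when
    the reading lies outside the standard range (no extrapolation).
    """
    pts = sorted(standards)
    for (c_lo, a_lo), (c_hi, a_hi) in zip(pts, pts[1:]):
        if a_lo <= absorbance <= a_hi:
            frac = (absorbance - a_lo) / (a_hi - a_lo)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10**log_c
    return None


# Assumed scheme: seven two-fold steps from 10 down to about 0.16 ug/ml
concs = twofold_series(10.0, 7)
# Hypothetical absorbance readings, for illustration only
readings = [2.10, 1.60, 1.10, 0.70, 0.42, 0.25, 0.15]
standards = list(zip(concs, readings))

print(round(concs[-1], 2))
print(interpolate_concentration(0.90, standards))
```

Readings outside the standard range return `None` rather than an extrapolated value, since such samples would normally be diluted or concentrated and measured again.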

#### **GFP quantification on Drosera glue**

GFP quantification was performed by fluorometry and the concentration was determined by comparing values of the protein extracts to a standard curve made with a standard of recombinant GFP (Recombinant Aequorea victoria GFP, AR09180PU-N, Acris).

#### **Total soluble proteins (TSP) quantification in secretions**

To quantify TSP in secretions, we used a Qubit™ Fluorometer (Invitrogen™, Thermo Fisher Scientific) with the Qubit™ protein assay kit (Invitrogen™, Thermo Fisher Scientific) according to the supplier's instructions.

# RESULTS

#### Production of Recombinant Proteins From Drosera Glue

#### Molecular Characterization of Transgenic D. capensis

Establishing an Agrobacterium-mediated transformation protocol and regenerating transgenic plants requires optimizing several parameters, such as the hormonal balance suited to plant regeneration, the selection of transformed cells, and the parameters governing T-DNA transfer from agrobacteria to plant cells. Such experimental conditions were described by Hirsikorpi et al. (2002) for D. rotundifolia but needed to be revisited for D. capensis, and several parameters differ markedly between the two species. To regenerate D. capensis plantlets from leaf explants, we used only a low quantity of BAP (0.005 mg/L), in contrast to D. rotundifolia, which required a combination of auxin and cytokinin (0.45 mg/L BAP and 0.372 mg/L 1-naphthaleneacetic acid). Transformed plants were selected in the presence of 60 mg/L kanamycin at the explant stage and 300 mg/L after bud emergence, whereas these steps were conducted at a constant kanamycin concentration (400 mg/L) for D. rotundifolia. Finally, plantlets were rooted in 0.27 mg/L IBA, while no hormone was added for D. rotundifolia.

To avoid working with potentially chimeric plants, we produced seeds from self-pollinated transgenic plants. However, we did not determine whether the transgenic lines were heterozygous or homozygous, a characteristic that might affect the amount of recombinant protein produced. The resulting plants were screened by PCR with specific primers for the presence of the gfp coding sequence. The expected 720 bp amplicon was detected in some plants (**Figure 1**). A series of 23 plants was selected, based on the occurrence of the expected PCR product, from a group of 50 plants previously obtained by direct regeneration of transformed calluses. Each of these 50 plants corresponded to an independent transformation event.

Besides the insertion of the T-DNA into the genome, we also assessed the expression of gfp and nptII by RT-PCR on RNA extracted from different recombinant plants. In addition to the 329 bp amplicon corresponding to gfp, we could also amplify a 199 bp amplicon corresponding to the nptII selection marker (**Figure 2**).

FIGURE 1 | Molecular characterization of gfp-transformed D. capensis from the T1 generation produced by chimeric transgenic plants. (A) PCR with primers specific to the 18S gene, used as housekeeping gene (270 bp); (B) PCR with primers specific to the gfp gene (720 bp); M, marker; W, water as negative control; wt, wild-type; P, plasmid harboring the gene as a positive control; 1–12, non-chimeric plants from the T1 generation.

#### Highlighting the Production of Recombinant Proteins Outside Plant Tissues in Drosera Secretions

Glue from wild-type and gfp-transformed D. capensis plants was collected on filter paper and visualized under UV light (395 nm). In contrast to the wild type (**Figures 3-1A,B, 3-2A,B**), transformed plants displayed fluorescence at the tips of the tentacles (**Figures 3-1C–F**) and in the secretions (**Figures 3-2C,D**). These observations show that GFP was produced in the leaf tissues and also secreted into the mucilage by the glandular hairs.

The digestive fluid of Drosera plants is composed of a highly viscous and elastic mucilage, which makes the collection of recombinant proteins difficult. To overcome this problem, we developed a two-step strategy. First, we showed that the polysaccharide network can be loosened by dilution and alkalinization with a buffer adjusted to pH 7.5. Then, to capture and concentrate the released proteins, we used an anion-exchange chromatography resin, which binds negatively charged molecules such as GFP, whose isoelectric point is 5.8. This strategy allowed us to recover functional, concentrated GFP.
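The reasoning behind the resin choice can be written as a simple rule of thumb: a protein carries a net negative charge when the buffer pH is above its isoelectric point, and an anion exchanger retains net-negative molecules. The Python sketch below encodes only this rule, using the pI and pH quoted in the text. The function names are illustrative, and real binding also depends on ionic strength and on the distribution of charges on the protein surface.

```python
def net_charge_sign(pi, ph):
    """Sign of a protein's net charge: -1 above its pI, +1 below, 0 at the pI."""
    if ph > pi:
        return -1
    if ph < pi:
        return 1
    return 0


def binds_anion_exchanger(pi, ph):
    """An anion exchanger carries positive groups, so it retains net-negative proteins."""
    return net_charge_sign(pi, ph) < 0


GFP_PI = 5.8     # isoelectric point quoted in the text
BUFFER_PH = 7.5  # Tris-HCl harvest buffer

print(binds_anion_exchanger(GFP_PI, BUFFER_PH))  # True
```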

#### Quantification of Recombinant Protein Production by Drosera Genus

GFP production by Drosera was quantified by fluorimetry, using commercial GFP to establish a standard curve. Because Drosera glue is autofluorescent, owing to compounds secreted into this matrix, especially secondary metabolites, this native fluorescence was subtracted from the signal obtained with glue from transgenic plants. We thus estimated the production of recombinant protein in Drosera glue at 26.07 ± 2.73 µg of GFP/ml of secretions for five plants. One plant can therefore produce 0.2 ± 0.02 µg of GFP, with a relative yield of about 2% of TSP from the glue.
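The background correction described above can be sketched in a few lines of Python: fit a linear standard curve to the commercial GFP readings, subtract the wild-type glue signal from the transgenic glue signal, and divide by the slope. Because the buffer blank contributes equally to both glue readings, it cancels in the subtraction and the intercept is not needed. All fluorescence values below are hypothetical, and the assumption of a linear response over the working range is ours, not a statement from the original methods.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def gfp_concentration(sample_signal, wildtype_signal, slope):
    """Subtract wild-type autofluorescence, then convert signal to concentration."""
    corrected = sample_signal - wildtype_signal
    return corrected / slope


# Hypothetical standard curve: GFP in ug/ml against fluorescence units
standard_conc = [0, 5, 10, 20, 40]
standard_signal = [50, 550, 1050, 2050, 4050]
slope, intercept = fit_line(standard_conc, standard_signal)

# Hypothetical readings for transgenic and wild-type glue
print(gfp_concentration(2500, 500, slope))
```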

#### Production of Recombinant Proteins Into Nepenthes Digestive Liquid

#### Molecular Characterization of Inoculated Nepenthes by Recombinant TMV

#### **Capacity of TMV to infect Nepenthes**

Tobacco mosaic virus is one of the most extensively studied plant viruses and has consequently become a natural choice for vector development. As a preliminary experiment, we investigated the capacity of wt-TMV to infect Nepenthes tissues. Four mature leaves per plant, each attached to a just-opened pitcher, were inoculated with a crude extract of N. benthamiana infected with wild-type virus. Fourteen days post-inoculation (dpi), total RNA was extracted from different parts of the plants and the presence of the virus was assessed by RT-PCR. Despite the absence of any symptoms, we successfully amplified a specific 502 bp amplicon, corresponding to transcripts of the gene encoding the movement protein, from the inoculated leaves, from the pitcher tissues attached to the inoculated leaf, and from a neighboring leaf (**Figure 4**). This result demonstrates that N. alata is a host plant of TMV, which might therefore be used as a tool for transient expression.

#### **Systemic expression of transiently expressed genes in Nepenthes tissues**

To produce GFP in Nepenthes tissues, we introduced the corresponding gene into the pMW388 vector, yielding a recombinant TMV that lacks the gene encoding the capsid protein (CP) but includes the gfp coding sequence. This recombinant virus was multiplied in N. benthamiana leaves by agroinfiltration. A crude extract was produced from the infected N. benthamiana tissues and used to inoculate Nepenthes leaves. Since the CP is necessary for systemic infection, we co-inoculated the engineered virus with crude extracts from N. benthamiana leaves infected with wild-type TMV, which provides the necessary CP. Four mature leaves per plant, each attached to a just-opened pitcher, were inoculated, and the native secretions of each pitcher were replaced by phosphate buffer at pH 7.4. At 5 and 14 dpi, tissues of the inoculated leaves and the bottom part of the attached pitcher containing the secretions were collected. RT-PCR performed on RNA extracted from these tissues showed expression of the cp gene (from the wild-type TMV) and of the gfp gene (from the recombinant virus), whereas no transcripts were detected in non-inoculated plants (**Figure 5**). This result shows that the engineered TMV can infect Nepenthes tissues and that the CP produced by the wt-TMV can act in trans to support its movement throughout the inoculated plant.

#### Recombinant Protein Production in Nepenthes Secretions

#### **Development of a strategy to protect recombinant proteins from secreted digestive enzymes**

Given the high proteolytic activity of Nepenthes digestive fluid, it was necessary to set up a strategy to protect recombinant proteins secreted into this unfavorable environment. Several studies have reported that some of the proteases present in the digestive fluid depend on an acidic pH (Athauda et al., 2004; Rey et al., 2016). We confirmed these data with casein-containing zymograms performed on N. alata secretions. The gels were incubated in buffers ranging from pH 2 to pH 9 (**Figure 6**). Our results showed that Nepenthes secretions exhibited maximum proteolytic activity at pH 3–4 and that this activity was inhibited above pH 7. Several lysis spots could be observed, indicating degradation of casein, probably by several proteases of different sizes.

We assumed that one way to preserve the recombinant protein might be to increase the pH in the pitcher. We therefore performed a time-course experiment on the production of these proteases. We replaced the native secretion fluid with an alkaline buffer, then collected and analyzed it 3 days later. We made four such replacements, 3 days apart, on the same set of pitchers. This experiment showed that the digestive protein pool is reconstituted within 3 days of removing the pitcher liquid, because we obtained similar protein profiles (**Figure 7**) and similar TSP concentrations, respectively 45.2 ± 9.7, 33.75 ± 9.2, 44.0 ± 20, 51.75 ± 4.3, and 45.2 ± 15.7 µg of TSP/ml of secretion at each harvest (experiments performed with 4 pitchers). It also demonstrates that replacing the native fluid does not interfere with the production of proteins.
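As a rough check on how similar the five harvests are, the Python sketch below computes the overall mean of the quoted TSP concentrations and the deviation of each harvest from it. It uses only the mean values given in the text and ignores the reported standard deviations and the small number of pitchers, so it is an illustration rather than a statistical test.

```python
# Mean TSP concentration (ug/ml) at each of the five harvests quoted above
tsp_means = [45.2, 33.75, 44.0, 51.75, 45.2]

overall_mean = sum(tsp_means) / len(tsp_means)
# Relative deviation of each harvest from the overall mean, in percent
deviations = [100 * (m - overall_mean) / overall_mean for m in tsp_means]

print(f"overall mean: {overall_mean:.2f} ug/ml")
for harvest, dev in enumerate(deviations, start=1):
    print(f"harvest {harvest}: {dev:+.1f}%")
```

Every harvest stays within roughly a quarter of the overall mean, which is smaller than several of the reported standard deviations.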

#### **GFP production by Nepenthes secretions**

To assess the production of GFP in pitchers, engineered and wild-type TMV were co-inoculated onto leaves connected to a mature pitcher whose digestive fluid had been replaced by phosphate buffer at pH 7.4 prior to inoculation. We collected the secretions at 5 and 14 dpi and measured their pH, which confirmed its stability. The secretions were pooled, concentrated by tangential flow filtration and precipitated with acetone. The protein pellets were analyzed by western blot using an anti-GFP serum (**Figure 8**). Whereas no GFP could be detected at 5 dpi, a 27 kDa protein was detected at 14 dpi. This analysis confirmed that GFP can be produced efficiently at the same time as the digestive protein pool. It also showed that alkalinization of the secretions can at least limit or slow down degradation of the recombinant protein. Indeed, several low-molecular-weight bands can be observed, which may be evidence of partial degradation by digestive enzymes.

in separated SDS-PAGE. Zymogram profiles at each pH value separated by a space were grouped together.

FIGURE 7 | Capacity of the Nepenthes pitcher to renew its digestive pool after several collections and replacements of the liquid by an alkaline buffer, shown by SDS-PAGE analysis. (1) Secretion pool after pitcher opening and before the first replacement; (2) Secretion pool 3 days after the first replacement of the digestive liquid by buffer; (3) Secretion pool 3 days after the second replacement; (4) Secretion pool 3 days after the third replacement; (5) Secretion pool 3 days after the fourth replacement.

#### **Quantification of recombinant GFP protein production by Nepenthes model**

An ELISA quantification test was established by using commercial GFP as a standard. This quantitative assay showed that four pitchers attached to four leaves inoculated independently by recombinant TMV can produce 48.2 ± 1.4 ng of GFP in 14 days, namely 11.95 ± 0.45 ng of GFP/pitcher. The GFP was measured at an initial concentration of 0.595 ± 0.025 ng of GFP/ml secretions which represents on average 0.5% of TSP from the secretions.

# DISCUSSION

Producing recombinant proteins for pharmaceutical purposes in plant platforms is an attractive prospect for many well-documented reasons. Plants can be easily cultivated at large scale and at affordable cost. Plants are also attractive because of their ability to correctly fold, assemble and process complex proteins. So far, no contamination by mammalian viruses or pathogens has been reported, and using plants also avoids the ethical issues associated with transgenic animals.

Since the first reports on the construction of transgenic tobacco in the 1980s, plants have been widely used to produce proteins for research purposes (Vyacheslavova et al., 2012) and have been considered for the industrial production of pharmaceuticals. Over these few decades, scientists demonstrated that plants can produce adequate quantities of functional human proteins. A better understanding of gene silencing processes and the use of plant virus-based vectors opened new avenues in the 2000s (Pumplin and Voinnet, 2013). With these methods, mostly based on agroinfiltration, large amounts of recombinant proteins can be produced within a few days, which has increased interest in plant systems in this field. Nowadays, both stable and transient systems are used for the production of such proteins (Paul and Ma, 2011; Sheshukova et al., 2016).

To access any pharmaceutical market, proteins need to be purified, with contaminants removed to acceptable levels. Downstream processing starts with the harvesting of plant biomass, followed by tissue disruption (Menkhaus et al., 2004; Wilken and Nikolov, 2012; Buyel and Fischer, 2014). The subsequent extraction step aims to release the recombinant protein from the plant material into an aqueous buffer. The main problem with plant material is the presence of coproducts such as endogenous cell proteins, intrinsic plant fibers, oils, polyphenols, chlorophylls or organic acids. Furthermore, plant solids are generally more concentrated, wider in size range, and denser than conventional cell culture debris. Some of these contaminants can be removed through clarification using centrifugation and/or filtration (Buyel et al., 2015). These preliminary extraction and clarification steps are quite costly in comparison to the direct purification used for microbial and mammalian cell systems (Buyel et al., 2015) and represent one of the most important drawbacks of the plant expression system.

To overcome this problem, scientists have focused on plant secretion-based systems, several of which have been described in the literature. In the case of rhizosecretion or cell cultures, the proteins can be released into the culture medium. Some of these systems have been described for several proteins and have been used at the industrial scale (Tekoah et al., 2015). The carnivorous plant-based system described in this report can be considered a secretion-based system. Drosera have large leaves covered by glandular hairs, and the bottom of the Nepenthes pitcher is covered by a network of multicellular glands [between 3.72 glands/mm<sup>2</sup> and 8.87 glands/mm<sup>2</sup> in N. alata (Wang et al., 2009)]. In this work, we demonstrated that, thanks to these specialized organs, recombinant proteins follow the same route as native proteins and are excreted outside plant tissues. Recombinant proteins can be secreted and easily recovered by washing Drosera tentacles or by emptying Nepenthes pitchers. The heterologous proteins are, however, mixed with 20–30 endogenous proteins, antimicrobial compounds, mineral nutrients and a polysaccharidic macromolecule (reviewed by Miguel et al., 2018).

Removing the polysaccharide is the main challenge in purifying the secreted recombinant protein. This macromolecular network is responsible for the viscoelastic properties of the secretions and is a key element in prey capture by carnivorous plants (Adlassnig et al., 2010; Król et al., 2012). The shear viscosity of the mucilage is about 10<sup>2</sup> Pa·s (Erni et al., 2008); the viscosity is reported to be maximal at pH 5 and decreases when the pH is raised or lowered (Rost and Schauer, 1977; Kokubun, 2017). These properties are helpful for purifying the recombinant proteins trapped in the glue. Alkalinization, achieved by washing the leaves with a buffer, significantly decreased the viscosity of the glue by disrupting the polysaccharide network without degrading the target protein. The alkaline pH also made the net charge of the protein negative, so that it could be purified with an anion-exchange resin.

For Nepenthes, the exact composition of the biopolymer remains to be elucidated, although the Nepenthes fluid could contain an acidic polysaccharide mucilage similar to that of Drosera (Gaume and Forterre, 2007). Its viscoelastic strength of 1.4 × 10<sup>−2</sup> Pa·s is lower than that of Drosera, which could be linked to a lower quantity of secreted polysaccharide. N. alata exhibits an apparently water-like, non-viscoelastic fluid, in contrast to Nepenthes species that produce viscous secretions, such as N. rafflesiana or N. aristolochioides (Gaume and Forterre, 2007). Purification of the proteins at large scale might therefore be easier in the species used in our study.

Unwanted degradation of recombinant proteins by endogenous proteases is another drawback of plant-based heterologous production systems (Pillay et al., 2014). These enzymes are involved in a multitude of processes from the cellular to the whole-organism level, but the exact functions and targets of most of them are still unknown (van der Hoorn, 2008). When production takes place inside plant tissues, proteolysis might occur in different cellular compartments and during the extraction step (Sharp and Doran, 2001; Drake et al., 2003; Niemer et al., 2014; Pillay et al., 2014). These proteases represent more than 10% of the extracellular proteins (Albenne et al., 2013). Several strategies have been described in the literature to counteract these enzymes, such as (1) using protein stabilization agents such as gelatin, BSA (bovine serum albumin) or other low-value proteins acting as competitive substrates for proteases (Drake et al., 2003; Huang and McDonald, 2012), (2) using inhibitor cocktails, (3) coexpressing protease inhibitors or (4) inhibiting protease expression through gene silencing (Komarnytsky et al., 2006; Kim et al., 2008; Goulet et al., 2010; Robert et al., 2013; Jutras et al., 2016). According to Lallemand and collaborators, the nature of the proteases depends on the production system but also on the developmental stage, the culture medium and the plant species (Lallemand et al., 2015). Thus, obtaining 'omics' data for a given production system is a useful way to identify proteases potentially harmful to the target protein (Rottloff et al., 2016).

TMV and agroinfiltrated by TMV-ΔCP-gfp (2, 3, 5, and 6) at 5 days (2 and 3) and 14 days (5 and 6) post inoculation (St, GFP standard).

In the carnivorous plant-based system, Nepenthes secretions exhibited maximum proteolytic activity at pH 3–4, and this activity was inhibited beyond pH 7, as described by Rey et al. (2016) and confirmed by our experiments. After opening and in the presence of prey, the native pitcher fluid has a pH of 2.8, which favors proteolytic activity and the digestion of insects. Changing the pH to 8 completely inhibited proteolysis in the digestive liquid without disturbing the secretion process of the digestive glands. Easy access to the Nepenthes pitcher therefore makes it convenient to control the degradation of the recombinant protein.

Developing carnivorous plant-based expression systems required original protocols to express genes successfully in plant tissues. A first approach consists of generating stable transgenic plants. The first plant selected for this study was D. capensis, the Cape sundew, which has longer leaves than the common sundew, favoring the production of large amounts of glue (Hirsikorpi et al., 2002). For D. capensis, an Agrobacterium-mediated transformation protocol was developed, based on the work of Hirsikorpi et al. (2002) on D. rotundifolia, with readjustment of several parameters. This method requires 18 months to regenerate transformed plants dedicated to the production of recombinant proteins, which is a long time scale for a research program. To reduce the production time, an alternative genetic transformation technique was successfully developed, based on a floral dip protocol. This type of in planta transformation consists of agro-infecting the male or female gametophytes present in flowers and is used with several species such as Arabidopsis thaliana (Clough and Bent, 1998), Raphanus sativus (Curtis and Nam, 2001), and wheat (Agarwal et al., 2009). It yields a large number of transformed seeds and avoids chimeric plants, since each plant is theoretically derived from an embryo of single-cell origin. A critical parameter for success is the developmental stage of the flower at the time of inoculation with Agrobacterium (Clough and Bent, 1998). Given that the floral stalks of Drosera bear flowers at different developmental stages, some of them are likely to be transformed by immersion in an agrobacteria suspension. In our case, this approach reduced the time needed to produce transgenic plants from 18 to 10 months.

A virus-based expression system was also investigated as another way to shorten production times. With the exception of a polerovirus (Miguel et al., 2016), no virus has been reported to infect carnivorous plants. Tobacco mosaic virus (TMV) has been widely reported to infect a broad range of plants. To check the infectivity of TMV on Nepenthes, we carried out preliminary experiments in which a wild-type strain of the virus was inoculated onto leaves. Two weeks post-inoculation, although no symptoms had appeared, the virus could be detected in the infected leaves and also in neighboring organs, owing to systemic propagation of the virus. These results led us to use TMV-based vectors to produce recombinant GFP in Nepenthes (Gleba et al., 2005). The virus could not be produced in Nepenthes by agroinfiltration; the recombinant virus was therefore propagated by infiltrating N. benthamiana leaves, and Nepenthes leaves were then inoculated with a crude N. benthamiana leaf extract. This second strategy was successful, and we could recover recombinant GFP from TMV-infected Nepenthes leaves. This approach might now be improved by using other virus-derived vectors, such as a full-length TMV-based vector in which the gene of interest is cloned between the CP gene and the MP gene and is controlled by the CP promoter (Lindbo, 2007).

This work establishes the proof of concept of a new plant system based on carnivorous plants for producing recombinant proteins. The results now make it possible to evaluate the potential production of recombinant protein at larger scale for Drosera- and Nepenthes-based systems. In the case of Drosera, since 65 plants can be cultivated in one square meter and digestive fluid can be harvested every 7–10 days, we assume that the production of GFP might reach on average 650 µg/m<sup>2</sup>/year. In the case of Nepenthes, plants commonly bear about 15 mature pitchers. Assuming that 30 plants/m<sup>2</sup> can be cultivated in suspension, 4.5 µg of GFP can be produced per square meter every 2 weeks. Given that 26 collections can be made in 1 year, we can produce 117 µg of GFP/m<sup>2</sup>/year. Considering these small yields per square meter, this platform is certainly not suited to industrial-scale production, but it nonetheless demonstrates the potential interest of carnivorous plants for producing recombinant proteins. The magnitude of these estimated yields has been further confirmed with a human recombinant protein (Intrinsic Factor), for which the same experiments gave similar yields. One prospect for the use of carnivorous plants in protein production might be to use endogenous Nepenthes signal peptides that could increase the transcription of genes, the translation level of recombinant proteins and their secretion into the digestive fluid. Such short sequences, formed by hydrophobic amino acids and located at the N-terminal end of proteins, have been documented (Faye et al., 2005), and access to 'omics' databases might help identify Nepenthes-specific sequences. A Korean company has already patented the use of N. alata signal peptides to improve the secretion of recombinant proteins in trichomes of melon, cucumber, watermelon, rape, and tobacco (Chang, 2005).
Therefore, the interest of carnivorous plants for the production of recombinant proteins may lie in the outstanding organization of their glandular tissues toward the excretion of endogenous proteins, which could inspire biotechnologists working on other plant-based platforms.
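The yield extrapolations in the paragraph above can be retraced with a few lines of arithmetic. The Python sketch below uses only figures quoted in the text; the variable names are illustrative. Two points are our own reading rather than statements from the authors: the quoted Drosera figure of 650 µg/m<sup>2</sup>/year corresponds to 50 harvests per year, and the quoted 4.5 µg per square meter for Nepenthes implies about 10 ng of GFP per pitcher.

```python
def per_year(amount_per_harvest_ug, harvests_per_year):
    """Annual yield per square meter from a per-harvest yield (both in ug)."""
    return amount_per_harvest_ug * harvests_per_year


# Drosera: figures quoted in the text
GFP_PER_PLANT_UG = 0.2
DROSERA_PLANTS_PER_M2 = 65
drosera_per_harvest = GFP_PER_PLANT_UG * DROSERA_PLANTS_PER_M2  # about 13 ug/m2

# One harvest every 7 to 10 days bounds the number of harvests per year
drosera_low = per_year(drosera_per_harvest, 365 / 10)
drosera_high = per_year(drosera_per_harvest, 365 / 7)
# The quoted 650 ug/m2/year corresponds to 50 harvests per year
drosera_quoted = per_year(drosera_per_harvest, 50)

# Nepenthes: figures quoted in the text
PITCHERS_PER_PLANT = 15
NEPENTHES_PLANTS_PER_M2 = 30
GFP_PER_M2_PER_HARVEST_UG = 4.5
HARVESTS_PER_YEAR = 26

pitchers_per_m2 = PITCHERS_PER_PLANT * NEPENTHES_PLANTS_PER_M2
implied_ng_per_pitcher = GFP_PER_M2_PER_HARVEST_UG * 1000 / pitchers_per_m2
nepenthes_annual = per_year(GFP_PER_M2_PER_HARVEST_UG, HARVESTS_PER_YEAR)

print(f"Drosera: {drosera_low:.1f} to {drosera_high:.1f} ug/m2/year")
print(f"Nepenthes: {nepenthes_annual:.0f} ug/m2/year from {pitchers_per_m2} pitchers/m2")
```

Under these assumptions the Drosera estimate lies between roughly 475 and 680 µg/m<sup>2</sup>/year depending on the harvest interval, which brackets the quoted value.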

# AUTHOR CONTRIBUTIONS

EN and FLB performed the molecular biology experiments and constructed the transgenic Drosera. SR and FLB performed the protein-related experiments. SM performed the transient expression experiments on Nepenthes plants. BM, EG, AH, and FRB supervised the research program. SM, FRB, and AH wrote the manuscript.

#### ACKNOWLEDGMENTS

The authors would like to thank the Region Lorraine and the FEDER for financial support through the BioProLor 2 research program and also Ms. Cindy Michel and Julie Genestier for their technical assistance.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2019.00793/full#supplementary-material

# REFERENCES



through simplified downstream processing. Biotechnol. J. 11, 910–919. doi: 10.1002/biot.201500371


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Miguel, Nisse, Biteau, Rottloff, Mignard, Gontier, Hehn and Bourgaud. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Substitution of Human Papillomavirus Type 16 L2 Neutralizing Epitopes Into L1 Surface Loops: The Effect on Virus-Like Particle Assembly and Immunogenicity

#### *Aleyo Chabeda<sup>1</sup>, Albertha R. van Zyl<sup>1</sup>, Edward P. Rybicki<sup>1,2</sup> and Inga I. Hitzeroth<sup>1</sup>\**

#### *Edited by:*

*Anneli Ritala, VTT Technical Research Centre of Finland Ltd, Finland*

#### *Reviewed by:*

*Bryce Chackerian, University of New Mexico, United States Ebenezer Tumban, Michigan Technological University, United States*

*\*Correspondence:* 

*Inga I. Hitzeroth inga.hitzeroth@uct.ac.za*

#### *Specialty section:*

*This article was submitted to Plant Metabolism and Chemodiversity, a section of the journal Frontiers in Plant Science*

*Received: 28 November 2018 Accepted: 28 May 2019 Published: 20 June 2019*

#### *Citation:*

*Chabeda A, van Zyl AR, Rybicki EP and Hitzeroth II (2019) Substitution of Human Papillomavirus Type 16 L2 Neutralizing Epitopes Into L1 Surface Loops: The Effect on Virus-Like Particle Assembly and Immunogenicity. Front. Plant Sci. 10:779. doi: 10.3389/fpls.2019.00779*

*<sup>1</sup>Biopharming Research Unit, Department of Molecular and Cell Biology, University of Cape Town, Cape Town, South Africa, <sup>2</sup>Institute of Infectious Disease and Molecular Medicine, University of Cape Town, Cape Town, South Africa*

Cervical cancer caused by infection with human papillomaviruses (HPVs) is the fourth most common cancer in women globally, with the burden mainly in developing countries due to limited healthcare resources. Current vaccines based on virus-like particles (VLPs) assembled from recombinant expression of the immunodominant L1 protein are highly effective in the prevention of cervical infection; however, these vaccines are expensive and type-specific. Therefore, there is a need for more broadly protective and affordable vaccines. The HPV-16 L2 peptide sequences 108-120, 65-81, 56-81, and 17-36 are highly conserved across several HPV types and have been shown to elicit cross-neutralizing antibodies. To increase L2 immunogenicity, L1:L2 chimeric VLP (cVLP) vaccine candidates were developed. The four L2 peptides mentioned above were substituted into the DE loop of HPV-16 L1 at position 131 (SAC) or in the C-terminal region at position 431 (SAE) to generate HPV-16-derived L1:L2 chimeras. All eight chimeras were transiently expressed in *Nicotiana benthamiana* via *Agrobacterium tumefaciens*-mediated DNA transfer. SAC chimeras predominantly assembled into higher order structures (*T* = 1 and *T* = 7 VLPs), whereas SAE chimeras assembled into capsomeres or formed aggregates. Four SAC chimeras and one SAE chimera were used in vaccination studies in mice, and their ability to generate cross-neutralizing antibodies was analyzed in HPV pseudovirion-based neutralization assays. Of the seven heterologous HPVs tested, cross-neutralization with antisera specific to chimeras was observed for HPV-11 (SAE 65-81), HPV-18 (SAC 108-120, SAC 65-81, SAC 56-81, SAE 65-81), and HPV-58 (SAC 108-120). Interestingly, only anti-SAE 65-81 antiserum showed neutralization of homologous HPV-16, suggesting that the position of the L2 epitope display is critical for maintaining L1-specific neutralizing epitopes.

Keywords: HPV-16, L1:L2 chimera, L2 substitution, epitope display, plant-produced, cross-neutralization

# INTRODUCTION

Approximately one in six global deaths is due to cancer, with the economic cost estimated at US\$1.2 trillion in 2010 (World Health Organisation, 2017). Cancer is the second leading cause of death (Abubakar et al., 2015) and it was estimated that human papillomavirus (HPV)-related cancers account for 5% of all human cancers (De Martel et al., 2012). Cervical cancer is the fourth most common cancer in women globally and results in an estimated 567,000 cases and 311,000 deaths every year (Bray et al., 2018). About 80% of these cases occur in developing countries, largely due to limited healthcare resources. Most HPV infections are cleared by the immune system (Goodman et al., 2008; Rosa et al., 2008); however, some benign cervical lesions progress to invasive cervical cancer (ICC) caused predominantly by high-risk HPVs (zur Hausen, 2002). High-risk HPV-16 and HPV-18 are the most common cause of ICC and are associated with 70% of cervical cancer cases (Smith et al., 2007; de Sanjose et al., 2010), but at least 13 other high-risk HPVs cause cancer (zur Hausen, 2002; Parkin and Bray, 2006).

HPVs are small non-enveloped double-stranded DNA viruses with a genome size of approximately 8 kb (de Villiers et al., 2004) and infect mucosal and cutaneous basal epithelial cells after tissue microtrauma (Kines et al., 2009). The capsid is arranged in a *T* = 7 icosahedral formation and consists of major and minor capsid proteins, L1 and L2, respectively (Conway and Meyers, 2009). Each capsid contains 360 copies of L1 assembled into 72 pentamers, and up to 72 copies of L2 can be incorporated into each capsid (Buck et al., 2005, 2008). L1 assembles into virus-like particles (VLPs) in the presence or absence of the L2 minor capsid protein. VLPs retain the immunological properties of native papillomaviruses (Kirnbauer et al., 1992; Hagensee et al., 1993; Casini et al., 2004) and produce high titers of neutralizing antibodies (nAbs) when used as a vaccine (Christensen et al., 1994; Roden et al., 2000).

Three prophylactic vaccines: Cervarix™, a bivalent HPV-16/18 VLP vaccine; Gardasil®, a quadrivalent HPV-6/11/16/18 VLP vaccine; and Gardasil®9, a nonavalent HPV-6/11/16/18/31/33/45/52/58 VLP vaccine, based on the immunodominant L1 major capsid protein are currently on the market and have been shown to be effective in preventing cervical disease (Naud et al., 2014; Huh et al., 2017); however, the global burden of cervical cancer remains high, particularly in low-resource countries due to vaccine cost, type specificity of the vaccines, and poor screening and treatment programs. Although the most recent Gardasil®9 vaccine should address the low cross-neutralization observed with original vaccines, the addition of more L1 VLP types has not decreased the cost of current vaccines. Hence, there is a need for next-generation HPV vaccines that broadly target oncogenic HPV types, at reduced cost to women particularly in developing countries suffering most from cervical cancer (Roden and Stern, 2018) and penile cancer in men (Cardona and García-Perdomo, 2018).

Next-generation vaccines using L2 peptides have been investigated to generate more cross-protective responses (Schellenbacher et al., 2017). Anti-L2 antibodies can neutralize a broad range of mucosal and cutaneous HPVs (Pastrana et al., 2005; Alphs et al., 2008), suggesting that an L2 vaccine could address the type-restrictive efficacy of L1 vaccines. The N-terminus of HPV-16 L2 has a highly conserved region from amino acids (aa) 1-120 (Lowe et al., 2008), and L2 peptides 108-120 (Kawana et al., 1999), 65-81 (Jagu et al., 2013), 56-81 (Kawana et al., 1998; Kondo et al., 2007, 2008; Slupetzky et al., 2007), and 17-36 (Gambhira et al., 2007; Kondo et al., 2007, 2008; Alphs et al., 2008; Schellenbacher et al., 2009) have been shown to elicit nAbs that cross-neutralize other HPV types and provide protection against passive challenge. However, L2 is immunologically subdominant to L1; therefore, scaffolded display of L2 peptides and the construction of chimeric proteins with L1 have been used to overcome this limitation. The structure and assembly of L1 has been well described (Chen et al., 2000b; Modis et al., 2002; Bishop et al., 2007) and L1 surface-exposed regions contain the conformational epitopes involved in the production of nAbs (Christensen et al., 1994, 1996; Roden et al., 1997; White et al., 1999). Several studies have shown that the insertion or substitution of various peptides into several L1 surface loops does not affect chimeric VLP (cVLP) assembly, with both anti-L1 and anti-L2 responses observed (Slupetzky et al., 2001, 2007; Sadeyen et al., 2003; Varsani et al., 2003; Schellenbacher et al., 2009, 2013; McGrath et al., 2013; Pineo et al., 2013; Chen et al., 2018).
The insertion of the HPV-16 L2 peptide aa 17-36 (RG-1) in the L1 DE surface loop has shown the most promise as a candidate cVLP vaccine as it has been shown to protect mice against challenge with high-risk mucosal pseudovirion (PsV) types HPV-16/18/45/31/33/52/58/35/39/51/59/68/56/73/26/53/66/34 and low-risk types HPV-6/43/44, with protection observed 1 year after vaccination (Schellenbacher et al., 2013). This candidate vaccine is currently under cGMP production and is expected to enter a phase I clinical trial soon (Buchman et al., 2016; Roden and Stern, 2018).

Plants provide a convenient protein production platform to potentially reduce the cost of vaccine production compared to traditional microbial fermentation or mammalian/insect cell expression systems. Their production is easily scalable, they are eukaryotes that contain the necessary machinery for mammal-like post-translational modification, and they have no risk of contamination by human pathogens (Biemelt et al., 2003; Fischer et al., 2004; Rybicki, 2010). HPV VLPs have been successfully produced in plants *via* transient expression (Varsani et al., 2006b; Maclean et al., 2007; Regnard et al., 2010; Matic et al., 2011; Pineo et al., 2013), and have been shown to be immunogenic and protective in animal models (Kohl et al., 2006). Furthermore, L1:L2 cVLPs (L2 substituted in the h4 helix of L1) previously produced in our group by Pineo et al. (2013) were shown to assemble into higher order structures, and elicit anti-L1 and anti-L2 antibody responses which neutralized HPV-16 and HPV-52 PsVs.

In this study, we report the purification of five plant-produced HPV-16 L1:L2 cVLPs with L2 substituted in the DE loop (SAC) or the C-terminal region of L1 between the h4 and β-J structural region (SAE), based on insect-cell produced chimeras described by Varsani et al. (2003). The effect of L2 peptide substitution on chimera assembly and presentation of L1 epitopes was analyzed, and the immunogenicity and the cross-neutralizing potential of the cVLPs investigated.

#### MATERIALS AND METHODS

#### Large-Scale Expression of L1:L2 Chimeras in *Nicotiana benthamiana*

The binary *Agrobacterium* vector pTRAkc-rbcs1-cTP was used to clone eight L1:L2 chimeric genes (**Figure 1**). Recombinant clones were transformed into *Agrobacterium tumefaciens* as described by Maclean et al. (2007). Successful transformation was confirmed by colony PCR, restriction enzyme digestion and sequencing. Starter cultures of recombinant *A. tumefaciens* pTRAkc-rbcs1-cTP SAC 108-120, SAC 65-81, SAC 56-81, SAC 17-36, SAE 65-81, hL1 (HPV-16 L1), and an empty vector (negative control) were grown at 28°C overnight in enriched Luria-Bertani broth (LBB) supplemented with 50 μg/ml carbenicillin, 30 μg/ml kanamycin, 50 μg/ml rifampicin, and 20 μM acetosyringone. The starter cultures were transferred to a larger flask (without rifampicin) and incubated overnight. The cultures were prepared for infiltration by dilution to OD600 0.5 in infiltration medium (10 mM MES, pH 5.6, 10 mM MgCl2, 100 μM acetosyringone). *N. benthamiana* plants (5–6 weeks old) were infiltrated with recombinant *Agrobacterium* suspensions by applying a vacuum (100 kPa) and grown for 5 days at 22°C under a 16 h/8 h light/dark cycle.
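The dilution of an overnight culture to OD600 0.5 can be sketched as a simple C1V1 = C2V2 calculation. The snippet below is illustrative only: the function name, the overnight OD600 of 2.0 and the 1,000-ml final volume are hypothetical values, not taken from this study, and the calculation assumes that OD600 scales linearly with cell density over the range used.

```python
from typing import Tuple


def dilution_volumes(od_culture: float, od_target: float,
                     final_volume_ml: float) -> Tuple[float, float]:
    """Return (culture_ml, medium_ml) needed to reach od_target.

    Assumes OD600 is proportional to cell density (C1 * V1 = C2 * V2).
    """
    if od_culture <= 0 or od_target <= 0 or final_volume_ml <= 0:
        raise ValueError("OD values and volume must be positive")
    if od_target > od_culture:
        raise ValueError("culture is already more dilute than the target")
    culture_ml = final_volume_ml * od_target / od_culture
    medium_ml = final_volume_ml - culture_ml
    return culture_ml, medium_ml


# Hypothetical example: overnight culture at OD600 2.0, 1,000 ml needed at OD600 0.5
culture_ml, medium_ml = dilution_volumes(od_culture=2.0, od_target=0.5,
                                         final_volume_ml=1000.0)
print(f"{culture_ml:.0f} ml culture + {medium_ml:.0f} ml infiltration medium")
```

In practice the overnight OD600 would be measured on a diluted aliquot so that the reading stays within the linear range of the spectrophotometer.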

#### Purification of Vaccine Antigens

Whole leaves were harvested and thoroughly homogenized with a Waring-type blender in cold high-salt, low-pH extraction buffer at a w/v ratio of 1:1, supplemented with 1x Complete Mini EDTA-free protease inhibitor cocktail (Roche, Basel, Switzerland). Homogenates were incubated at 4°C with shaking for 1.5 h, filtered through four layers of Miracloth™ (Merck, Darmstadt, Germany), and clarified 2x at 10000 x *g* for 10 min at 4°C. The clarified extract was loaded onto discontinuous Optiprep™ (Sigma Aldrich, St Louis, MO) gradients (27, 33, 39 and 46%) and centrifuged for 3.5 h at 174500 x *g*, 15°C, in a SW 32 Ti rotor (Beckman, Brea, CA), after which 1-ml fractions were collected from the bottom of the tubes. Fractions 1–4 were pooled, added to a 5-ml ultracentrifuge tube (Ultra-Clear Thinwall TUBE, Beckman, Brea, CA) and centrifuged for 1 h at 183548 x *g* at 15°C, in a SW 55 Ti rotor (Beckman, Brea, CA). The opaque band visible after centrifugation was collected using a needle and syringe and quantified by indirect enzyme-linked immunosorbent assay (ELISA). Total L1 protein yields of the vaccine antigens were determined using Camvir-1 mAb (McLean et al., 1990): SAC 108-120 (145 mg/kg), SAC 65-81 (7.8 mg/kg), SAC 56-81 (43 mg/kg), SAC 17-36 (29 mg/kg), SAE 65-81 (1.2 mg/kg), and hL1 (142 mg/kg).

#### Transmission Electron Microscopy of Purified Chimeric Virus-Like Particles

Carbon-coated copper grids (mesh size 200) were placed on a 20-μl drop of sample for 3 min and washed 5x in double distilled water. The samples were negatively stained for 1 min with 2% w/v uranyl acetate and viewed using a FEI Tecnai 20 equipped with a LaB6 emitter.

**FIGURE 1** | [...] positions 131 in the DE loop (SAC) or 431 between the h4 and β-J structural region (SAE), to generate L1:L2 chimeras. Not drawn to scale.

#### Quantitation of Purified Chimeric Virus-Like Particles by Indirect Enzyme-Linked Immunosorbent Assay

The five L1:L2 chimeras and hL1 positive control were quantified by indirect ELISA. Ninety-six well plates (Nunc Maxisorp, ThermoFisher Scientific, Waltham, MA) were coated with: (a) 80 ng of purified HPV-16 L1 VLPs (100 μl/well) serially diluted 2-fold in coating buffer (10 mM Tris, pH 8.5) to generate a standard curve or (b) 100 μl of vaccine antigen serially diluted 2-fold from 1:50 to 1:400 in coating buffer, and incubated overnight at 4°C with gentle shaking. Plates were blocked with 300 μl of blocking buffer [1x Tris-buffered saline (TBS), pH 7.5, 5% non-fat dried milk] for 2 h at room temperature, after which they were washed 4x with 1x TST (1x TBS, 0.05% Tween 20) wash buffer. A volume of 100 μl of Camvir-1 (1:15000) primary antibody was added to each well and the plates were incubated at 37°C for 1 h. The plates were washed 4x with 1x TST, followed by the addition of 100 μl of alkaline phosphatase-conjugated anti-mouse IgG secondary antibody (1:10000) (Sigma Aldrich, St Louis, MO) to each well and incubation at 37°C for 1 h. For the final washes, plates were washed with 1x TBS (pH 9), after which 200 μl of SIGMAFAST™ p-nitrophenyl phosphate (Sigma Aldrich, St Louis, MO) substrate was added to each well and incubated in the dark for 30 min. The absorbance was read at 405 nm using a Bio-Tek Powerwave XS spectrophotometer. Total L1 yield of each vaccine antigen was calculated from the average absorbance values using the equation of the standard curve. The negative control was quantified by total soluble protein (TSP) using the Bio-Rad DC Protein Assay (Bio-Rad, Irvine, CA).
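The standard-curve calculation described above can be sketched as follows. This is a minimal illustration rather than the authors' analysis: the text does not state which curve model was used, so a straight-line least-squares fit is assumed here, and all absorbance values, function names and the 1:100 sample dilution are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def l1_ng_in_undiluted_well(mean_a405, slope, intercept, dilution_factor):
    """Interpolate ng of L1 from the curve, then correct for sample dilution."""
    ng_in_well = (mean_a405 - intercept) / slope
    return ng_in_well * dilution_factor


# Hypothetical standard: 80 ng of L1 VLPs diluted 2-fold, with made-up A405 readings
standard_ng = [80.0, 40.0, 20.0, 10.0, 5.0]
standard_a405 = [1.65, 0.85, 0.45, 0.25, 0.15]
slope, intercept = fit_line(standard_ng, standard_a405)

# Hypothetical sample: mean A405 of 0.65 for antigen diluted 1:100
l1_ng = l1_ng_in_undiluted_well(0.65, slope, intercept, dilution_factor=100)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, L1 = {l1_ng:.0f} ng per 100 μl")
```

Only sample dilutions whose absorbance falls inside the range covered by the standards should be interpolated. If the curve is visibly sigmoidal, a four-parameter logistic fit would be the more usual choice than the straight line assumed here.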

#### Characterization of Chimeric Virus-Like Particle Epitope Display by Indirect Enzyme-Linked Immunosorbent Assay

One hundred nanograms (SAC 108-120, SAC 65-81, SAC 17-36 and hL1) or 50 ng (SAC and SAE 65-81) of native cVLPs or hL1 VLPs prepared in 100 μl of coating buffer were coated onto 96-well plates (Nunc Maxisorp, ThermoFisher Scientific, Waltham, MA) and incubated overnight at 4°C with gentle shaking. For denaturing conditions, cVLPs or hL1 VLPs were dried onto the 96-well plates without a lid in 0.2 M NaHCO3 (pH 10.6) + 0.01 M freshly added dithiothreitol (DTT) buffer overnight at 37°C. The next day, plates were blocked with 300 μl blocking buffer for 2 h at room temperature, followed by 4x washes with 1x TST wash buffer. Five-fold serial dilutions of antibodies (1:200–1:125000) in blocking buffer were added to the wells in triplicate (100 μl/well) and incubated at 37°C for 1 h. Antibodies used were neutralizing monoclonal antibodies (mAbs) H16.V5, H16.E70, H16.U4, H16.9A, H16.J4 (Christensen et al., 1996), and L2 4B4 [gift from Dr. Neil Christensen (Dept Pathology, Penn State, PA)], linear non-neutralizing commercial mAb Camvir-1, or rabbit serum raised against HPV-16 L2. Plates were then washed again and 100 μl of alkaline phosphatase-conjugated goat anti-mouse (1:10000) or goat anti-rabbit (1:5000) secondary antibody was added, followed by incubation at 37°C for 1 h. Final washes and detection were performed as described above.
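The serial dilution schemes used in the ELISAs can be enumerated with a short helper. The function name is hypothetical; the start, fold and end values are those stated in the methods (five-fold from 1:200 to 1:125000 for the mAb panel, and two-fold from 1:50 to 1:400 for the vaccine antigens).

```python
def serial_dilutions(start, fold, stop):
    """List reciprocal dilutions from 1:start to 1:stop in constant fold steps."""
    if start <= 0 or fold <= 1 or stop < start:
        raise ValueError("need start > 0, fold > 1 and stop >= start")
    dilutions = []
    current = start
    while current <= stop:
        dilutions.append(current)
        current *= fold
    return dilutions


mab_series = serial_dilutions(200, 5, 125000)    # [200, 1000, 5000, 25000, 125000]
antigen_series = serial_dilutions(50, 2, 400)    # [50, 100, 200, 400]
print(mab_series)
print(antigen_series)
```

The five-fold antibody series therefore spans five dilution points, each run in triplicate wells.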

#### Immunization of Mice

Animal use and care was approved by the Faculty of Health Sciences Animal Ethics Committee, University of Cape Town (FHS AEC ref.: 014/024). Forty female Balb/c mice (five mice per group) were immunized subcutaneously by injection in the left or right flank with the five plant-derived candidate cVLP vaccines: SAC 108-120, 65-81, 56-81, 17-36, and SAE 65-81, hL1 (positive HPV-16 L1 VLP control), and two negative controls (plant extract from *A. tumefaciens*-infected plants containing empty vector and PBS) (**Table 1**). Pre-bleed (PB) sera were collected 3 days prior to vaccination on Day 0. Mice were immunized on Day 0 and boosted with the same doses on Day 14 and Day 28, and a test bleed collected on Day 42 to ascertain if an additional boost was required. An additional boost was administered on Day 45 and final bleed (FB) sera were collected by cardiac puncture on Day 59.

#### Indirect Enzyme-Linked Immunosorbent Assay Detection of Anti-L1 Antibodies in Mouse Sera

Ninety-six-well Maxisorp plates were coated with 100 ng of purified HPV-16 L1 protein per well and incubated overnight at 4°C. Indirect ELISAs were performed as described above. FB sera were serially diluted 3-fold from 1:50 to 1:1350. All anti-L1 titers are stated as the reciprocal of the maximum dilution with higher absorbance readings than the corresponding PB serum at 1:50.
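The end-point titer definition above can be written out as a small function. This is a sketch under stated assumptions: the function name and the absorbance readings are hypothetical, and the cut-off is taken literally from the text as the absorbance of the matching pre-bleed serum at 1:50.

```python
def endpoint_titer(fb_a405_by_dilution, pb_a405_at_1_50):
    """Reciprocal of the highest dilution whose final-bleed absorbance
    exceeds the pre-bleed absorbance at 1:50; None if no dilution does."""
    positive = [dilution for dilution, a405 in fb_a405_by_dilution.items()
                if a405 > pb_a405_at_1_50]
    return max(positive) if positive else None


# Hypothetical final-bleed readings for the 3-fold series 1:50 to 1:1350
fb_readings = {50: 1.20, 150: 0.60, 450: 0.22, 1350: 0.09}
titer = endpoint_titer(fb_readings, pb_a405_at_1_50=0.10)
print(titer)  # 450 for these made-up readings
```

With this definition, a serum that is still above the pre-bleed value at the last dilution tested is reported at that dilution, so the reported titer is a lower bound in that case.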

#### Western Blot Detection of Anti-L2 Antibodies in Mouse Sera

HPV-16 L2 was expressed in *Escherichia coli* DH5-α using pProEx™-HTb (ThermoFisher Scientific, Waltham, MA). The recombinant *E. coli* culture was inoculated in 10 ml of LB, supplemented with 100 μg/ml ampicillin and incubated for 16 h at 37°C with agitation, after which it was used to inoculate 500 ml of LB medium. The culture was incubated with agitation at 37°C until it reached an A590 of 0.5–1.0, induced by the addition of 0.6 mM isopropylthio-β-D-galactoside (IPTG), and incubated at 37°C for a further 2 h. The cells were harvested by centrifugation at 10000 x *g* for 10 min. Inclusion bodies were purified from the *E. coli* cell pellet using Bugbuster® (Novagen, USA) according to the manufacturer's instructions. Ten microliters were loaded into the wells of 10% SDS-PAGE gels, transferred to nitrocellulose membrane by semi-dry electroblotting, and strips probed with pooled sera from each vaccine group at 1:2000. Mouse anti-His mAb (Bio-Rad, Irvine, CA) was used as positive control antibody at 1:2000. Strips were then probed with alkaline phosphatase-conjugated anti-mouse IgG secondary antibody (1:10000).

#### Standard L1 Pseudovirion-Based Neutralization Assay

HPV PsV production, purification, and neutralization were performed as described by Buck et al. (2005) and with a few changes as described by Pineo et al. (2013). PsVs of eight different HPV types: HPV-6, 11, 16, 18, 31, 45, 52, and 58, were produced. Sera that neutralized PsVs by at least 50% were then further titrated to determine end-point titers. Neutralization titers are stated as the reciprocal of the maximum serum dilution which reduced secreted alkaline phosphatase (SEAP) activity by >50% in comparison to the PsV only control sample, which was not treated with serum/mAb.
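The neutralization titer definition above can be expressed as a short calculation. The sketch below is illustrative only: the function names and SEAP readings are hypothetical, background subtraction is omitted, and the >50% threshold is the one stated in the text.

```python
def percent_reduction(seap_sample, seap_psv_only):
    """Percentage reduction in SEAP signal relative to the PsV-only control."""
    return 100.0 * (1.0 - seap_sample / seap_psv_only)


def neutralization_titer(seap_by_dilution, seap_psv_only, threshold=50.0):
    """Reciprocal of the highest serum dilution that reduces SEAP activity
    by more than `threshold` percent; None corresponds to no neutralization."""
    neutralizing = [dilution for dilution, seap in seap_by_dilution.items()
                    if percent_reduction(seap, seap_psv_only) > threshold]
    return max(neutralizing) if neutralizing else None


# Hypothetical SEAP readings (arbitrary units) for one serum pool
psv_only_control = 1000.0
serum_readings = {50: 100.0, 200: 400.0, 800: 700.0}  # 90%, 60%, 30% reduction
nab_titer = neutralization_titer(serum_readings, psv_only_control)
print(nab_titer)  # 200 for these made-up readings
```

A return value of `None` corresponds to the "NN" (no neutralization) entries reported for sera that did not reach the threshold at the lowest dilution tested.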

#### RESULTS

#### Chimeric Virus-Like Particle Purification by Isopycnic Centrifugation

HPV-16 L2 peptides, 108-120, 65-81, 56-81, and 17-36 were substituted into HPV-16 L1 DE loop from position 131 to generate four SAC chimeras or into the C-terminal between the h4 and β-J structural region from position 431 to generate four SAE chimeras (**Figure 1**). Only chimeras that were shown to form higher order structures were selected for further study, namely: SAC 108-120, SAC 65-81, SAC 56-81, SAC 17-36, and SAE 65-81. HPV-16 L1:L2 cVLPs, HPV-16 hL1 VLPs, and an empty vector control were extracted and purified in a high-salt, low-pH buffer on discontinuous Optiprep™ density gradients. Purified cVLPs were visualized by TEM to determine their structural integrity prior to vaccination (**Figure 2**). Chimeras of SAC 108-120, SAC 65-81, SAC 56-81, and SAC 17-36 (**Figures 2A**–**D**, respectively) showed cVLPs that ranged in size from 50 to 60 nm (white arrows), small cVLPs (25–40 nm, blue arrows), and capsomeres (10 nm, grey arrows), with SAC 108-120 cVLPs (**Figure 2A**) being the most similar to purified HPV16-hL1 VLPs (**Figure 2F**). SAE 65-81 (**Figure 2E**) showed few cVLPs with mostly aggregates present. HPV-16 hL1 (**Figure 2F**) assembled into particles measuring 50–60 nm in size, with a few small VLPs present. cVLPs were comparable to other chimeras and VLPs purified previously in our group by density centrifugation (Varsani et al., 2003; Maclean et al., 2007), heparin chromatography (Pineo et al., 2013), and cation exchange chromatography (McGrath et al., 2013).

#### L1 and L2 Epitope Display on the Chimeric Virus-Like Particle Scaffold

The antigenicity of cVLPs and their ability to present the substituted L2 peptides were analyzed by indirect ELISA using a panel of mAbs and anti-L2 polyclonal serum. Under native conditions (**Figure 3**), conformationally dependent neutralizing mAbs H16.V5 and H16.E70 did not bind the L1:L2 cVLPs (**Figures 3A,B**), indicating the disruption or steric hindrance of the V5 and E70 neutralizing epitopes by substitution of the L2 peptides. The anti-L2 polyclonal serum reacted with all cVLPs in native form, indicating the L2 peptides were displayed on the particle surface, with strongest binding observed for SAC 108-120 (**Figure 3D**). Additionally, the mAb L2-4B4, which recognizes the L2 peptide 108-120, bound strongly to SAC 108-120 cVLPs (**Figure 3H**). Under denaturing conditions, binding by anti-L2 polyclonal sera and L2-4B4 mAb was slightly diminished (**Supplementary Figures S1D,H**). Binding was also seen for the non-neutralizing mAb Camvir-1, which recognizes a linear epitope at L1 aa 204-210 (**Figure 3C**). As expected, native hL1 VLPs were bound by H16.V5 and H16.E70 (**Figures 3A,B**), but binding was diminished under denaturing conditions (**Supplementary Figures S1A**,**B**).

To further characterize whether other L1 neutralizing epitopes were displayed on the cVLPs, an additional panel of mAbs was tested. Neutralizing mAbs H16.9A (conformation specific) and H16.J4 (binds linear epitope between aa 261 and 280) bound to all native SAC cVLPs (**Figures 3F,G**), with the highest affinity for SAC 108-120. H16.U4, however, showed decreased binding for SAC 108-120, SAC 56-81, and SAC 17-36 cVLPs (**Figure 3E**). No binding of these mAbs to native SAE 65-81 cVLPs (**Figures 3E**–**G**) was observed, and this may be due to the poor assembly of cVLPs as observed by TEM (**Figure 2E**) and/or disruption or steric hindrance as mentioned above. Under denaturing conditions, mAbs H16.U4 and H16.9A did not bind any of the cVLPs or hL1 VLPs (**Supplementary Figures S1E,F**, respectively). Only Camvir-1 and H16.J4 mAbs showed binding affinity for denatured cVLPs (**Supplementary Figures S1C,G**, respectively).

#### Determination of Anti-L1 Titers

Mice were subcutaneously injected with plant-purified antigens and boosted three times at 2-week intervals. Due to the yields obtained from cVLP purification, it was not possible to vaccinate mice with the desired dose of 5 μg; therefore, with the exception of SAC 108-120 and hL1 VLPs, all other chimeras were used at the maximum dose possible (**Table 1**). It has previously been shown that vaccine doses of 8 ng and 1 μg elicited high anti-HPV-16 L1 IgG titers and nAbs (Kim et al., 2012). Sera from individual mice were pooled and anti-L1 antibody titers determined by indirect ELISA (**Figure 4A**). No anti-L1 response was detected for PB sera of any group or for the FB serum of the PBS negative control. A titer of 50 was observed for empty vector FB serum, which may be due to co-purification of plant proteins. SAC 108-120 and SAC 17-36 had anti-L1 titers of 1,350, whereas SAC 65-81 and SAC 56-81 sera had titers of 150. The lowest titer of 50 was observed for SAE 65-81, similar to the titer of the empty vector. The highest anti-L1 titers of 6,400 were obtained for positive control hL1 sera (**Figure 4B**).

#### Anti-L2 Humoral Responses

PB and FB sera for all individual mice in the different vaccine groups were pooled and analyzed for the presence of anti-L2 antibodies by western blot using *E. coli-*purified L2 as antigen (**Supplementary Figure S2**). Of the chimera vaccine groups, an expected band of ~80 kDa (black arrow) was detected only for SAC 108-120 FB serum and a very faint band for SAC 17-36 serum. These bands were similar in size to that of the L2 positive control. No bands were observed for the negative controls (empty vector, PBS and hL1) (**Supplementary Figure S2**).

#### L1 Pseudovirion-Based Neutralization Assay

Purified PsVs of HPV types 6/11/16/18/31/45/52 and 58 were used in L1-PBNAs to test the ability of sera obtained from vaccinated mice to neutralize PsVs. These HPV types were chosen based on the HPVs the L2 epitopes are known to cross-neutralize. Pooled mouse sera were initially tested for neutralization at dilutions of 1:50 and 1:200 prior to titration (data not shown), and only sera that showed at least 50% neutralization of PsVs were titrated further. PB sera were only tested at a 1:50 dilution due to limited blood volume, and no neutralization was observed with PB sera (**Table 2**). The neutralization titers for all sera tested (**Table 2**) were low, except for hL1 neutralization of HPV-16 PsVs, with a titer ≥6,400. This was expected as no structural modifications were made to L1 VLPs in comparison to the chimeras tested. SAE 65-81 serum neutralized HPV-11 PsVs at a titer of 50. HPV-18 PsVs were neutralized with sera from SAC 108-120, SAC 65-81, and SAC 56-81 at titers of 200, and with SAE 65-81 serum at a titer of 50. HPV-58 PsVs were neutralized by SAC 65-81 sera at a titer of 50.

**FIGURE 3** | [...] H16.J4 (G), non-neutralizing mAb Camvir-1 (C), mAb to L2 peptide 108-120 (L2-4B4) (H), and polyclonal anti-L2 serum (D). Error bars indicate standard deviation.

**FIGURE 4** | [...] coated with purified L1 antigen and a titration of final bleed antisera performed to determine end-point titers. Serum titers are reported for OD values greater than the mean OD of pre-bleed sera. Error bars indicate standard deviation.

*Assays were performed in duplicate due to limited blood volume. NN – No neutralization.*

#### DISCUSSION

L1 VLPs have successfully been expressed and purified from plants (Biemelt et al., 2003; Maclean et al., 2007; Fernandez-San et al., 2008). L1:L2 chimeras ranging in size from 50 to 60 nm have successfully been produced in insect cells and purified by ultracentrifugation on sucrose-PBS and CsCl-PBS density gradients (Varsani et al., 2003; Schellenbacher et al., 2009, 2013; Huber et al., 2015, 2017). In this study of plant-produced cVLPs, differences in cVLP assembly were observed (**Figure 2**). These differences may be attributed to the length and amino acid sequence of the L2 peptide used. Cys residues 175 and 428 have been shown to be critical for the formation of disulfide bonds between capsomeres during VLP assembly, and mutation of these residues prevents the formation of VLPs (Li et al., 1998; McCarthy et al., 1998; Sapp et al., 1998; Fligge et al., 2001; Varsani et al., 2006a). Although Cys175 and Cys428 are not lost due to L2 peptide substitution, the rate of formation of disulfide bonds between neighboring L1 capsomeres may have been decreased due to a slow thiol-disulfide interchange rate (Nagy, 2013). Of the four L2 peptide substitutions made, L2 108-120 is the shortest epitope, suggesting that longer epitopes may be detrimental to complete particle assembly and thus epitope display, no matter where they are substituted. This was also observed by Pineo et al. (2013) and McGrath et al. (2013), who investigated the production of cVLPs in plant and insect cell systems, respectively. In these studies, L2 peptides were substituted in the h4 helix from position 414 in the C-terminal region and, although L2 108-120 substitution did not eliminate the Cys428 residue involved in disulfide crosslinking with Cys175, substitution of L2 56-81 and 17-36 did, resulting in the formation of capsomeres and aggregates. In contrast, Matic et al.
(2011) produced plant-made HPV-16 L1 chimeras containing influenza virus type A M2e epitopes in the h4 helix and between the h4 and β-J region, and found that a longer epitope (23 residues) was better than a shorter epitope (eight residues) in the display of the M2e epitope. Additionally, the amino acid sequence composition can affect interaction with other residues (due to charge and size) and therefore folding. Specifically, the addition of two Cys residues in the L2 epitope 17-36 may form disulfide bonds with Cys175 or Cys428 in the L1 backbone, accounting for the particles observed in **Figure 2**. These data suggest that protein modeling of the interactions of substituted epitopes of varying length and sequence with L1 residues requires further investigation.

VLPs have been shown to be strongly immunogenic due to the repetitive display of epitopes on their surfaces, their interaction with antigen-presenting cells (APCs), and their activation of B cells (Chackerian et al., 2008; Schiller and Lowy, 2018). Several L2 aa epitopes have been inserted by others into L1 loops for the generation of cVLPs. Insertion of L2 aa 17-36, 28-31, 35-75, 69-81, 108-120, and 115-15 into BPV-1 L1 DE loop at position 133/134, elicited anti-L1 and anti-L2 responses in mice (Slupetzky et al., 2007; Schellenbacher et al., 2009). Additionally, substitution (from position 131) of L2 aa 108-120 into the HPV-16 DE loop of insect-cell produced chimeras (Varsani et al., 2003), or L2 aa 108-120, 56-81, and 17-36 in the h4 helix (from position 414) of plant-produced (Pineo et al., 2013) or insect cell-produced (McGrath et al., 2013) chimeras, also elicited anti-L1 and anti-L2 responses in mice. The ability of antibodies obtained from sera to neutralize PsVs was investigated in PBNAs. Neutralization of homologous HPV-16 PsVs was only observed with antisera of SAE 65-81 at a titer of 50 and hL1 VLPs at titer ≥6,400 (**Table 2**). The titer observed for hL1 was similar to nAb titers obtained in other studies testing plant-produced chimeras or VLPs: 500–5,000 (Pineo et al., 2013); 6,400 (Maclean et al., 2007); 400 (Paz De la et al., 2009). The low titer obtained by SAE 65–81, in addition to the low vaccination dose, may be due to partially formed cVLPs (**Figure 2E**) and the presentation of L2 on the capsid. Position 431 is located in the C-terminal arm of L1 and is not directly involved in the correct folding of VLPs, but it is close to the h4 helix region where residues 414–426 play a role in VLP assembly (Varsani et al., 2003; Bishop et al., 2007). Steric hindrance due to the substitution of residues with different charges may therefore affect correct folding. 
No detectable nAbs to HPV-16 were elicited by SAC 108-120, SAC 65-81, SAC 56-81, and SAC 17-36 despite forming cVLPs, suggesting that the antigen binding seen in indirect ELISA and L2 western blots (**Figure 4, Supplementary Figure S2**) was mediated by non-neutralizing antibodies. Cross-neutralization of heterologous HPV PsV types was observed with neutralization of HPV-18 by SAC 108-120, SAC 65-81, SAC 56-81, and SAE 65-81 antisera. HPV-58 was cross-neutralized by SAC 108-120, and HPV-11 by SAE 65-81 antisera. All neutralizing titers observed were between 50 and 200 (**Table 2**). Unexpectedly, SAC 17-36 did not show any cross-neutralization despite this epitope being shown by Schellenbacher et al. (2013) to elicit robust anti-L2 antibodies and cross-neutralize up to 16 high-risk HPVs. Moreover, in L2-specific PBNAs [performed as described by Day et al. (2012)], no nAb titers were observed for any of the antisera tested. L2 PBNAs previously performed by our group with sera from plant-produced HPV-16 L1:L2 chimeras [L2 substituted in the h4 helix (Pineo et al., 2013)] showed low cross-neutralization, with nAb titers of 50 for HPV-11 (L1:L2 56-81) and HPV-18 (L1:L2 17-36), but no neutralization of homologous HPV-16 PsVs (unpublished data). Surprisingly, no L2 nAb titers were detected for L1:L2 108-120, even though it was found to be the best candidate vaccine as it elicited nAbs to HPV-16 and HPV-52 in L1 PBNAs (Pineo et al., 2013). These same chimeras produced in insect cells (McGrath et al., 2013) elicited sera that showed neutralization of HPV-16/18/31/52 in L2 PBNAs (unpublished data), suggesting that plant-produced chimeras may not assemble as efficiently and thus not display L2 epitopes as well. This is plausible, as it has been suggested that VLP assembly is sensitive to cell type (Li et al., 1997).
Overall, these data show that presentation of L2 epitopes on the plant-made L1 chimera surface was not sufficient to produce potent anti-L2 nAbs that are protective against multiple oncogenic HPV types in neutralization assays.

There are several possible explanations for why the anti-L1 responses and the L1 and L2 nAb titers were lower than previously reported, or not observed at all. The display of L2 epitopes in L1 loops should preserve the L1 epitopes critical for binding by mAbs. The mAb H16.V5 binding site is a major immunodominant epitope used for the assessment of integrity and antigenicity of VLPs. It has been shown to block the binding of >70% human sera (Roden et al., 1997; Wang et al., 2003) and recognizes aa 266-297 in the FG loop and aa 339-365 in the HI loop (Christensen et al., 2001). mAbs H16.V5 and H16.E70 have been extensively mapped and residues Phe50, Ala266, and Ser282 of L1 are vital for binding and the generation of potent nAbs (White et al., 1999). The residues of the DE loop (aa 110-149) are not predicted to have any impact on the Phe50, Ala266, and Ser282 residues, suggesting that substitution of L2 epitopes in this region should not affect H16.V5 epitope display. However, Lee et al. (2015) have recently shown that the BC (aa 181 and 184) and DE (aa 138-141) loops contribute to binding by H16.V5, with a few contact residues in the EF and HI loops. Furthermore, Bissett et al. (2016) showed that L1 epitopes necessary for the generation of cross-neutralizing antibodies are present in the DE and FG loops. The type-specific nature of L1 nAbs is due to the variation found within the L1 surface loops of different HPV genotypes (Chen et al., 2000a; Carter et al., 2003). The exposed surface loops, e.g., BC and EF, show more sequence heterogeneity than the core loops, e.g., DE, as seen from analysis of intra- and inter-genotype amino acid variability (Bissett et al., 2014). This variability is thought to be a mechanism by which the virus can avoid nAbs. Tyr135 and Val141 have also been shown to be critical for binding by mAb 26D1 (Xia et al., 2016), further supporting the importance of the DE loop as a cross-neutralizing epitope.
Substitution at position 131 thereby replaced regions of L1 in the DE loop that have been shown to be critical epitopes for binding by mAbs and cross-neutralizing antibodies. Huber et al. (2017) have recently shown that sequence replacement of HPV-5 L1 (aa 132-145) with HPV-17 RG1 (L2 aa 14-33) resulted in low type-specific neutralizing titers to HPV-5 and antiserum was not protective against PsV challenge *in vivo*. The authors postulated that replacement of the DE loop resulted in steric hindrance of the major HPV-5 L1 neutralization epitope(s). The four SAC chimeras (108-120, 65-81, 56-81, and 17-36) assembled into cVLPs, but showed low anti-L1 titers and low nAb titers in PBNAs, potentially as a result of the disruption of the abovementioned residues in the DE loop. Additionally, although SAE 65-81 was the only candidate vaccine that neutralized homologous HPV-16, due to the disulfide bonds between Cys175 and Cys428, residues 433-443 are less accessible (Varsani et al., 2003) and therefore presentation of the L2 peptides may not have been efficient to elicit anti-L2 antibodies.

In summary, all chimeric vaccine candidates were immunogenic. Only sera from cVLPs SAC 108-120, SAC 65-81, SAC 56-81 neutralized HPV-18 PsVs at 1:200 dilutions, with PsVs of HPV-11, HPV-16, and HPV-58 neutralized at very low dilutions in L1 PBNAs. Unexpectedly, antisera did not neutralize PsVs in L2 PBNAs, despite L2 being displayed on the L1 capsid. It is important to consider L1 neutralizing epitopes when determining the display position of L2 peptides. Although L2 substitutions did not seem to drastically affect the assembly of cVLPs, misassembled or disrupted VLPs expose epitopes with limited HPV type specificity (Christensen et al., 1994, 1996). This study highlights the importance of particle assembly, peptide presentation, and yield – factors that need to be further investigated to achieve success with next-generation HPV vaccines that elicit potent anti-L1 and anti-L2 nAbs against oncogenic HPV types.

#### ETHICS STATEMENT

This study was carried out in accordance with the recommendation of the Faculty of Health Sciences Animal Ethics Committee of the University of Cape Town. The protocol was approved by the Faculty of Health Sciences Animal Ethics Committee, University of Cape Town (FHS AEC ref.: 014/024).

#### AUTHOR CONTRIBUTIONS

AC performed all experiments and wrote the manuscript. AZ supervised the project and assisted in purification experiments. ER and IH designed and conceptualized the study.

#### FUNDING

We thank the Cancer Association of South Africa (CANSA), the South African Medical Research Council (SA-MRC), and the Carnegie Corporation of New York for funding this work. AC was supported by the Carnegie Corporation of New York, the Poliomyelitis Research Foundation (PRF) Grant no. 12/37, and the University of Cape Town.

#### ACKNOWLEDGMENTS

The authors would like to thank Dr. Innocent Shuro and Mr. Mohammed Jaffer (Center for Imaging Analysis, University of Cape Town) for their technical assistance; Mr. Rodney Lucas and Ms. Inge Botes for the animal work (Research Animal Facility, University of Cape Town); Dr. Neil Christensen (Department of Pathology, Penn State, PA) and Prof Martin Müller (DKFZ, Heidelberg, Germany), for providing monoclonal antibodies; Prof Rainer Fischer (Fraunhofer Institute, Aachen, Germany) for the pTRAkc-rbcs1-cTP vector; Dr. John Schiller (Laboratory of Cellular Oncology, National Cancer Institute, Bethesda, MD) for supplying the HEK293TT cells and the plasmids used for the HPV PBNAs; and Dr. Hanna Seitz (Laboratory of Cellular Oncology, National Cancer Institute, Bethesda, MD) and Dr. Megan Hendrikse (Biopharming Research Unit, University of Cape Town) for training and performing the L2-PBNA, respectively.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2019.00779/full#supplementary-material

SUPPLEMENTARY FIGURE 1 | Characterization of cVLP epitope display by indirect ELISA. Binding of monoclonal and polyclonal antibodies to HPV-16 L1:L2 cVLPs and HPV-16 L1 VLPs under denaturing conditions were analyzed in triplicate using conformational neutralizing mAbs H16.V5 (A), H16.E70 (B), H16.U4 (E), H16.9A (F), linear neutralizing mAb H16.J4 (G), non-neutralizing mAb Camvir-1 (C), mAb to L2 peptide 108-120 (L2-4B4) (H), and polyclonal anti-L2 serum (D). Error bars indicate standard deviation.

SUPPLEMENTARY FIGURE 2 | Anti-L2 western blots using pooled mouse sera. HPV-16 L2 protein was probed with pre-bleed (PB) or final bleed (FB) sera at 1:2000 from the 8 vaccine groups. A band at 80 kDa is expected for L2 protein. Labels: M, molecular weight marker (kDa); +, L2 positive control detected with anti-His mAb (1:2000).

#### REFERENCES

of a nine-valent human papillomavirus vaccine in women aged 16–26 years: a randomised, double-blind trial. *Lancet* 390, 2143–2159. doi: 10.1016/S0140-6736(17)31821-4


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Chabeda, van Zyl, Rybicki and Hitzeroth. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

#### *Edited by:*

*Heiko Rischer, VTT Technical Research Centre of Finland Ltd, Finland*

#### *Reviewed by:*

*Mario Sergio Palma, Universidade Estadual Paulista Júlio de Mesquita Filho Rio Claro, Brazil Peng Zhang, Shanghai Institutes for Biological Sciences (CAS), China Nobuyuki Matoba, University of Louisville, United States*

*\*Correspondence:* 

*Somen Nandi snandi@ucdavis.edu*

*† These authors have contributed equally to this work*

*‡ Present address:*

*Vally Kommineni, Catalent Pharma Solutions, Kansas City, MO, United States*

#### *Specialty section:*

*This article was submitted to Plant Metabolism and Chemodiversity, a section of the journal Frontiers in Plant Science*

*Received: 11 February 2019 Accepted: 27 May 2019 Published: 28 June 2019*

#### *Citation:*

*Xiong Y, Karuppanan K, Bernardi A, Li Q, Kommineni V, Dandekar AM, Lebrilla CB, Faller R, McDonald KA and Nandi S (2019) Effects of N-Glycosylation on the Structure, Function, and Stability of a Plant-Made Fc-Fusion Anthrax Decoy Protein. Front. Plant Sci. 10:768. doi: 10.3389/fpls.2019.00768*

# Effects of N-Glycosylation on the Structure, Function, and Stability of a Plant-Made Fc-Fusion Anthrax Decoy Protein

*Yongao Xiong<sup>1</sup>, Kalimuthu Karuppanan<sup>1†</sup>, Austen Bernardi<sup>1†</sup>, Qiongyu Li<sup>2</sup>, Vally Kommineni<sup>3‡</sup>, Abhaya M. Dandekar<sup>4</sup>, Carlito B. Lebrilla<sup>2,5</sup>, Roland Faller<sup>1</sup>, Karen A. McDonald<sup>1,6</sup> and Somen Nandi<sup>1,6\*</sup>*

*<sup>1</sup>Department of Chemical Engineering, University of California, Davis, Davis, CA, United States, <sup>2</sup>Department of Chemistry, University of California, Davis, Davis, CA, United States, <sup>3</sup>iBio CDMO LLC, Bryan, TX, United States, <sup>4</sup>Department of Plant Sciences, University of California, Davis, Davis, CA, United States, <sup>5</sup>Department of Biochemistry and Molecular Medicine, University of California, Davis, Davis, CA, United States, <sup>6</sup>Global HealthShare Initiative, University of California, Davis, Davis, CA, United States*

Protein N-glycosylation is an important post-translational modification that influences a variety of biological processes at the cellular and molecular level, making glycosylation a major consideration for glycoprotein-based therapeutics. To achieve a comprehensive understanding of how N-glycosylation impacts protein properties, an Fc-fusion anthrax decoy protein, *viz* rCMG2-Fc, was expressed in *Nicotiana benthamiana* plants with three types of N-glycosylation profiles. Three variants were produced by targeting the protein to the plant apoplast (APO) or endoplasmic reticulum (ER), or by removing the N-glycosylation site with a point mutation (Agly). Both the APO and ER variants had a complex-type N-glycan (GnGnXF) as their predominant glycan. In addition, the ER variant had a higher concentration of mannose-type N-glycans (50%). The decoy protein binds to the protective antigen (PA) of anthrax through its CMG2 domain and inhibits toxin endocytosis. The protein expression, sequence, N-glycosylation profile, binding kinetics to PA, toxin neutralization efficiency, and thermostability were determined experimentally. In parallel, we performed molecular dynamics (MD) simulations of the predominant full-length rCMG2-Fc glycoform for each of the three N-glycosylation profiles to understand the effects of glycosylation at the molecular level. The MAN8 glycoform from the ER variant was additionally simulated to resolve differences between the APO and ER variants. Glycosylation showed strong stabilizing effects on rCMG2-Fc during *in planta* accumulation, evidenced by the over 2-fold higher expression and less protein degradation observed for glycosylated variants compared to the Agly variant. Protein function was confirmed by toxin neutralization assay (TNA), with effective concentration (EC50) rankings from low to high of 67.6 ng/ml (APO), 83.15 ng/ml (Agly), and 128.9 ng/ml (ER). 
The binding kinetics between rCMG2-Fc and PA were measured with bio-layer interferometry (BLI), giving sub-nanomolar affinities regardless of protein glycosylation and temperatures (25 and 37°C). The protein thermostability was examined utilizing the PA binding ELISA to provide information on EC50 differences. The fraction of functional ER variant decayed after overnight incubation at 37°C, and no significant change was observed for APO or Agly variants. In MD simulations, the MAN8 glycoform exhibits quantitatively higher distance between the CMG2 and Fc domains, as well as higher hydrophobic solvent accessible surface areas (SASA), indicating a possibly higher aggregation tendency of the ER variant. This study highlights the impacts of N-glycosylation on protein properties and provides insight into the effects of glycosylation on protein molecular dynamics.

Keywords: anthrax decoy protein, N-glycosylation, molecular simulation, protein stability, kinetics of protein binding

#### INTRODUCTION

Anthrax is a severe infectious disease caused by *Bacillus anthracis*. The spores can be produced easily and released in air as a biological weapon, leading to a fatality rate of 86–89% (Kamal et al., 2011). *Bacillus anthracis* secretes anthrax toxin, which is composed of a cell-binding protein, namely protective antigen (PA), and two enzymatic proteins called lethal factor (LF) and edema factor (EF). The cellular toxicity starts with the binding of PA to anthrax toxin receptors, after which the bound PA is cleaved by a furin family protease, leaving a 63 kDa fragment bound to the receptors (Wigelsworth et al., 2004). The receptor-PA complex then self-assembles into a heptamer, (PA)<sub>7</sub>, which binds LF and EF; the resulting complex is internalized to the cytosol through endocytosis, causing disruption to normal cellular physiology (Wigelsworth et al., 2004). Antitoxins based on receptor-decoy binding show promising advantages over an antibody-based strategy, since it is difficult to engineer toxins that escape the inhibitory effect of the decoy without compromising binding to the cellular receptor. The extracellular domain of the main anthrax toxin receptor, Capillary Morphogenesis Gene 2 protein, can be produced recombinantly (rCMG2) and used as a prophylaxis or post-exposure treatment to neutralize anthrax toxins in blood, preventing cell infection. Additionally, fusing an Fc domain to rCMG2 increases the serum half-life through interaction with the salvage neonatal Fc-receptor (Roopenian and Akilesh, 2007) and lowers renal clearance rate (Knauf et al., 1988). These factors make rCMG2-Fc a promising anthrax decoy protein, which retains the high binding affinity to the PA along with a longer blood circulatory half-life than rCMG2 (Wycoff et al., 2011; Xi et al., 2014; Karuppanan et al., 2017). 
We used a plant-based expression system for protein expression due to its rapid production rate and inherent scalability, which is critical for providing rapid response under emergency conditions. Moreover, plants rarely carry animal pathogens and are capable of post-translational modification, making them an appealing alternative to traditional protein expression systems such as mammalian cell culture or microbial fermentation (Chen and Davis, 2016).

N-glycosylation can affect protein folding, structural integrity, and function (Mimura et al., 2000; Krapp et al., 2003), which makes it an important design consideration for glycoprotein-based therapeutics. In some cases, proteins with proper glycosylation exhibit optimal efficacy. For example, Fc glycosylation is required to elicit effector functions of human IgG1 (Hristodorov et al., 2013). Thus, it should be preserved when immune defense is desired, for instance, when expressing antitumor mAbs (Strome et al., 2007). On the other hand, for drugs that treat chronic conditions, the absence of glycosylation is desired to avoid effector functions and associated inflammatory responses. Another important consideration is that glycosylated proteins are less susceptible to proteases, such as pepsin, than their aglycosylated counterparts (Niu et al., 2016), which should be considered to maximize protein yield.

Although the impacts of protein N-glycosylation have been studied, typically only one or two aspects were studied at a time, and these studies were done on antibodies (Raju and Scallon, 2006; Kayser et al., 2011; Zheng et al., 2011). This study provides a comprehensive approach utilizing a combination of experimental and computational techniques to evaluate the effects of N-glycosylation on rCMG2-Fc fusion protein properties. In this study, the protein expression, toxin neutralization efficacy, binding kinetics, thermostability, and structural configuration were studied experimentally and compared among three rCMG2-Fc glycoform variants. In addition, we employ atomistic molecular dynamics (MD) simulation to understand the structure and dynamics of the predominant glycoform of the APO, ER, and Agly variants. Atomistic MD simulations are well-suited for the study of biomolecular systems, providing full accessibility to virtual, high-resolution, time-ordered, atomic trajectories (Dror et al., 2012). MD simulations have been used to study many different biological systems, including lipid membranes, trans-membrane proteins, and other glycoproteins (Nury et al., 2010; Delemotte and Tarek, 2012; Bernardi et al., 2017). While fully atomistic protein simulation is a powerful tool to investigate structural and functional information, it is important to recognize the current limitations of the technique. In particular, protein folding is known to occur on the order of microseconds to seconds (Dill and MacCallum, 2012), while atomistic protein simulation is generally limited to hundreds of nanoseconds due to limited computing resources. This limitation generally prohibits the straightforward simulation of protein fold transitions. The length-scale of atomistic protein simulations is also computationally restricted, allowing only one rCMG2-Fc dimer to be simulated. 
Despite these limitations, this work shows that MD simulation data can provide insight into the effects of glycosylation on protein structure and improve our understanding and interpretation of experimental observations. To the best of our knowledge, no study on an Fc-fusion protein has considered this many experimental and molecular simulation factors. This study provides an integrated experimental and computational approach to evaluate Fc N-glycosylation impacts on rCMG2-Fc properties, and potentially serves as a guideline for general glycoprotein-based therapeutic design, especially for Fc-fusion proteins.

#### MATERIALS AND METHODS

#### Gene Constructs

The codon optimized CMG2-Fc sequence includes the extracellular domain of CMG2 (amino acids 34–220, Genbank: AY233452), followed by two serine residues, the upper hinge of IgG2 (amino acids 99–105, Genbank: AJ250170.1), and Fc region of human IgG1 (amino acid 108–329, Genbank: AAC82527.1). The resulting sequence corresponds to the APO variant as described previously (Karuppanan et al., 2017). A SEKDEL C-terminal motif was included to make the ER variant; a point mutation of N268Q on Fc was included to make the Agly variant. The genes encoding rCMG2-Fc variants were codon-optimized for expression in *Nicotiana benthamiana*. The full construct consists of the CaMV 35S promoter, Ω leader sequence, gene encoding the Ramy3D signal peptide, followed by rCMG2-Fc gene and octopine synthase terminator (details in **Supplementary Figure S1**). *Agrobacterium tumefaciens* (*A. tumefaciens*) EHA105 with the helper plasmid (pCH32) was transfected with the resulting binary expression vectors separately *via* electroporation. A binary vector capable of expressing P19 to suppress RNAi-mediated gene silencing in *Nicotiana benthamiana* plants was co-infiltrated with the rCMG2-Fc-APO binary vector as previously described (Arzola et al., 2011).

#### Transient Protein Expression in *Nicotiana benthamiana*

Protein was produced through whole-plant agroinfiltration as described previously (Xiong et al., 2018), differing only in the plant age and the *A. tumefaciens* cell densities. Briefly, *A. tumefaciens* strains containing the rCMG2-Fc expression cassette and the RNA gene silencing suppressor P19 were suspended in the infiltration buffer (10 mM MES buffer at pH 5.6, 10 mM MgCl2, 150 μM acetosyringone, and 0.02% v/v Silwet-L-77) with a final cell density of 0.25 (A600) for each strain. Then, 5-week-old *Nicotiana benthamiana* plants were vacuum infiltrated with the *A. tumefaciens* suspension for 1 min after the vacuum pressure reached 20 mm inches Hg. Infiltrated plants were incubated in a growth chamber at 20°C for 6 days to allow protein expression.

#### Plant Tissue Collection, Extraction, and Purification

Plant tissue was collected at day 6 after infiltration. To evaluate the average expression level, leaves from 10 plants were collected and stored at −80°C prior to extraction. Leaves were ground to a fine powder using a mortar and pestle with liquid nitrogen. The leaf powder was weighed and mixed with extraction buffer (1X PBS, 1 mM EDTA, and 2 mM sodium metabisulfite) at a leaf mass (g) to buffer volume (ml) ratio of 1:7. The mixture was incubated on a shaker at 4°C for 1 h and then centrifuged at 1,800*g* at 4°C for 1 h, followed by 0.22 μm filtration to remove insoluble particles. Filtered plant extract was loaded onto a Protein A column and eluted with glycine-HCl buffer at pH 3.0. Purified protein was immediately titrated to neutral pH with 1 M Tris buffer and buffer-exchanged into 1X PBS through overnight dialysis at 4°C.

#### ELISA Quantification of rCMG2-Fc in Crude Plant Extracts

Expression of rCMG2-Fc in crude plant extract was quantified by a sandwich ELISA. First, ELISA microplate (Corning, Corning, NY) wells were coated with Protein A (Southern Biotech, Birmingham, AL) at a concentration of 50 μg/ml in 1X PBS buffer for 1 h, followed by plate blocking with 5% nonfat milk in 1X PBS buffer for 20 min. Crude plant extracts and purified standards (Planet Biotechnology, Hayward, CA) were loaded onto the plate and incubated for 1 h (standards starting from 0.05 μg/ml, 3-fold serial dilutions). The bound rCMG2-Fc was detected by incubating with a horseradish peroxidase (HRP)-conjugated goat anti-human IgG (Southern Biotech, Birmingham, AL) at a concentration of 0.5 μg/ml for 1 h. Plates were washed three times with 1X PBST (1X PBS with 0.05% v/v of Tween20) between each of these steps. All incubation steps were done at 37°C, with an incubation volume of 50 μl. Next, 100 μl of ELISA colorimetric TMB substrate (Promega, Fitchburg, WI) was added to each well and incubated for 10 min, followed by the addition of 100 μl of 1 N HCl to stop the reaction. The absorbance at 450 nm was measured with a microplate reader (Molecular Devices, San Jose, CA). The absorbance of the protein standard was plotted as a function of rCMG2-Fc concentration and fitted to the 4-parameter model in SoftMax Pro software. The concentration of rCMG2-Fc in crude plant extract was determined by interpolating from the linear region of the standard curve.
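The interpolation step can be illustrated with a minimal Python sketch of a 4-parameter logistic (4PL) curve and its inverse. This is not the SoftMax Pro implementation: the function names and the parameter values are hypothetical and chosen only for illustration, and in practice the parameters would come from fitting the standards.

```python
def four_pl(x, a, b, c, d):
    """4-parameter logistic curve.

    a: response at zero concentration, d: response at saturating concentration,
    c: inflection point (same units as x), b: slope factor.
    """
    return d + (a - d) / (1.0 + (x / c) ** b)


def inverse_four_pl(y, a, b, c, d):
    """Concentration giving absorbance y; valid only strictly between a and d."""
    lo, hi = min(a, d), max(a, d)
    if not lo < y < hi:
        raise ValueError("absorbance lies outside the range of the standard curve")
    return c * ((a - d) / (y - d) - 1.0) ** (1.0 / b)


# Hypothetical fitted parameters (assumed values, not taken from the study).
params = dict(a=0.05, b=1.2, c=0.004, d=2.0)

# Interpolate a hypothetical sample absorbance back to a concentration (ug/ml),
# then multiply by the sample dilution factor (also hypothetical).
sample_a450 = 0.80
dilution_factor = 100
conc_in_well = inverse_four_pl(sample_a450, **params)
conc_in_extract = conc_in_well * dilution_factor
```

Restricting the inverse to absorbances strictly between the two asymptotes mirrors the practice of reading samples only from the responsive region of the standard curve.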

#### SDS-PAGE and Western Blotting

SDS-PAGE and Western blot analyses were performed on purified (protein A) rCMG2-Fc variants. Protein was denatured and reduced by treating samples at 95°C for 5 min with 5% (v/v) of 2-mercaptoethanol (Sigma-Aldrich, St. Louis, MO). For nonreducing SDS-PAGE, samples were denatured by heat treatment at 95°C for 5 min. Samples were loaded to precast 4–20% SDS-Tris HCl polyacrylamide gels (Bio-Rad Laboratories, Hercules, CA), running at 200 V for 35 min. For SDS-PAGE, the gel was washed three times with water and stained with Coomassie Brilliant Blue R-250 Staining Solution (Bio-Rad Laboratories, Hercules, CA). For Western blot analysis, samples were transferred to a nitrocellulose membrane by electrophoretic transfer using the iBlot Gel Transfer Device (ThermoFisher, Waltham, MA). For Western blot detecting the CMG2 domain, the membrane was probed with a goat anti-CMG2 polyclonal antibody (ThermoFisher, Waltham, MA) at a concentration of 0.3 μg/ml, followed by incubation of a polyclonal AP-conjugated rabbit anti-goat IgG antibody (Sigma-Aldrich, St. Louis, MO) at 1:10,000 dilution. For Western blot detecting the Fc domain, the membrane was incubated with a polyclonal AP-conjugated goat anti-human IgG antibody (Southern Biotech, Birmingham, AL) at 1:3,000 dilution. The blots were developed using SIGMAFAST BCIP/NBT (Sigma-Aldrich, St. Louis, MO) according to the product instruction.

#### SDS-PAGE Densitometry

CMG2-Fc standard from Planet Biotechnology (0.5, 0.75, 1.0, 1.25, and 1.5 μg/lane) and rCMG2-Fc variants (APO, ER, and Agly) were reduced, denatured, and run on a 4–12% Bis-Tris gel (Invitrogen, Carlsbad, CA) at 50 mA for 1.5 h. After staining for 1 h with Coomassie Brilliant Blue R-250 staining solution (Bio-Rad Laboratories, Hercules, CA), the gel was washed with water overnight. The next morning, the gel was scanned with a Gel Doc™ XR+ System (Bio-Rad Laboratories, Hercules, CA), and a standard curve was established by plotting total protein mass of standards as a function of band intensity. Then, the band intensity for the ~50 kDa band of each rCMG2-Fc variant was interpolated onto the standard curve to determine the mass of intact rCMG2-Fc and calculate its concentration.
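The standard-curve calculation can be sketched as an ordinary least-squares line through the standards, with mass as a function of band intensity as described above. The standard masses are those listed in the text; the band intensities, the sample intensity, and the loaded volume are hypothetical values used only to make the sketch runnable.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Standard masses per lane (ug), as listed in the text.
standard_mass_ug = [0.5, 0.75, 1.0, 1.25, 1.5]
# Hypothetical band intensities (arbitrary units) for those standards.
standard_intensity = [1000.0, 1500.0, 2000.0, 2500.0, 3000.0]

slope, intercept = fit_line(standard_intensity, standard_mass_ug)

# Hypothetical ~50 kDa band intensity and loaded volume for one variant.
sample_intensity = 1800.0
loaded_volume_ul = 10.0

sample_mass_ug = slope * sample_intensity + intercept
sample_conc_ug_per_ml = sample_mass_ug / (loaded_volume_ul / 1000.0)
```

Only sample intensities that fall within the range spanned by the standards should be interpolated this way.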

#### Protein Sequence Identification by Liquid Chromatography-Tandem Mass Spectrometry (LC-MS/MS)

Purified rCMG2-Fc variants were subjected to protein sequence identification by mass spectrometry (LC-MS/MS). First, 10 μg of purified rCMG2-Fc variants were subjected to SDS-PAGE analyses under reducing conditions as described in the SDS-PAGE and Western Blotting section. After staining the gel in Coomassie Brilliant Blue R-250 (Bio-Rad, Hercules, CA, USA) and rinsing in water, the rCMG2-Fc protein band was excised from the gel and submitted to the Proteomics Core Facility of the University of California, Davis for LC-MS/MS-based protein identification. Briefly, the protein was digested with sequencing grade trypsin per manufacturer's recommendations (Promega, Madison, WI, USA). Specific conditions can be found on the UC Davis Proteomics Core Facility website<sup>1</sup> ("Ingel Digestion Protocol 2"). Peptides were dried using a vacuum concentrator and resolubilized in 2% acetonitrile/0.1% trifluoroacetic acid. Peptides were analyzed by LC-MS/MS on a Thermo Scientific Q Exactive Orbitrap Mass Spectrometer in conjunction with a Proxeon Easy-nLC II HPLC and Proxeon nanospray source. The digested peptides were loaded on a Magic C18 200 Å 3 U reverse phase column (75-micron × 150 mm) and eluted using a 90-min gradient with a flow rate of 300 nl/min. An MS survey scan was obtained for the m/z range 300–1,600, and MS/MS spectra were acquired using a top 15 method. An isolation mass window (2.0 m/z) was used for the precursor ion selection, and normalized collision energy (27%) was used for fragmentation. Tandem MS spectra were extracted and charge state deconvoluted by Proteome Discoverer (Thermo Scientific, Asheville, NC, USA). The MS/MS samples were analyzed using X! Tandem (The GPM, thegpm.org; version TORNADO (2013.02.01.1)). X! Tandem was set up to search UniProt-Nicotiana benthamiana\_database (20140416, 1,538 entries) and the cRAP database of common laboratory contaminants<sup>2</sup> (114 entries) plus an equal number of reverse protein sequences, assuming trypsin digestion. 
Scaffold Proteome Software version 4.0.6.1 (OR, USA) was used to confirm protein identifications. X! Tandem identifications required at least –Log (Expect Scores) scores of greater than 1.2 with a mass accuracy of 5 ppm. Protein identifications were accepted if they contained at least two identified peptides. Using the parameters above, the Decoy False Discovery Rate (FDR) was calculated to be 4.5% on the protein level and 1.94% on the spectrum level. Proteins that contained similar peptides and could not be differentiated based on MS/MS analysis alone were grouped to satisfy the principles of parsimony.

#### Protein N-Glycoform Analysis by Dynamic Multiple Reaction Monitoring

rCMG2-Fc protein dissolved in 50 mM NH4HCO3 was denatured with 2 μl of dithiothreitol (DTT) in a 65°C water bath for 50 min, followed by alkylation with 4 μl of iodoacetamide (IAA) in the dark for 20 min. The protein was then digested with 1 μg of trypsin in a 37°C water bath for 18 h. After the digestion, the mixture was frozen at −20°C for 1 h to deactivate the trypsin. For N-glycosylation analysis, 2 μl of the mixture was separated with an Agilent Eclipse plus C18 column (RRHD 1.8 μm, 2.1 mm × 150 mm) coupled to an Agilent Eclipse plus C18 guard column (RRHD 1.8 μm, 2.1 mm × 5 mm), using a 10-min gradient in which solvent A (0.1% formic acid (FA) and 3% ACN in water) and solvent B (0.1% FA and 90% ACN in water) were used for separation. The analysis was conducted on an Agilent 1290 infinity ultra-high-pressure liquid chromatography (UHPLC) system coupled to an Agilent 6495 triple quadrupole (QQQ) mass spectrometer, which was operated in dynamic multiple reaction monitoring (dMRM) mode. The glycosylation site of the protein is on the fused IgG Fc region, so the transition list used here was adapted from the dMRM method for serum IgG, the development of which was described in detail by Hong et al. (2013). To modify the method for rCMG2-Fc glycosylation quantitation, plant N-glycan compositions containing xylose rather than sialic acid were used. In total, nearly 30 unique transitions for targeted glycopeptides and peptides of the rCMG2-Fc protein were developed for the dMRM method, each composed of the precursor ion, the product ion, the collision energy, and the retention time of the individual compound. The targeted glycopeptides were selected as precursor ions, and several common oxonium fragments with m/z values of 204.08 and 366.14 were used as product ions. The software used for data analysis was Agilent MassHunter Quantitative Analysis B.05.02. 
To calculate the relative abundance of each glycopeptide, the abundance of individual glycopeptide was normalized to the abundance of the quantitating peptide.
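The normalization described above can be written out as a short Python sketch. The peak areas are hypothetical, and the final conversion to a percentage of all monitored glycoforms is an added convenience step that is assumed here rather than taken from the text.

```python
def relative_abundance(glycopeptide_areas, quant_peptide_area):
    """Normalize each glycopeptide signal to the quantitating peptide signal."""
    if quant_peptide_area <= 0:
        raise ValueError("quantitating peptide area must be positive")
    return {name: area / quant_peptide_area
            for name, area in glycopeptide_areas.items()}


def percent_of_total(relative):
    """Express normalized abundances as percentages of their sum (assumed step)."""
    total = sum(relative.values())
    return {name: 100.0 * value / total for name, value in relative.items()}


# Hypothetical dMRM peak areas for one sample (arbitrary units).
areas = {"GnGnXF": 6000.0, "MAN8": 3000.0, "MAN7": 1000.0}
quant_peptide_area = 20000.0

rel = relative_abundance(areas, quant_peptide_area)
pct = percent_of_total(rel)
```

Normalizing to a non-glycosylated peptide from the same protein corrects for differences in the amount of protein injected between samples.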

To validate the glycopeptides quantitated with the dMRM method, glycoproteomic analysis was conducted on rCMG2-Fc proteins. After the trypsin digestion of protein samples, glycopeptides were enriched with iSPE-HILIC cartridges. The enriched samples were then dried completely and reconstituted with 30 μl of water for LC-MS/MS analysis. One microgram of sample was separated with a Thermo Acclaim PepMap RSLC C18 column using a 180-min gradient. The analysis was conducted on a Thermo UltiMate 300 nano LC system coupled to an Orbitrap Fusion Lumos Tribrid mass spectrometer. The collected raw data were inspected with the software FreeStyle and the MS/MS spectra were searched with the software Byonic.

<sup>1</sup> https://proteomics.ucdavis.edu/protocols-2/

<sup>2</sup> www.thegpm.org/crap

#### Toxin Neutralization Assay

Cell viability was determined by the MTS [3-(4,5-dimethylthiazol-2-yl)-5-(3-carboxymethoxyphenyl)-2-(4-sulfophenyl)-2H-tetrazolium, inner salt] assay. Mouse macrophage RAW264.7 cells (ATCC, Manassas, VA) were seeded at 2 × 10<sup>4</sup>/well on 96-well cell culture plates in Dulbecco's modified Eagle medium (Corning, Corning, NY) supplemented with 5% heat-inactivated fetal bovine serum (VWR, Radnor, PA) and 2 mM GlutaMAX (Thermo Fisher, Waltham, MA) for 17 h in a cell incubator (7% CO2, 37°C). Toxin neutralization efficacy was measured by assessing cell viability with 1.4X serial dilutions of rCMG2-Fc variants in the presence of a constant amount of lethal toxin (LT; PA at 100 ng/ml, LF at 200 ng/ml). rCMG2-Fc serial dilutions were incubated with LT for 30 min at 37°C, and then the mixtures were transferred to the cell plates, followed by a 4-h incubation at 37°C, 7% CO2. MTS reagent (Promega, Fitchburg, WI) was added to the wells at 20 μl/well, followed by another 4-h incubation at 37°C, 7% CO2. For TNAs with Fc gamma receptors blocked, cells were first treated with 2.4G2 antibody (BD Biosciences, San Jose, CA) at a concentration of 5 or 10 μg/ml for 15 min before proceeding to the TNA as described above. The plates were gently mixed to ensure uniform color distribution in the wells, and then the absorbance at 490 nm was read with a 96-well plate reader (Molecular Devices, San Jose, CA). The EC50s were calculated using GraphPad Prism.

#### Biolayer Interferometry Analysis

The binding between rCMG2-Fc variants and PA was measured in real time by biolayer interferometry (BLI) using an Octet RED384 instrument (ForteBio, Fremont, CA). rCMG2-Fc in 1X PBS buffer with 1 mM MgCl2 was captured on Anti-hIgG Fc Capture Biosensors (ForteBio, Fremont, CA) at the surface density resulting in a wavelength shift between 0.8 and 1 nm. PA at known concentrations (starting from 5.4 μg/ml, 2X serial dilutions) were loaded to the sample plate. Sensors loaded with rCMG2-Fc were dipped into PA solutions in the kinetics buffer (ForteBio, Fremont, CA) for 300 s, and then switched to the kinetics buffer (ForteBio, Fremont, CA) for 600–900 s allowing for dissociation. The sensorgrams were fitted with 1:1 binding model using ForteBio Data Analysis software (ForteBio, Fremont, CA).
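For readers unfamiliar with the 1:1 binding model, the following Python sketch generates the association and dissociation phases it assumes and shows how the equilibrium dissociation constant follows from the rate constants. It is not the ForteBio fitting routine. The rate constants and maximum response are hypothetical, the number of dilution points is assumed, and the PA molecular weight of about 83 kDa is an assumption used only to convert μg/ml to molar units.

```python
import math


def association(t, conc, kon, koff, rmax):
    """Response during association for a 1:1 Langmuir model (conc in M, t in s)."""
    k_obs = kon * conc + koff
    r_eq = rmax * conc / (conc + koff / kon)
    return r_eq * (1.0 - math.exp(-k_obs * t))


def dissociation(t, r0, koff):
    """Response t seconds after switching to buffer, starting from r0."""
    return r0 * math.exp(-koff * t)


# Hypothetical kinetic parameters (not measured values from the study).
kon = 1.0e5     # 1/(M s)
koff = 5.0e-5   # 1/s
rmax = 1.0      # nm wavelength shift at saturation
kd = koff / kon  # equilibrium dissociation constant in M

# PA series from the text: 5.4 ug/ml top concentration, 2X serial dilutions.
PA_MW = 83_000.0                 # g/mol, assumed
top_molar = 5.4e-3 / PA_MW       # 5.4 ug/ml = 5.4e-3 g/L
concs = [top_molar / 2 ** i for i in range(5)]  # 5 points assumed

# End-of-phase responses for a 300 s association and a 600 s dissociation.
traces = []
for c in concs:
    r_end = association(300.0, c, kon, koff, rmax)
    r_after = dissociation(600.0, r_end, koff)
    traces.append((r_end, r_after))
```

With a slow off-rate such as the one assumed here, the signal changes very little over a 600 s dissociation window, which is one reason longer dissociation times are helpful for high-affinity interactions.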

#### Functional rCMG2-Fc (PA Binding) ELISA

The microplate (Corning, Corning, NY) wells were coated with 100 μl of PA of anthrax (BEI Resources, Manassas, VA) at a concentration of 2.5 μg/ml for 1 h in 1X PBS buffer, and then the plate was blocked with 5% nonfat milk for 30 min. The rCMG2-Fc samples in 1X PBS with 1 mM MgCl2 (controls and 37°C incubated samples) were loaded onto the plate at 100 μl per well (starting from 2.5 μg/ml, 2.5X serial dilutions). The functional rCMG2-Fc bound to the PA was detected with an HRP-conjugated goat anti-human IgG (Southern Biotech, Birmingham, AL) at a concentration of 0.5 μg/ml for 1 h. Plates were washed three times with 1X PBST between the steps above, and all steps were done at room temperature. The 37°C incubation was eliminated to avoid potential effects on protein activity. The plate was developed with 100 μl of TMB substrate (Promega, Fitchburg, WI) and stopped with 100 μl of 1 N HCl. The absorbance at 450 nm was read with a microplate reader (Molecular Devices, San Jose, CA).
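The concentrations reached by such a dilution scheme can be listed with a small Python helper. The function name and the number of dilution points are hypothetical, since the text gives only the starting concentration and the dilution factor.

```python
def dilution_series(start, factor, n_points):
    """Concentrations for a serial dilution, highest first."""
    if factor <= 1:
        raise ValueError("dilution factor must be greater than 1")
    return [start / factor ** i for i in range(n_points)]


# Functional ELISA series from the text: 2.5 ug/ml start, 2.5X dilutions.
# Eight points is an assumption made for illustration.
series_ug_per_ml = dilution_series(2.5, 2.5, 8)
```

The same helper applies to the other schemes in the methods, for example the 3-fold ELISA standards or the 1.4X dilutions used in the toxin neutralization assay.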

#### Molecular Dynamics Simulation of rCMG2-Fc Glycoforms: Construction of Initial Protein Configurations

The atomic coordinates of the von Willebrand factor A domain of anthrax toxin receptor capillary morphogenesis protein 2 (CMG2) were obtained from the Protein Data Bank from a 4.3 Å resolution X-ray crystal structure (PDB ID: 1TZN) (Lacy et al., 2004). The atomic coordinates of the fragment crystallizable (Fc) region of human Immunoglobulin G were obtained from the Protein Data Bank at 2.2 Å resolution (PDB ID: 3SGJ) (Ferrara et al., 2011). Modeling of missing residues, de-mutations, and fusion of the two crystal structures with the IgG2 hinge linker were performed with Modeller 9.16 (Webb and Sali, 2014). All four linker cysteines were modeled to participate in symmetric inter-chain disulfide bonds. Aglycosylated (Agly), GnGnXF, and MAN8 glycoforms of the chimeric dimer CMG2-Fc were simulated for 100 ns. In molecular simulations, we used modified nomenclature to specify the simulated glycoform, as only the predominant glycoform from each variant was simulated.

#### Glycan Attachment

Glycans for the GnGnXF and MAN8 glycoforms were attached to Asn268 of both monomers using the glycam.org glycoprotein builder (Woods, 2005). The glycans were then energy minimized sequentially through rigid rotation about three bonds (Nγ prot-C1 glycan, Cβ prot-Cγ prot, and Cα prot-Cβ prot) using the minimization scheme outlined in Bernardi et al. (2017). Energy was calculated with GROMACS 5.1.4 (Bekker et al., 1993; Berendsen et al., 1995; Abraham et al., 2015) using the AMBER ff14SB (Maier et al., 2015) and GLYCAM06-j (Kirschner et al., 2008) force fields. The lowest energy rotational conformer was selected as the initial coordinates of the molecular dynamics simulation for each glycoform.

#### Simulation Setup

Simulations of 100 ns were performed for the Agly, MAN8, and GnGnXF rCMG2-Fc glycoforms in GROMACS with the AMBER ff14SB and GLYCAM06-j force fields. The AMBER topology files were exported to GROMACS format using ACPYPE (da Silva and Vranken, 2012) with modifications that enable simulations with the GLYCAM force field in GROMACS (Bernardi et al., 2019). The 100 ns production simulations were preceded by energy minimization in vacuum, solvation, solvated energy minimization, a 100 ps NVT equilibration, and finally a 100 ps NPT equilibration. Both energy minimizations were terminated with a maximum force tolerance of 1,000 kJ mol−1 nm−1. Each glycoform was solvated with explicit water with a minimum distance of 1.2 nm between the glycoprotein and the edge of the periodic box. The solvated systems were then neutralized with either sodium or chloride ions and brought to a concentration of 0.155 M NaCl. The velocity-rescale thermostat (Bussi et al., 2007) was used with a reference temperature of 310 K and a time constant of 0.1 ps. The isotropic Parrinello-Rahman (Parrinello and Rahman, 1981) barostat was used with a reference pressure of 1 bar, a time constant of 2 ps, and an isothermal compressibility of 4.5 × 10−5 bar−1. All nonbonded interactions employed a short-range cutoff of 1 nm, with vertically shifted potentials such that the potential at the cutoff distance is zero. The Particle-Mesh Ewald method (Darden et al., 1993) with cubic interpolation was used to model long-range electrostatic interactions. All non-water bonds were constrained with LINCS (Hess, 2008), while water bonds were constrained with SETTLE (Miyamoto and Kollman, 1992). A 2 fs timestep was used with a sampling interval of 0.1 ns, for a total of 1,000 data points per 100 ns simulation.
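The relationship between the run length, the timestep, and the number of saved frames can be checked with simple integer arithmetic. The sketch below uses the values stated above (100 ns, 2 fs, one frame every 0.1 ns, i.e., 100 ps); the function name `md_step_counts` is hypothetical and is not part of GROMACS.

```python
def md_step_counts(total_ns, timestep_fs, sample_interval_ps):
    """Return (integration steps, steps between saved frames, saved frames).

    Integer units are used throughout to avoid floating-point rounding.
    """
    total_steps = total_ns * 1_000_000 // timestep_fs       # 1 ns = 1,000,000 fs
    steps_per_frame = sample_interval_ps * 1_000 // timestep_fs  # 1 ps = 1,000 fs
    n_frames = total_steps // steps_per_frame
    return total_steps, steps_per_frame, n_frames


total_steps, steps_per_frame, n_frames = md_step_counts(100, 2, 100)
print(total_steps, steps_per_frame, n_frames)  # -> 50000000 50000 1000
```

The same arithmetic gives 50,000 steps for each 100 ps equilibration stage at a 2 fs timestep.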

#### RESULTS

#### Transient Expression of rCMG2-Fc Variants

Recombinant CMG2-Fc variants were transiently expressed in *Nicotiana benthamiana* whole plants *via* agroinfiltration under identical conditions, and the expression levels were determined in crude leaf extract at 6 days post infiltration (dpi) with a sandwich ELISA detecting the Fc domain of rCMG2-Fc. The expression levels on a leaf fresh weight (LFW) basis, ranked from high to low, were: APO (578 mg/kg LFW), ER (430 mg/kg LFW), and Agly (148 mg/kg LFW) (**Figure 1**). The APO and Agly variants, which differ only in N-glycosylation owing to the N268Q point mutation, were both targeted to the plant apoplast. The only N-glycosylation site within rCMG2-Fc is located in the CH2 domain of IgG1 Fc (N297 in IgG1, N268 in rCMG2-Fc). The significantly higher expression of the APO variant relative to the Agly variant might be due to stabilizing effects of N-glycans on protein accumulation *in planta*. This observation is consistent with previous studies in which proteins were more susceptible to protease cleavage after deglycosylation (Liu et al., 2008; Zheng et al., 2011). In many cases, targeting proteins to the ER results in a greater protein yield than targeting to the cytosol or apoplast (Pan et al., 2008; Sainsbury and Lomonossoff, 2008; Pillay et al., 2014). The plant apoplast is usually not a preferred location for recombinant protein accumulation given the abundance and poor specificity of its proteases (Benchabane et al., 2008; Pillay et al., 2014). In this case, however, the APO variant accumulated to a high level, which indicates that rCMG2-Fc is stable even in a protease-rich environment when glycosylated. The expression levels of the APO and ER variants were similar. Besides the contribution from the high stability of rCMG2-Fc in the apoplast, it is also possible that the ER variant was under-extracted because of the additional ER membrane barrier, considering that no detergent was used in the extraction buffer.

FIGURE 1 | [...] The expression level rankings from high to low are: APO, ER, and Agly variants. Error bars represent the standard error of two infiltration batches (leaves from 10 plants per batch).

In **Figure 2A**, purified rCMG2-Fc of all variants showed a dominant band at ~50 kDa, corresponding to the rCMG2-Fc monomer; the faint bands below were likely Fc-containing fragments. This observation is consistent with the hypothesis that N-glycosylation stabilizes the protein *in planta*, as less degradation was found in the APO and ER variants than in the Agly variant. We hypothesized that all three variants were degraded at the same site(s), as proteolytic cleavage shows some degree of site specificity. N-terminal sequencing of the Fc-containing fragment identified three cleavage sites within or near the linker of rCMG2-Fc, and those sites were shared among variants (**Supplementary Figure S2**). It is not surprising that cleavage occurred near the linker region, because the linker in rCMG2-Fc is flexible and proteases tend to cleave at solvent-exposed, flexible, and less structured regions (Song et al., 2012). SDS-PAGE under nonreducing conditions (**Figure 2B**) reveals that the expressed rCMG2-Fc primarily formed a homodimeric species (~100 kDa). The lower bands were dimerized Fc-containing fragments. The band at 250 kDa in the ER variant sample might represent protein aggregate; it was not observed in the APO or Agly variants, suggesting that the ER variant is more prone to aggregation than the other two variants. Western blots detecting CMG2 (**Figure 2C**) and Fc (**Figure 2D**) confirmed the presence of both domains and showed that the lower band in **Figure 2A** contains only an Fc fragment. The absence of a detectable CMG2 fragment in **Figure 2C** indicates that the protein degradation occurred during *in planta* production and that rCMG2-Fc remained intact during storage once purified from crude plant extract.

#### rCMG2-Fc Amino Acid Sequence Determination

To confirm the amino acid sequence of rCMG2-Fc variants, purified variants were subjected to LC–MS/MS analysis. The N-glycosylation site is highlighted in yellow, and the sequences not clearly detected are represented with dashes (**Figure 3**). The point mutation (N268Q) in the Agly variant was detected as predicted. The sequence coverage with respect to the control (theoretical) sequence was 95.7, 91.7, and 90.6% for the APO, ER, and Agly variants, respectively. The high sequence coverage, including both the predicted N- and C-terminal sequences, confirmed the production of full-length rCMG2-Fc variants in the plants (**Figure 3; Supplementary Figure S3**).

FIGURE 3 | [...] Agly (purple).

#### Mass Spectrometry Analysis of rCMG2-Fc N-Glycosylation Profile

To determine the glycoform profile, rCMG2-Fc variants were subjected to LC-MS/MS analysis for N-glycosylation identification. For the APO variant, 99% of N-glycoforms were plant complex-type, with GnGnXF as the most abundant structure (**Figure 4; Supplementary Figure S4**), indicating that the protein went through the secretory pathway and was fully glycosylated, as expected. For the ER variant, the relative abundance of mannose-type N-glycans was 50%, with MAN8 (18%) as the most abundant mannose-type structure (**Figure 4; Supplementary Figure S4**). Overall, however, a complex-type N-glycan (GnGnXF) was the most abundant single N-glycan in the ER variant (34%). To validate the dMRM methods, both APO and ER variants were subjected to glycoproteomic analysis. Compounds quantitated in the dMRM methods were identified in the glycoproteomic analysis, and several representative full MS spectra for the APO and ER variants are shown in **Supplementary Figures S5A,B**, respectively, with the compounds assigned to peaks. A comparison of the APO and ER variants shows a significant shift from plant complex-type to mannose-type N-glycans: about half of the rCMG2-Fc was retained in the ER upon addition of the C-terminal ER retention sequence SEKDEL. Although the retention is not perfect, the glycoform profiles of the APO and ER variants are distinct. This incomplete ER retention agrees with previous studies (He et al., 2012; Roychowdhury et al., 2018), in which proteins sometimes escaped ER retention and progressed to downstream N-glycosylation processing. The glycan MS data were used to select representative glycoforms for molecular dynamics simulations.
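The relative abundances discussed above are percentages of the total glycan signal, which can then be summed by glycan class. The following sketch illustrates that bookkeeping only: the signal values are invented placeholders (not the measured data), and classifying a glycan as mannose-type by a `MAN` name prefix is an assumption made for this example.

```python
def relative_abundance(signals):
    """Convert raw per-glycan signals into percentages of the total signal."""
    total = sum(signals.values())
    return {name: 100.0 * value / total for name, value in signals.items()}


def mannose_type_fraction(percentages):
    """Sum the percentages of glycans whose name marks them as mannose-type.

    Assumption for this sketch: mannose-type glycans are named 'MAN<n>'.
    """
    return sum(p for name, p in percentages.items() if name.startswith("MAN"))


# Invented placeholder signals (arbitrary units), NOT measured values.
example_signals = {"GnGnXF": 50.0, "MAN8": 30.0, "MAN7": 20.0}
percentages = relative_abundance(example_signals)
print(percentages)
print(f"mannose-type: {mannose_type_fraction(percentages):.1f}%")
```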

#### Toxin Neutralization Assay

To test the toxin neutralization efficacy of rCMG2-Fc in a biologically relevant environment, a cell-based toxin neutralization assay (TNA) was developed using a mouse macrophage cell line (RAW264.7). Since toxin concentrations reported in the literature vary, the concentrations and ratio of PA and LF were optimized by toxin titration. A PA concentration of 100 ng/ml and an LF concentration of 200 ng/ml were chosen, as these concentrations resulted in almost complete cell killing (97%, **Supplementary Figure S6**). Once the lethal toxin (LT, the combination of PA and LF) concentration was fixed, the toxin neutralization efficacy of rCMG2-Fc variants was analyzed over a range of rCMG2-Fc concentrations. The concentration of intact rCMG2-Fc was determined by SDS-PAGE densitometry (**Supplementary Figure S7**). The dose-response curves are shown in **Figure 5A**, from which the EC50 values were determined. The average EC50 values, from low to high, were 67.6 ng/ml for the APO variant, 83.15 ng/ml for the Agly variant, and 128.9 ng/ml for the ER variant; the EC50 of the ER variant was statistically different (**Figure 5B**) from those of the Agly and APO variants (*p* < 0.05).
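To illustrate what the EC50 represents on a dose-response curve, the sketch below uses a four-parameter logistic (4PL) model, a common choice for such curves. This is an assumption: the text does not state which model was fit. The EC50 values are those reported above, while the bottom (0%), top (100%), Hill slope (1), and the 100 ng/ml test dose are hypothetical.

```python
def four_pl(conc, bottom, top, ec50, hill):
    """Four-parameter logistic: response rises from bottom to top with dose."""
    return bottom + (top - bottom) / (1.0 + (ec50 / conc) ** hill)


# EC50 values (ng/ml) reported in the text.
ec50_ng_per_ml = {"APO": 67.6, "Agly": 83.15, "ER": 128.9}

# Hypothetical curve parameters and test dose, chosen only for illustration.
dose = 100.0
viability = {
    name: four_pl(dose, bottom=0.0, top=100.0, ec50=ec50, hill=1.0)
    for name, ec50 in ec50_ng_per_ml.items()
}
for name, value in viability.items():
    print(f"{name}: {value:.1f}% viability at {dose:.0f} ng/ml")
```

By construction, the response at a dose equal to the EC50 lies halfway between the bottom and top plateaus, so a variant with a higher EC50 protects fewer cells at any given dose under this model.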

This difference in EC50 could have resulted from toxin neutralization that depends on the Fc gamma receptors (FcγR) on macrophages, where the interaction between Fc and FcγR on the cell surface contributes to toxin neutralization and in turn lowers the EC50 (Verma et al., 2009; Ngundi et al., 2010). Thus, TNAs with FcγR blocking were performed to examine the possible contribution of FcγR. Cells were pre-treated with anti-FcγR antibody 2.4G2 for 15 min prior to the TNA. The resulting dose-response curves are shown in **Figure 5C; Supplementary Figure S8**, where the blocked-FcγR curves overlap with the control curve (no antibody), demonstrating that the FcγR-Fc interaction did not contribute to toxin neutralization. The differences in EC50 values could instead result from differences in binding kinetics between rCMG2-Fc variants and PA, in the stability of rCMG2-Fc variants during cell culture incubation, or in rCMG2-Fc variant conformation. Additional experiments, including BLI, functional ELISA, and MD simulation, were conducted to evaluate these possibilities.

FIGURE 4 | N-glycosylation composition of rCMG2-Fc variants (ER and APO). The relative abundance of each glycoform is represented as the percentage of total signal. The schematic representation and the name of each glycan are listed to the right of the N-glycosylation profile figure, with the background color matching the corresponding glycan.

#### Binding Kinetics Between rCMG2-Fc and PA

The binding kinetics between rCMG2-Fc variants and PA were measured in real time with BLI. The measurements were taken at room temperature (25°C) and body temperature (37°C) to determine how temperature affects the binding kinetics and whether this could explain the difference in EC50 values. Comparing the variants at the same temperature, we found the association rate constant (*k*a) and the dissociation rate constant (*k*d) to be similar regardless of protein N-glycosylation (**Figure 6**). BLI sensorgrams and fitting curves are shown in **Supplementary Figure S9**. As the temperature was increased from 25 to 37°C, higher *k*a and *k*d values were obtained (**Figure 6**), as expected, since both association and dissociation occur faster at an elevated temperature. The equilibrium dissociation constant (*K*D) is of the same order at both temperatures (100 pM–1 nM), with a slightly lower *K*D at 37°C, demonstrating the desired strong binding between rCMG2-Fc and PA (**Table 1**). These results show that all rCMG2-Fc variants are functional with very high binding affinities for PA, and that the binding equilibrium is not strongly affected by temperature. The previously reported *k*a and *k*d of rCMG2 binding to PA (Wigelsworth et al., 2004) are in agreement with our BLI results (**Figure 6**). Moreover, protein N-glycosylation showed no significant impact on the binding kinetics of rCMG2-Fc to PA, as the average deviations in *K*D values were 2% (25°C) and 26% (37°C), which is within the experimental error (30%) of this method (Kamat and Rafique, 2017).

FIGURE 5 | Neutralization of anthrax lethal toxin by rCMG2-Fc variants. (A): rCMG2-Fc variants at various concentrations were incubated with 100 ng/ml of PA and 200 ng/ml of LF for 30 min at 37°C prior to adding to the mouse macrophage cell culture (RAW264.7). Cell viability was measured with an MTS assay. Data shown are from one representative experiment of three replicates. Each data point corresponds to the mean of duplicate wells on the sample plate, and error bars represent the standard error of the mean. (B): TNA EC50 values for rCMG2-Fc variants (\*: *p* < 0.05, ns: *p* > 0.05; Tukey's test performed after observing a significant difference (*p* = 0.0009) in one-way ANOVA). Each bar represents the mean from three separate experiments with duplicate well measurements, with error bars representing the standard deviation from the means. (C): Toxin neutralization assay using the APO variant with FcγR blocked. Control: without anti-FcγR antibody treatment. Two concentrations of anti-FcγR antibody were tested: 5 and 10 μg/ml, which were incubated with macrophages for 15 min prior to rCMG2-Fc addition. EC50 values for all three conditions were comparable.
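The equilibrium dissociation constant discussed in this section follows from the two rate constants as *K*D = *k*d/*k*a. The sketch below shows that calculation together with one possible way to express the spread in *K*D between variants. The rate constants are hypothetical placeholders (the measured values are in Figure 6 and Table 1), and the definition of "average deviation" used here (mean absolute deviation divided by the mean) is an assumption, as the text does not define it.

```python
def equilibrium_kd(ka_per_M_s, kd_per_s):
    """Equilibrium dissociation constant K_D (M) from k_a (1/(M s)) and k_d (1/s)."""
    return kd_per_s / ka_per_M_s


def mean_relative_deviation(values):
    """Mean absolute deviation from the mean, as a fraction of the mean.

    Assumed definition; the text does not specify how the deviation was computed.
    """
    mean = sum(values) / len(values)
    return sum(abs(v - mean) for v in values) / len(values) / mean


# Hypothetical (k_a, k_d) pairs for three variants, NOT the measured values.
rate_constants = [(1.0e5, 5.0e-5), (1.2e5, 5.4e-5), (0.9e5, 4.5e-5)]
kd_values = [equilibrium_kd(ka, kd) for ka, kd in rate_constants]
deviation = mean_relative_deviation(kd_values)

for kd in kd_values:
    print(f"K_D = {kd * 1e12:.0f} pM")
print(f"average deviation = {deviation:.1%}")
```

With these placeholder inputs, all three *K*D values fall in the sub-nanomolar range and their spread is well below the 30% experimental error quoted above.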

#### Thermostability of rCMG2-Fc

Protein stability is an important property of biopharmaceuticals, as a protein is often required to remain stable and active both during storage and during circulation in the target system after injection. To assess the stability of rCMG2-Fc variants and understand the EC50 differences observed in TNAs, a functional ELISA was developed to measure the amount of active rCMG2-Fc (able to bind to PA) after incubation at 37°C for a range of time periods. Four time periods were tested: 1, 2, and 3 h and overnight (20 h). Since all three variants showed similar binding kinetics in BLI experiments, we hypothesized that the differences in EC50 may result from differences in rCMG2-Fc stability between variants during the 8-h cell incubation period at 37°C. With a less stable variant, the fraction of functional rCMG2-Fc decays over time; because CMG2 and PA binding is reversible (Wigelsworth et al., 2004), this decay results in a higher EC50 than for more stable variants.

The observed binding to PA was the same before and after incubation at 37°C for the APO and Agly variants (**Figures 7A,C**). For the ER variant, the fraction of functional rCMG2-Fc decayed over time, and a significant drop was observed for the overnight incubation sample (**Figure 7B**), which explains the higher EC50 for the ER variant in the TNA (**Figure 5B**). The thermostability result is in agreement with the TNA results, with stability rankings from high to low: APO/Agly, then ER.

*\*Published KD (Wigelsworth et al., 2004) between rCMG2 and PA is included in the last row for comparison.*

#### Molecular Dynamics Simulation of rCMG2-Fc Glycoforms

Molecular dynamics (MD) simulations of the respective predominant glycoform in the three rCMG2-Fc variants were performed to obtain high-resolution structural and dynamical information. The three simulated glycoforms were GnGnXF, MAN8, and Agly. These glycoforms were selected according to the mass spectrometry analysis shown in **Figure 4**. GnGnXF is the predominant glycoform of both the APO and ER variants. Therefore, the MAN8 glycoform, the second most abundant glycan in the ER variant, was simulated to elucidate differences between the APO and ER variants.

#### Macrostructural Analysis

GnGnXF, MAN8, and Agly rCMG2-Fc glycoforms were each independently simulated for 100 ns. **Figure 8** shows the initial and final conformations of all glycoforms from the simulations. The images at *t* = 0 are all aligned by the Fc CH2 domain; the images at *t* = 100 ns are each independently aligned to best illustrate macrostructural orientation. In all simulations, the core structure of the protein remains intact. However, there is significant variability in the macrostructural orientation of the CMG2 and Fc domains in the final structures. The final GnGnXF and Agly structures exhibit significantly contracted linkers relative to the final MAN8 structure, in which the linker is fully extended. All glycoforms retain accessibility of the PA binding site after simulation. This is in agreement with the BLI results (**Figure 6**), where all three variants have similar *k*a and *k*d, and with the functional ELISA (**Figure 7**), where all three variants have control curves with similar absorbance levels.

To quantitatively characterize the macrostructural differences between the GnGnXF, MAN8, and Agly glycoforms, we report the center of mass (COM) distance between the CMG2 and Fc domains in **Figure 9**. Among the glycoforms, MAN8 has the highest COM distance, at around 7.2 nm, with the narrowest spread. GnGnXF and Agly have lower COM distances with wider spreads, roughly centered around 6 nm, with the Agly COM spread being the widest. The significantly higher COM distance of MAN8 relative to GnGnXF and Agly is visually consistent with the final conformations in **Figure 8**. The large spread in Agly rCMG2-Fc COM distance is largely due to refolding of the flexible residues in and near the LNK region. The COM distance temporal profile in **Supplementary Figure S10** displays a gradual increase in Agly COM distance, which tapers off around 80 ns. Thus, the larger spread in Agly COM distance is due to conformational changes during the simulation, not to a single conformation with increased flexibility. The root mean square fluctuation (RMSF) data in **Supplementary Figure S11** show increased Agly RMSF in and around the LNK region during the first 50 ns and a reduction in Agly RMSF during the latter 50 ns, which is consistent with the trend exhibited in the Agly COM distance.

FIGURE 9 | [...] were smoothed with an averaging window of 0.2 nm.
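The COM distance reported above is the distance between the mass-weighted mean positions of two groups of atoms. The following minimal sketch shows the calculation for a single frame, using toy coordinates and masses that are purely illustrative. It ignores periodic boundary handling, and the function names are hypothetical rather than taken from any analysis package used in this work.

```python
import math


def center_of_mass(coords, masses):
    """Mass-weighted mean position of a set of atoms (coords in nm)."""
    total_mass = sum(masses)
    return tuple(
        sum(m * xyz[axis] for xyz, m in zip(coords, masses)) / total_mass
        for axis in range(3)
    )


def com_distance(coords_a, masses_a, coords_b, masses_b):
    """Distance (nm) between the centers of mass of two atom groups."""
    return math.dist(
        center_of_mass(coords_a, masses_a), center_of_mass(coords_b, masses_b)
    )


# Toy two-atom "domains" (illustrative only, not simulation data).
domain_a = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
domain_b = [(7.0, 0.0, 0.0), (7.0, 2.0, 0.0)]
masses = [12.0, 12.0]

distance_nm = com_distance(domain_a, masses, domain_b, masses)
print(f"COM distance: {distance_nm:.3f} nm")
```

Repeating this calculation for every saved frame gives a time series whose histogram corresponds to the kind of distribution described for Figure 9.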

#### Backbone RMSD of Ordered Domains

The backbone root mean square deviation (RMSD) of the ordered domains, referenced to the initial and final conformations of the CMG2 and Fc regions, is shown for all three simulated glycoforms in **Figure 10**. A low RMSD indicates that folding transitions are minor and the protein structure is stable. For each monomer, the ordered domain of the CMG2 region was defined as residues 10–181, while the ordered domains of the Fc region were defined as the CH2 and CH3 regions: residues 210–308 and 315–414, respectively. Each domain's RMSD was fit and referenced independently, and the RMSDs were subsequently averaged to produce the plots in **Figure 10**. Two RMSD profiles were averaged to generate the CMG2 RMSD, and four RMSD profiles were averaged to generate the Fc RMSD for each glycoform.

FIGURE 10 | Backbone RMSD of ordered domains referenced from the initial or final conformations of the CMG2 (top) and Fc (bottom) regions of the GnGnXF, MAN8, and Agly glycoforms.
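The per-domain averaging described above (two CMG2 profiles, or four Fc profiles from the CH2 and CH3 domains of both monomers) can be sketched as follows. This is a simplified illustration: the RMSD function assumes that the coordinates have already been superimposed on the reference, so the fitting step is not shown, and all numbers are toy values rather than simulation data.

```python
import math


def rmsd(coords, reference):
    """RMSD between two equally sized, already superimposed coordinate sets."""
    if len(coords) != len(reference):
        raise ValueError("coordinate sets must have the same number of atoms")
    squared = sum(
        (a - b) ** 2 for p, q in zip(coords, reference) for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(coords))


def average_profiles(profiles):
    """Average several RMSD time series frame by frame."""
    return [sum(frame) / len(frame) for frame in zip(*profiles)]


# Toy example: RMSD of one two-atom frame against its reference.
frame_rmsd = rmsd([(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)], [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)])

# Toy per-domain RMSD profiles (two frames each): one CMG2 domain per monomer.
cmg2_profiles = [[1.0, 1.2], [1.4, 1.0]]
cmg2_rmsd = average_profiles(cmg2_profiles)
print(frame_rmsd, cmg2_rmsd)
```

For the Fc region, the same `average_profiles` call would receive four profiles (CH2 and CH3 of each monomer) instead of two.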

All RMSDs are below 1.6 Å, indicating a generally conserved fold for all the ordered domains of each glycoform. This is further confirmed by the secondary structure data, provided in **Supplementary Figure S12**. In general, the RMSD from the final structure is lower than that of the initial, indicating conformational convergence is progressing throughout the simulations. The low RMSD and conserved secondary structure in all three simulated glycoforms indicate that there were no major refolding events; thus, the reduced activity in the TNA (**Figure 5A**) and the functional ELISA (**Figure 7B**) of the ER variant is likely not due to reduced fold stability. The Fc RMSD is noticeably lower than the CMG2 RMSD for the Agly and GnGnXF glycoforms, indicating minor fold transitions in the CMG2 domains of these glycoforms.

#### Hydrophobic Solvent Accessible Surface Areas

The hydrophobic solvent accessible surface area (SASA) distributions are reported in **Figure 11**. Hydrophobic SASA is defined as SASA that is associated with hydrophobic amino acid residues (ALA, GLY, ILE, LEU, MET, PHE, PRO, and VAL). The MAN8 glycoform has the highest amount of hydrophobic SASA in the simulation, while the Agly glycoform has the lowest. Increased hydrophobic SASA likely yields a greater aggregation propensity (Fink, 1998).

To elucidate which residues contribute most to the hydrophobic SASA difference between MAN8 and Agly, the average per-residue hydrophobic SASA was calculated, and the five residues with the greatest positive difference of MAN8 minus Agly hydrophobic SASA were obtained. These residues were Pro199, Pro201, Leu205, and Pro300 of one monomer, and Leu206 of the other monomer. All of these residues are located in close proximity to the N-terminal region of the Fc domain, shown in **Figure 12**. We see that these residues are highly spread out in the MAN8 structure, moderately spread out in the GnGnXF structure, and closely associating in the Agly structure. The increased SASA in the MAN8 glycoform is consistent with the existence of the high molecular weight band in the ER variant at ~250 kDa in the SDS-PAGE results (**Figure 2B**).

FIGURE 11 | [...] averaging window of 3 nm².
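The residue ranking described above amounts to filtering the per-residue SASA values to the hydrophobic residue types, subtracting the Agly values from the MAN8 values, and keeping the largest positive differences. A minimal sketch follows; the residue identities are borrowed from the text for readability, but every SASA value is an invented placeholder, and the data layout (a dictionary keyed by chain, residue name, and residue number) is an assumption.

```python
HYDROPHOBIC = {"ALA", "GLY", "ILE", "LEU", "MET", "PHE", "PRO", "VAL"}


def top_hydrophobic_sasa_gains(sasa_a, sasa_b, n=5):
    """Rank hydrophobic residues by (sasa_a - sasa_b), largest positive first.

    sasa_a, sasa_b: dicts mapping (chain, resname, resid) -> average SASA (nm^2).
    """
    differences = {
        key: sasa_a[key] - sasa_b[key]
        for key in sasa_a.keys() & sasa_b.keys()
        if key[1] in HYDROPHOBIC
    }
    positive = [(key, diff) for key, diff in differences.items() if diff > 0]
    return sorted(positive, key=lambda item: item[1], reverse=True)[:n]


# Invented placeholder SASA values (nm^2), NOT simulation results.
man8 = {("A", "PRO", 199): 1.2, ("A", "LEU", 205): 0.9,
        ("A", "SER", 203): 2.0, ("B", "LEU", 206): 0.5}
agly = {("A", "PRO", 199): 0.4, ("A", "LEU", 205): 0.6,
        ("A", "SER", 203): 0.1, ("B", "LEU", 206): 0.7}

ranked = top_hydrophobic_sasa_gains(man8, agly)
for (chain, resname, resid), diff in ranked:
    print(f"{resname}{resid} (chain {chain}): +{diff:.2f} nm^2")
```

In this toy input, the serine is excluded because it is not in the hydrophobic set, and the residue whose SASA is lower in MAN8 is excluded because only positive differences are kept.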

#### DISCUSSION

In this paper, experimental and computational techniques were employed to study the effects of N-glycosylation on the expression, structure, function, and stability of the anthrax decoy protein rCMG2-Fc.

[...] (disulfide bonds are indicated with solid gray lines).

#### rCMG2-Fc Expression

N-glycosylation was found to strongly stabilize rCMG2-Fc *in planta* as both APO and ER variants have over two-fold higher expression level than the Agly variant, shown in **Figure 1**. The increase in expression level of glycosylated variants could be attributed to a decreased susceptibility of the APO and ER variants to proteases, where the steric hindrance of oligosaccharides inhibits proteolytic degradation of glycosylated rCMG2-Fc. From a manufacturing standpoint, producing glycosylated rCMG2-Fc would require less than half the production capacity of the aglycosylated form. Thus, when glycosylation is not detrimental, preserving natural N-glycosylation sites can enhance protein production. Alternatively, retaining the aglycosylated protein in the ER can also help improve yield, since the ER has fewer types of proteases than the apoplast (Doran, 2006). Furthermore, unlike the apoplast, the ER has chaperone proteins to provide folding support (Doran, 2006). It is also possible to enhance a glycoprotein's function *via* modification of the N-glycosylation profile during expression. This can be achieved using subcellular targeting and/or the co-expression of glycan-processing enzymes or addition of enzyme inhibitors in the agroinfiltration buffer. For example, high mannose Fc N-glycosylation has been shown to enhance antibody-dependent cell-mediated cytotoxicity (ADCC), which can be achieved by targeting the protein to the ER or addition of mannosidase I inhibitor to the agroinfiltration buffer or cell culture medium (Yu et al., 2012; Xiong et al., 2018; Kommineni et al., 2019). It is worth noting that N-glycosylation of Fc is not strictly required when Fc is fused to the target protein for the purpose of increasing circulation half-life (Souders et al., 2015).
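The statements above about fold differences and production capacity can be checked against the expression levels reported in the Results (578, 430, and 148 mg/kg LFW). The sketch below assumes that production capacity scales inversely with expression level and ignores any differences in extraction or purification yield, which is a simplification.

```python
# Expression levels (mg/kg leaf fresh weight) reported in the Results.
expression_mg_per_kg = {"APO": 578.0, "ER": 430.0, "Agly": 148.0}


def fold_over(reference, levels):
    """Expression of each variant relative to the reference variant."""
    return {name: value / levels[reference] for name, value in levels.items()}


fold = fold_over("Agly", expression_mg_per_kg)

# Relative biomass needed to obtain the same mass of product, assuming capacity
# scales as 1 / expression level (purification yield differences ignored).
relative_biomass = {name: 1.0 / f for name, f in fold.items()}

for name in expression_mg_per_kg:
    print(f"{name}: {fold[name]:.2f}-fold vs. Agly, "
          f"{relative_biomass[name]:.2f}x the Agly biomass")
```

Under these assumptions, both glycosylated variants exceed a two-fold increase over Agly and would need less than half the biomass for the same amount of protein.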

SDS-PAGE and Western blot confirmed that intact rCMG2-Fc was produced, with bands near 50 kDa (**Figure 2A**). There is also a band around 250 kDa in the non-reducing SDS-PAGE for the ER variant (**Figure 2B**), which may correspond to a high molecular weight protein aggregate. The hypothesis of increased aggregation propensity of the ER variant is supported by the hydrophobic SASA predicted from our MD simulations. MAN8 was found to have significantly higher hydrophobic SASA, with the five strongest contributing residues in the N-terminal region of the Fc domain. Protein aggregation might reduce PA binding capacity, as is evident in the functional ELISA (**Figure 7**); this is discussed below in the rCMG2-Fc Function section.

#### rCMG2-Fc Function

The ability of rCMG2-Fc variants to sequester anthrax PA and prevent cell death was assessed using a cell-based TNA. Results are shown in **Figure 5**, where the ER variant has a statistically higher EC50 value than the APO and Agly variants. The possibility of FcγR-dependent toxin neutralization (Verma et al., 2009; Abboud et al., 2010; Ngundi et al., 2010) was ruled out, as the EC50 values did not change upon FcγR blocking. This is likely because previous studies used antibodies or serum against PA that bind to PA but do not necessarily block its binding site for the anthrax cellular receptor CMG2. Thus, when incubated with cells, the antibody-bound PA can still bind to CMG2 and form a prepore, resulting in LF and EF endocytosis. In this situation, Fc and FcγR could form an immune complex that is then sorted and degraded in the lysosome (Abboud et al., 2010). In our experiments, rCMG2-Fc competitively inhibits binding between the cellular receptor CMG2 and PA. This completely eliminates LF and EF internalization.

The binding kinetics between rCMG2-Fc variants and PA were determined by BLI, with results shown in **Figure 6**. The 37°C *K*D values were slightly lower than the 25°C *K*D values, but no appreciable difference in binding kinetics as a function of glycosylation was observed at 25 or 37°C. Considering that the CMG2 domain is linked to Fc through a flexible linker, it is not surprising that the glycosylation of the Fc domain has minimal impact on the binding kinetics of the CMG2 domain with PA. Moreover, the sub-nanomolar affinity reported in this work is consistent with previous work on rCMG2 and PA binding kinetics (Wigelsworth et al., 2004), which is direct evidence that neither the fused Fc domain nor its glycosylation interferes with CMG2/PA binding kinetics. Even though the kinetics of all three variants were unaffected by glycosylation, it is possible that the fraction of functional protein changes over time. The BLI experiments only characterized the interaction kinetics, which are independent of the fraction of functional rCMG2-Fc on the sensor tip.

The hypothesis that the fraction of functional protein at 37°C is glycosylation-dependent was confirmed with the functional rCMG2-Fc ELISA, where the ER variant lost the most activity overnight, consistent with the ER variant having the highest EC50. However, the MD simulation data exhibit high fold stability in all glycoforms, according to the RMSD of ordered domains (**Figure 10**) as well as the secondary structure (**Supplementary Figure S12**). Thus, we hypothesize that the reduction in activity is not due to reduced fold stability. Moreover, MAN8 had the highest hydrophobic SASA among the three simulated glycoforms, indicating a higher aggregation propensity. The residues that contributed the most to the decreased hydrophobic SASA in the simulated Agly relative to MAN8 were located in the N-terminal region of the Fc domain, just after the C-terminal region of the linker (**Figure 12**). This region is also more extended in the MAN8 glycoform, as shown in the final conformation (**Figure 8**) and the COM distance distribution (**Figure 9**). This exposure of hydrophobic residues in the thinly extended N-terminal region of the Fc domain could facilitate aggregation with other rCMG2-Fc molecules or protein fragments. This could explain the reduced activity of the ER variant in the functional ELISA (**Figure 7**) and the 250 kDa band observed in the nonreducing SDS-PAGE gel for the ER variant (**Figure 2B**). Fusion protein aggregation has been observed in the literature for another Fc fusion protein, ALK1-Fc, where a high abundance of the MAN5 glycoform was found in high molecular weight aggregates (Strand et al., 2013). It is worth noting that a recent study on high-mannose type IgG showed a decrease in protection factors of backbone amide nitrogen in the CH2 domain (Fang et al., 2016), which could also contribute to the reduced activity of the ER variant. Meanwhile, Lu et al. found that high-mannose glycans have no detrimental effect on antibody stability and aggregation rate (Lu et al., 2012). The antibodies used in their study were IgG1 and IgG2, which have a molecular weight of ~150 kDa. Since both protein size and structure can affect protein stability, it is not surprising to see diverse stability results for different molecules. Within Fc-fusion proteins, structure can still vary depending on the size and structure of the fusion partner. However, we do expect Fc-fusion proteins with a structure, molecular weight, and conserved glycosylation site similar to those of rCMG2-Fc to exhibit similar behaviors.

The APO and Agly variants had no significant difference in the fraction of functional protein after being incubated overnight at 37°C. This result is in agreement with a previous study in which human IgG1s (with complex N-glycans or aglycosylated) stored at 37°C for 21 days showed no difference in aggregation or fragmentation (Hristodorov et al., 2013), suggesting that the absence of glycans had no major impact on stability at physiological temperature. A previous study of rCMG2-Fc using a different linker produced variants with oligomannose glycans or no glycan that had similar TNA EC50 values (Wycoff et al., 2011), while our data indicate a significant increase in EC50 for the ER (oligomannose) variant over the APO and Agly variants. The linkers used in Wycoff et al.'s (2011) study and in this study were two serine residues followed by the upper hinge of IgG1 (SSEPKSCDKTHT) and IgG2 (SSERKCCVE), respectively. These linkers differ in both length and sequence. This study utilizes the IgG2 hinge to enhance protein dimerization through its two additional cysteine residues for inter-hinge disulfide bond formation. Since both linker sequence and length can affect the stability and function of fusion proteins (Chen et al., 2013; Lee et al., 2013), it is not surprising to see a difference in toxin neutralization ability.

In this study, the expressed N-glycosylation variants were not found to functionally affect rCMG2-Fc/PA binding. However, protein expression, integrity, and thermostability were affected by glycosylation. The APO variant showed the best overall performance, with a high expression level, high protein integrity, and high thermostability. However, the debate on plant complex-type N-glycosylation is ongoing. Plant complex N-glycans have been shown to be immunogenic through the detection of anti-plant-glycoepitope antibodies in human sera (Gomord et al., 2005), although no adverse effects were observed when plant-made pharmaceuticals (PMPs) with complex N-glycans were applied to patients with IgE against plant glycoforms (Ma et al., 1998; Zeitlin et al., 1998; Mari et al., 2008). In addition, the plant β1,2-xylose and α1,3-fucose moieties can potentially induce rapid clearance from circulation due to the presence of IgE against those epitopes. Similarly, variants containing mannose-type N-glycans can have a shorter circulation half-life than the Agly variant, due to the presence of mannose receptor in serum (Goetze et al., 2011). However, the shorter half-life of the glycosylated rCMG2-Fc variants (APO and ER) can become an advantage when they are used as a postexposure treatment (dosing is not limited), resulting in fast blood clearance of the decoy-toxin complex. For prophylaxis, the Agly variant is likely the best option, considering its longer circulation half-life compared with the other two variants.

#### Summary and Future Perspectives

Glycosylation variants of an anthrax decoy protein rCMG2-Fc were successfully produced in *Nicotiana benthamiana* plants with distinct N-glycosylation patterns. The expression levels were in the range of 148–578 mg/kg LFW. The N-glycosylation profiles, characterized by mass spectrometry, were 50% high-mannose type for the ER variant and 99% complex-type for the APO variant. The rCMG2-Fc variants were all functional with sub-nanomolar dissociation rate constants regardless of N-glycosylation pattern. The higher EC50 of the ER variant compared with the APO and Agly variants was likely due to the loss of activity during the 37°C incubation condition used in the TNA assay. The loss of activity could be explained by increased aggregation of the ER variant, consistent with the SDS-PAGE and MD simulation results. To better assess the effects of N-glycosylation on protein properties, *in vitro* enzymatic glycan modification can be employed to express more uniform glycoforms. This avoids glycan heterogeneity, allowing a more accurate comparison between experimental and MD simulation data. Moreover, other proteins, especially Fc-fusion proteins, can be studied using the methodology provided in this work to assess the applicability of our findings and to optimize glycoprotein therapeutic design.

#### DATA AVAILABILITY

All datasets generated for this study are included in the manuscript and/or the **Supplementary Files**.

## AUTHOR CONTRIBUTIONS

YX conceptualized, led, designed, and performed most of the experiments and wrote and edited the manuscript. KK and AB contributed equally, designed and performed experiments, and wrote and edited the manuscript. QL performed the N-glycan analysis and wrote a part of the manuscript. VK, AD, CL, RF, KM, and SN designed experiments, reviewed data, results, and interpretations, and edited the manuscript. All authors read, revised, and approved the manuscript.

### FUNDING

This work was supported by the Defense Threat Reduction Agency (HDTRA1-15-1-0054). The funding sponsors had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, and in the decision to publish the results. Its contents are solely the responsibility of the authors and do not necessarily represent the official views of DTRA.

# ACKNOWLEDGMENTS

We would like to acknowledge Planet Biotechnology, Inc. for providing the CMG2-Fc standard. We would also like to acknowledge BEI Resources for providing the protective antigen and lethal factor of anthrax.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/article/10.3389/fpls.2019.00768/full#supplementary-material


**Conflict of Interest Statement:** The data analyses, results presented, and outcomes of this study are personal views of independent authors (YX, KK, AB, QL, VK, AMD, CBL, RF, KAM and SN). The outcomes do not reflect any financial or commercial interest of either the University of California, Davis or iBio CDMO LLC, Bryan, TX.

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Xiong, Karuppanan, Bernardi, Li, Kommineni, Dandekar, Lebrilla, Faller, McDonald and Nandi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Non-target Effects of Hyperthermostable α-Amylase Transgenic Nicotiana tabacum in the Laboratory and the Field

Ian Melville Scott\*, Hong Zhu, Katherine Schieck, Amanda Follick, L. Bruce Reynolds and Rima Menassa

London Research and Development Centre, Agriculture and Agri-Food Canada, London, ON, Canada

#### Edited by:

Anneli Ritala, VTT Technical Research Centre of Finland Ltd., Finland

#### Reviewed by:

Ralf Alexander Wilhelm, Julius Kühn-Institut, Germany; Vinay Kumar, Central University of Punjab, India; Kerstin Sabine Schmidt, BioMath GmbH, Germany

> \*Correspondence: Ian Melville Scott Ian.Scott2@canada.ca; ian.scott@agr.gc.ca

#### Specialty section:

This article was submitted to Plant Biotechnology, a section of the journal Frontiers in Plant Science

Received: 31 August 2018; Accepted: 20 June 2019; Published: 09 July 2019

#### Citation:

Scott IM, Zhu H, Schieck K, Follick A, Reynolds LB and Menassa R (2019) Non-target Effects of Hyperthermostable α-Amylase Transgenic Nicotiana tabacum in the Laboratory and the Field. Front. Plant Sci. 10:878. doi: 10.3389/fpls.2019.00878

Thermostable α-amylases are important enzymes used in many industrial processes. The expression of recombinant Pyrococcus furiosus α-amylase (PFA) in Nicotiana tabacum has led to the accumulation of high levels of recombinant protein in transgenic plants. The initial steps to registering the transgenic tobacco at a commercial production scale and growing it in the field require a risk assessment of potential non-target effects. The objective of this study was to assess the effect of feeding on transgenic tobacco on 2 indigenous insect species commonly associated with wild and commercial tobacco, using plants grown and evaluated under laboratory and field conditions. The highest levels of PFA ranged from 1.3 to 2.7 g/kg leaf fresh weight produced in the field-grown cultivars Con Havana and Little Crittenden, respectively. These two cultivars also had the highest nicotine (ranging from 4.6 to 10.9 mg/g), but there was little to no negative effect for either tobacco hornworm Manduca sexta L. or aphid Myzus nicotianae (Blackman). Both laboratory and field trials determined no short term (5 days) decrease in the survival or fecundity of the tobacco aphid after feeding on PFA transgenic tobacco compared to non-transgenic plants. In the field, tobacco hornworm larvae showed no differences in survival, final larval weights or development time to adult stage between transgenic lines of four cultivars and their corresponding wild type controls. Laboratory studies confirmed the field trial results, indicating the low risk association of PFA expressed in tobacco leaves with tobacco hornworms and aphids that would feed on the transgenic plants.

Keywords: Pyrococcus furiosus α-amylase, Nicotiana tabacum, risk assessment, non-target effects, tobacco aphid Myzus nicotianae, tobacco hornworm Manduca sexta, molecular farming, genetically modified plant

# INTRODUCTION

Evaluating the impact of transgenic or genetically modified (GM) crops on the environment poses unique challenges to the traditional risk assessment process. As more acreage is planted with transgenic crops, there will be further interaction of these plants with insects, creating new environmental and pest management concerns. The acreage growing transgenic crops increased 100-fold over the decade between 1996 and 2006, and made up 134 MHa in 25 countries by 2009 (Ahmad et al., 2012), with GM soybeans grown on one of the largest cultivated areas (Imura et al., 2010). A major emphasis of genetic engineering has been to improve crop yield by overcoming abiotic stresses, for example heavy metals, salt, cold and drought. Plants have also been engineered to overcome biotic stresses, to provide defenses against insects, fungi, bacteria and other diseases. The most common insecticidal trait is Bacillus thuringiensis (Bt) crystal protein endotoxin that is toxic to lepidopterans (caterpillars), dipterans (flies) and coleopterans (beetles). Other proteins with insecticidal activity are under development, including lectins, protease inhibitors, antibodies, wasp and spider toxins, microbial insecticides and insect peptide hormones (Ahmad et al., 2012). In the United States, stringent rules and regulations have been developed by the National Institutes of Health (NIH), the Department of Agriculture (USDA), the Animal and Plant Health Inspection Service (APHIS), the Environmental Protection Agency (EPA) and the Food and Drug Administration (FDA) that provide guidelines for testing and commercial release of GM crops. The classification of species that may be affected by GM crops includes: (1) beneficial, pollinators, and natural enemies; (2) non-target herbivores; (3) soil organisms; (4) species of conservation concern and (5) species of local biodiversity importance (Andow and Zwahlen, 2006). In the European Union, environmental risk assessment (ERA) of effects on non-target organisms has been adopted for those countries wanting to introduce GM crops for cultivation purposes.
Two new guidance documents introduce initiatives that examine ecological principles underlying ERA, such as the requirement of evaluating the whole GM plant, as well as the introduced traits, since the genetic modification may cause unintended changes to the plant's phenotype that in turn might affect non-target organisms (Arpaia et al., 2017). This is particularly important for new GM crops expressing new traits such as novel proteins as there can be altered plant metabolic pathways (e.g., starch production, oil composition, semiochemicals, etc.) that could cause possible secondary or indirect effects.

The sales from enzymes for biotechnology in the United States increased from \$1.3 billion in 2002 to US \$5.1 billion in 2009 (Sarrouh, 2012). Fermentation was used traditionally to produce enzymes, but this has been improved with recombinant technology so that microorganisms can produce greater amounts of more active enzymes. Protein engineering can also modify the properties of enzymes (Demain and Vaishnav, 2009). This has led to the successful increased production of thermostable amylase, among other important enzymes (Haki and Rakshit, 2003). To solve the problems associated with recombinant enzymes expressed in E. coli, such as inclusion bodies and protein folding issues, yeasts, fungi, plants and animals offer alternative hosts.

Plant expression systems have many positive attributes, but there has been a strong reluctance among regulators to permit agricultural-scale cultivation of transgenic plants expressing foreign proteins (Twyman et al., 2003). Tobacco remains one of the strongest candidates for the commercial production of recombinant proteins, and arguably has the longest history as a successful system for molecular farming (Tremblay et al., 2010). Biosafety and ethical concerns are also satisfied because tobacco is neither a feed nor food crop, eliminating the risk that transgenic tobacco will contaminate animal or human food chains. Of particular interest for this type of study are plant expression systems that produce enzymes, for example amylases, due to the current ability to produce high enzyme concentrations in plants using recombinant technology.

Amylases are important enzymes of great significance in present-day biotechnology. Amylase applications include starch saccharification in the textile, food, brewing, and distilling industries. Since α-amylases need to be active at the high temperatures of gelatinization (100–110°C) and liquefaction (80–90°C) to economize processes, there has been much interest in identifying novel thermostable amylases. The advantages of using thermostable amylases in industrial processes include the decreased cost of external cooling, a better solubility of substrates, and a lower viscosity (de Souza and de Oliveira Magalhaes, 2010). The development of saccharifying amylolytic enzymes such as α-amylase for use in production processes at higher temperatures requires a new process design and improved knowledge of thermophilic microorganisms. One of the most studied thermophilic microorganisms, Pyrococcus furiosus, is an anaerobic marine heterotrophic archaeon with an optimal growth temperature of 100°C. Alpha-amylase activity has been reported in the cell homogenate and growth medium of P. furiosus, and recombinant P. furiosus α-amylase (PFA) was expressed and purified from E. coli (Dong et al., 1997).

The expression and yield of recombinant PFA in a variety of Nicotiana hosts was investigated by transient expression. When plant leaves were infiltrated with Agrobacterium carrying codon-optimized PFA DNA sequence, the expression level of PFA was observed to vary among the hosts (Conley et al., 2011). This result suggested that by testing different tobacco cultivars, the accumulation level of recombinant protein could be largely increased. Stable transgenic plants were created with 16 Nicotiana cultivars and a total of 438 independent transgenic plants were generated and evaluated (Conley et al., 2011). In Canada, the Canadian Food Inspection Agency (CFIA) regulates the release of transgenic crops. Therefore, two CFIA-approved confined field trials were conducted over two growing seasons to evaluate PFA transgenic plants for crop growth, leaf yield, soluble protein yield, stability of gene expression, the yield of recombinant PFA and alkaloid concentration under field conditions (Zhu et al., 2017). The highest level of PFA measured was 3.4 g/kg fresh weight in field tobacco, which was purified using both chromatography and heating methods. This study is an investigation of the non-target effects of PFA transgenic tobacco that involved both greenhouse and field-grown plants that were evaluated under laboratory and field conditions. Two indigenous insect species commonly associated with wild and commercial tobacco, the tobacco aphid Myzus nicotianae (Blackman) (Hemiptera: Aphididae) and the tobacco hornworm Manduca sexta L. (Lepidoptera: Sphingidae) were selected to test non-target effects of recombinant PFA on the population of either insect through inhibiting or promoting their growth and development.

# MATERIALS AND METHODS

# Tobacco Plants


Four cultivars and lines of T1 GM tobacco containing a single insertion of the pfa transgene (GenBank accession number AF001268) were chosen based on the level of α-amylase expression and nicotine content previously determined (Conley et al., 2011) to provide 4 different combinations as follows: high α-amylase, high nicotine (Con Havana 38 line 7F8); low α-amylase, high nicotine (Little Crittenden, line 9F28); high α-amylase, low nicotine (81V9, line 10F18); low α-amylase, low nicotine (TI95, line 3F5). As nicotine is one of the most toxic secondary metabolites in tobacco, it was predicted that the combination of nicotine and the introduced amylase protein might lead to negative effects on herbivores feeding on the transgenic tobacco. Non-transgenic lines of the same cultivars were grown for comparison with the transgenic lines. The pCaMterX plant expression vector (Harris and Gleddie, 2001) was used for PFA expression. The double enhanced 35S promoter and Nos terminator were the control elements driving gene expression. Tobacco secretory signal peptide Pr1b targeted the protein to the secretory pathway and the KDEL tag retrieved PFA to the ER (Zhu et al., 2017). Activity of PFA was determined by zymography (Zhu et al., 2017).

#### Tobacco Leaf Alkaloid Analyses

Alkaloids were extracted and nicotine quantified using samples from tobacco leaves of each cultivar and line. The leaves were sampled when the plants were green with no senescing leaves; all the leaves were green and healthy (**Supplementary Figures S1A–E**). Freeze-dried samples were ground in a Wiley mill, the ground material was passed through a 2 mm screen, and samples were extracted and analyzed by GC-MS using a previously developed method (Kaldis et al., 2013). Briefly, to each 1.0 g sample of ground tobacco, 10 mL of distilled water, 5 mL of dichloromethane (DCM), 5 mL of aqueous 10% sodium hydroxide and 5 mL of DCM containing the internal standard, anethole (Sigma-Aldrich, St. Louis, MO, United States), were added, and the mixture was shaken for 10 min and centrifuged. An aliquot of the DCM layer was then filtered and analyzed by GC-MS (Hewlett-Packard 5890 Series II GC and 5971A MS detector). A DB-5 capillary column was used (J&W, 60 m × 0.25 mm, 0.25 µm film thickness) (Agilent Technologies, Santa Clara, CA, United States) with the following conditions: 220°C injector temperature; 2.4 mL/min helium flow rate; initial column temperature 50°C for 0.5 min; increased at 5°C/min to 125°C; increased at 2°C/min to 155°C; held for 8 min; increased at 25°C/min to 260°C and held for 8 min. The MS transfer line temperature was 290°C. A selected ion of m/z 162 was used for the nicotine analyses with nicotine standards (Sigma-Aldrich, St. Louis, MO, United States).
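As a worked check of the column temperature program described above, the following minimal Python sketch adds up the hold and ramp segments to estimate the oven time per injection. The step list is transcribed from the text; the function name `run_time_minutes` and the tuple layout are illustrative choices rather than part of the published method, and the estimate ignores any cool-down or equilibration time between injections.

```python
# Oven program transcribed from the method text. Each step is either a hold
# (minutes at constant temperature) or a ramp (rate in degC/min, target in degC).
INITIAL_TEMP_C = 50.0

PROGRAM = [
    ("hold", 0.5),          # 50 degC for 0.5 min
    ("ramp", 5.0, 125.0),   # 5 degC/min to 125 degC
    ("ramp", 2.0, 155.0),   # 2 degC/min to 155 degC
    ("hold", 8.0),          # held for 8 min
    ("ramp", 25.0, 260.0),  # 25 degC/min to 260 degC
    ("hold", 8.0),          # held for 8 min
]


def run_time_minutes(initial_temp_c, program):
    """Return the total duration of a hold/ramp oven program in minutes."""
    temp = initial_temp_c
    total = 0.0
    for step in program:
        if step[0] == "hold":
            total += step[1]
        else:
            _, rate, target = step
            total += abs(target - temp) / rate  # time needed to reach the target
            temp = target
    return total


total = run_time_minutes(INITIAL_TEMP_C, PROGRAM)
print(f"Total oven program time: {total:.1f} min")
```

Under these assumptions the segments sum to 0.5 + 15 + 15 + 8 + 4.2 + 8 = 50.7 min of oven time per sample.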

#### Insects

Tobacco hornworm M. sexta were purchased as eggs from Carolina Biological (Burlington, NC, United States). The eggs were placed on leaves of the tobacco lines on which the hornworm larvae would be fed over the course of the laboratory and field trials. Hornworms were kept in an insectary at 25 ± 2°C, 50 ± 5% RH and 16:8 L:D. Tobacco aphid M. nicotianae, originally collected from field tobacco, were maintained as a colony on commercial cultivars of tobacco grown in growth cabinets at 25 ± 2°C, 50 ± 5% RH and 16:8 L:D.

## Laboratory Trials

#### Hornworm Bioassays on Whole Plants

All tobacco cultivars and lines used in the insect lab trials were started from seed, grown in the greenhouse for up to 30 days, and used in the laboratory trials as required. Hornworm larvae hatched from eggs onto leaves of the GM and non-transgenic (NGM) tobacco cultivars and fed until the 2nd instar. Ten 2nd instar hornworm larvae per plant were held on individual leaves using a mesh bag (Delnet® Apertured Film pollination bags, 10″ × 12″) tied closed around the stem with a twist-tie close to the main stalk. Larvae were moved to a fresh leaf on the same plant when they had consumed most of the leaf. When larvae reached the size at which they were consuming more than 1 leaf per day, they were moved to a plastic tub (32 × 10 × 17 cm) half to three quarters full of potting soil and fed with fresh tobacco leaves from the same cultivar until they pupated in the soil. The weight of larvae was taken on days 0, 5, 10 and each day prior to pupating. The number of days required to pupate and the date of adult eclosion were recorded. Two replicate trials were performed under growth cabinet conditions of 25 ± 2°C, 50 ± 5% RH and 16:8 L:D.

# Field Trial

The tobacco plants were grown as described by Zhu et al. (2017). The 4 cultivars, each with a GM and a corresponding NGM line, were grown in the field. One and a half to 2 months after transplanting of 6-week-old seedlings, 3 aphid trials of 5 days duration were performed in the field plots. Each trial was set up in a separate block where 4 NGM and 4 GM plants of each cultivar were enclosed by individual wire-framed, mesh-covered cages. Ten aphid adults less than 1 week old were held on each of the top 3rd, 5th, and 7th leaves of each plant using the mesh bags described above. After 5 days, the leaves were cut off the plant at the stalk, and the aphids on each leaf were counted. Plants and aphids were then autoclaved and disposed of in the solid waste.

Two months after planting, a hornworm field trial was set up with 10 2nd instar larvae individually caged per leaf on each plant for 4 NGM and 4 GM plants and enclosed as previously described. After 7 days, the mesh bags with enclosed leaf and larva were removed, and the survival and weight of each were assessed. In the lab, larvae were then placed in individual soil tubs and fed leaf material removed from field plot plants in the same cultivar that they had fed on while in the field. The final larval weight, the number of days as pupae, and the date of adult eclosion were recorded. The hornworm field trials were replicated twice in separate blocks. Plants and hornworms were autoclaved and disposed of in the solid waste.

## Statistics


The nicotine and PFA concentrations from greenhouse-grown and field-grown tobacco plants were compared across cultivars and lines using a two-tailed T-test (Microsoft Excel 2016).

Survival curves for tobacco hornworm lab and field trials were calculated by the Kaplan-Meier method with comparisons performed based on the log-rank test using IBM SPSS Statistics 20.0 (IBM Corp., United States). ANOVA and unprotected Tukey tests were used to determine if aphid fecundity, and the weight gain and development time of hornworm larvae fed on transgenic (GM) tobacco lines, were significantly different from those fed on non-transgenic (NGM) plants (2001; SAS Institute, Cary, NC, United States). Three-way ANOVA and Tukey tests were used to determine if there was an effect of trial in the cases where two trials were used to assess aphid fecundity and hornworm development time and growth on field-grown tobacco. If the interaction between trial x line was significant (P < 0.05), then the trials were analyzed separately. A two-way ANOVA with PROC MIXED was used to determine if there were main effect interactions in the experiments where there was a random plant effect: aphid fecundity over 5 days on field-grown tobacco; tobacco hornworm larval day 5 weights on greenhouse-grown tobacco; and tobacco hornworm larval day 7 weights on field-grown tobacco, as the insects remained on different leaves of the same plant during the feeding period. All other development and growth periods were analyzed by two-way ANOVA with PROC GLM as larvae had been removed from a single plant, held in a soil tub, and were fed leaves from multiple plants of the same cultivar and GM/NGM type until they reached the prepupal stage.
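To make the survival analysis named above concrete, the following Python sketch implements a textbook Kaplan-Meier estimator and a two-group log-rank test using only the standard library. It is an illustration of the general technique, not a reproduction of the SPSS analysis used in this study; the function names and the small GM/NGM data set are hypothetical and were invented for the example.

```python
import math


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times  -- day on which each larva died or was censored
    events -- 1 if death was observed, 0 if censored (e.g. alive at the end)
    """
    survival = 1.0
    curve = []
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        survival *= 1.0 - deaths / at_risk
        curve.append((t, survival))
    return curve


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi_square, p_value) with 1 df."""
    pooled = list(zip(times_a, events_a)) + list(zip(times_b, events_b))
    observed_minus_expected = 0.0
    variance = 0.0
    for t in sorted({t for t, e in pooled if e}):
        n_a = sum(1 for u in times_a if u >= t)   # at risk in group A
        n_b = sum(1 for u in times_b if u >= t)   # at risk in group B
        n = n_a + n_b
        d_a = sum(1 for u, e in zip(times_a, events_a) if u == t and e)
        d = sum(1 for u, e in pooled if u == t and e)
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0.0:
        return 0.0, 1.0
    chi_square = observed_minus_expected ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    return chi_square, p_value


# Hypothetical data: day of death (event = 1) or last observation (event = 0).
gm_times, gm_events = [3, 5, 8, 12, 12, 12], [1, 1, 1, 0, 0, 0]
ngm_times, ngm_events = [4, 6, 9, 12, 12, 12], [1, 1, 1, 0, 0, 0]

gm_curve = kaplan_meier(gm_times, gm_events)
chi_square, p_value = logrank_test(gm_times, gm_events, ngm_times, ngm_events)
print(gm_curve)
print(f"log-rank chi-square = {chi_square:.3f}, P = {p_value:.3f}")
```

A P value above 0.05 from such a test would correspond to the "no significant difference" outcome reported for GM versus NGM plants.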

#### RESULTS

The cultivars and lines selected for the study were based on PFA and alkaloid levels measured in earlier research (Conley et al., 2011). When grown in the field, the selected lines showed a similar trend in alkaloid levels (**Table 1**), but there were differences in the low and high PFA levels (**Table 2**) (Zhu et al., 2017), although the numbers were not exactly the same as reported in Conley et al. (2011). The PFA levels in the greenhouse-grown plants were not measured; however, the transgenic lines were homozygous and reflect the same relative protein levels as those grown in the field and reported in Zhu et al. (2017).

#### The Level of Alkaloids in Tobacco

In greenhouse-grown tobacco, the alkaloid levels were consistent between the 2 sets of plants: C. Havana and L. Crittenden consistently had higher alkaloid levels than TI95 and 81V9 (**Table 3**). The same pattern held true for field-grown tobacco, and the alkaloid levels were similar between GM and NGM plants within the same cultivar (**Table 1**).



The concentration of nicotine is presented as the mean ± SE of four field plot replicates. Mean concentrations of GM lines with the same upper case letter are not statistically different (Two-tailed T-test, P > 0.05). Mean concentrations of NGM lines with the same lower case letter are not statistically different (Two-tailed T-test, P > 0.05). An asterisk (<sup>∗</sup>) indicates there was a significant difference between nicotine concentration for 2 NGM lines between first and second harvest (Two-tailed T-test, P < 0.05).

TABLE 2 | The concentration of PFA (g/kg fresh tobacco leaf) from the field trial.


The concentration of PFA is presented as the mean ± STDEV of measures from 20 individual plants. Data was summarized from Zhu et al. (2017).

TABLE 3 | Nicotine concentration (mg/g) of greenhouse-grown tobacco plants.


The concentration of nicotine is presented as the mean ± SE of measures from five individual plants. Mean concentrations with the same upper case letter are not statistically different (Two-tailed T-test, P > 0.05).

# Effect of Recombinant PFA on Aphid Survival and Fecundity

To determine if PFA has an effect on aphid survival and fecundity, a trial was conducted where GM and NGM tobacco lines grown in the field were used to rear tobacco aphids. This trial indicated that there was no short term (5 days) effect on the survival or fecundity of the tobacco aphid after feeding on PFA transgenic tobacco compared to the NGM counterpart (Two-way ANOVA: main effects (PROC MIXED) interaction; tobacco line x type; P = 0.3346) (**Supplementary Table S1**), although in one case the NGM C. Havana had significantly fewer aphids than NGM L. Crittenden (Tukey test, P < 0.0001) (**Figure 1**).

# Effect of Recombinant PFA on Tobacco Hornworm

#### Survival

To determine if PFA or any other modification introduced during the transformation procedure has any effect on survival of the tobacco hornworm, survivorship assays were conducted on GM and NGM tobacco where the development and survival of hornworm larvae were followed from 2nd instar to the adult stage. There was no significant difference between GM and NGM tobacco (P > 0.05) in hornworm survival from larvae reared to the adult stage on greenhouse-grown plants (**Figures 2A–D**).

In the field, 2nd instar hornworm larvae were caged for 7 days on GM and NGM tobacco plants grown in the field and then transferred to the growth chamber and fed leaves cut from the same field grown plants until they reached the pre-pupal stage. Hornworm survival was not significantly different (P > 0.05) between the GM and NGM plant lines (**Figures 2E–H**).

#### Development Time

Other than survival, the length of time a tobacco hornworm larva takes to go through its various life cycle stages can be indicative of a positive or negative effect of PFA.

The number of days for the larvae on each transgenic cultivar to reach the pre-pupal and adult stage was no different between cultivars (Two-way ANOVA: main effects (PROC GLM) interaction: tobacco line x type; P = 0.7417 and P = 0.3827, respectively) for the greenhouse-grown tobacco (**Supplementary Table S2**). However, the development time to reach the pupal stage and the adult stage was significantly shorter (Tukey test, P < 0.05) for larvae fed on GM than on the NGM lines of the same cultivar (**Figure 3A**). The GM and NGM plants were grown and tested under identical environmental conditions but at different times (3 months apart), which may have contributed to the observed differences. Analysis of covariance with time as the covariate could not be tested in this case as the treatments were different during each trial (NGM lines in the first trial and GM lines in the second). A two-way ANOVA was applied based on two types of plants (GM vs. NGM) and 4 cultivars. In contrast, there was no difference in the length of time for the larvae to reach the pre-pupal stage or the time until adult emergence (Two-way ANOVA: main effects (PROC GLM) interaction: tobacco line x type; P = 0.7417 and P = 0.3827, respectively) (**Supplementary Table S3**) for all larvae on field-grown tobacco cultivars in both trials (**Figure 3B**). As the GM and NGM plants were tested at the same time under field conditions, these results are more illustrative of the lack of effect GM tobacco cultivars have on hornworm development time. These results also reinforce the importance of using appropriate controls and conducting all experiments simultaneously, as the varying physiological conditions of the plants and/or the insects may influence experimental outcomes.
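The structure of a two-way ANOVA with a line × type interaction term can be sketched as follows for a balanced, fixed-effects design with 4 cultivars and 2 plant types. This is a simplified illustration only: the analyses in this study were run in SAS (PROC GLM and PROC MIXED, the latter with a random plant effect that is not modelled here), the development times below are hypothetical values invented for the example, and the sketch stops at the F statistic rather than computing a P value.

```python
from statistics import mean


def two_way_anova(cells):
    """Sums of squares for a balanced two-factor design with replication.

    cells[i][j] is the list of observations for level i of factor A
    (e.g. cultivar) and level j of factor B (e.g. GM vs. NGM).
    """
    a, b, n = len(cells), len(cells[0]), len(cells[0][0])
    flat = [y for row in cells for cell in row for y in cell]
    grand = mean(flat)
    mean_a = [mean(y for cell in row for y in cell) for row in cells]
    mean_b = [mean(y for row in cells for y in row[j]) for j in range(b)]
    cell_mean = [[mean(cell) for cell in row] for row in cells]

    ss_a = b * n * sum((m - grand) ** 2 for m in mean_a)
    ss_b = a * n * sum((m - grand) ** 2 for m in mean_b)
    ss_ab = n * sum(
        (cell_mean[i][j] - mean_a[i] - mean_b[j] + grand) ** 2
        for i in range(a) for j in range(b)
    )
    ss_error = sum(
        (y - cell_mean[i][j]) ** 2
        for i in range(a) for j in range(b) for y in cells[i][j]
    )
    df_ab, df_error = (a - 1) * (b - 1), a * b * (n - 1)
    return {
        "ss_a": ss_a, "ss_b": ss_b, "ss_ab": ss_ab, "ss_error": ss_error,
        "ss_total": sum((y - grand) ** 2 for y in flat),
        "df_ab": df_ab, "df_error": df_error,
        "f_interaction": (ss_ab / df_ab) / (ss_error / df_error),
    }


# Hypothetical days to the pre-pupal stage: 4 cultivars x (GM, NGM) x 3 larvae.
days = [
    [[21, 22, 23], [22, 23, 24]],
    [[20, 22, 21], [21, 21, 23]],
    [[23, 24, 22], [22, 24, 25]],
    [[21, 20, 22], [22, 21, 23]],
]

result = two_way_anova(days)
print(f"F(interaction) = {result['f_interaction']:.3f} "
      f"on {result['df_ab']} and {result['df_error']} df")
```

The interaction F statistic would then be compared with an F distribution on the stated degrees of freedom; a small value indicates that the GM versus NGM difference does not depend on cultivar.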

#### Growth

A third factor that can inform us on any effects of PFA is weight gain of hornworm larvae while feeding on GM and NGM plants. To determine whether growth on the GM lines with both higher nicotine and PFA would affect survival to the pupal and adult stages, the hornworm larvae were held on the plants over the entire larval period.

For 2nd instar larvae fed on greenhouse plants until the pre-pupal stage, there was no significant main effect interaction at Day 5 for larvae fed on GM compared to NGM cultivars (Two-way ANOVA: main effect (PROC MIXED) interaction; tobacco line x type; P = 0.1297) (**Supplementary Table S4**); however, larvae fed on GM TI95, L. Crittenden and 81V9 had significantly greater weights than those fed on the corresponding NGM plants (Tukey test, P < 0.05) (**Figure 4A**). Similarly, on Day 10 there was no significant main effect interaction for larvae fed on GM compared to NGM plants (Two-way ANOVA: main effect (PROC GLM) interaction; tobacco line x type; P = 0.2113) for each cultivar (**Figure 4A**). However, all larvae fed on GM plants had significantly greater weights than those fed on the corresponding NGM plants (Tukey test, P < 0.05). By the pre-pupal stage, larvae feeding on NGM plants had caught up in weight, and only hornworms that fed on GM 81V9 had a significantly greater final weight (Two-way ANOVA: main effect (PROC GLM) interaction; tobacco line x type; P = 0.1397) than those on the NGM 81V9 (Tukey test; P < 0.05) (**Figure 4B**). These results indicate that PFA has a somewhat positive effect on weight gain (**Figure 4A**) and development time (**Figure 3A**) of hornworm larvae fed greenhouse-grown tobacco. This could be due to conversion of starch to sugar, making the plants more palatable to larvae, but this effect seems to be transient, since by the time they stop feeding, most larvae are in the same weight range whether fed GM or NGM plants.

In the field, there was a significant main effect interaction of trial x line (P = 0.0403) (**Supplementary Table S5**); therefore, the effects of the tobacco line and type were analyzed separately for the 2 trials. There were no significant larval weight differences observed at 7 days between GM and corresponding NGM cultivars in Trial 1 (Two-way ANOVA: main effect (PROC MIXED) interaction; tobacco line x type; P = 0.2260) (**Figure 5A**), but in Trial 2 there were weight differences between GM and NGM cultivars (Two-way ANOVA: main effect (PROC MIXED) interaction; tobacco line x type; P = 0.0441) for L. Crittenden and 81V9 (Tukey test, P < 0.05). Larvae fed field plants from Day 7 until they reached the pre-pupal stage under growth chamber conditions showed no differences in final larval weights among the same GM and NGM cultivars in both Trial 1 and 2 (Two-way ANOVA: main effects (PROC GLM) interaction: line x type; P = 0.6805 and P = 0.4650, respectively) (**Figure 5B**).

FIGURE 2 | PFA has no effect on tobacco hornworm survival. Survivorship of hornworm from larvae to adult stage on GM vs. NGM tobacco. (A–D) Greenhouse conditions and (E–H) field conditions (N = 20 larvae/cultivar). No significant differences between GM and NGM plants were found (Kaplan-Meier, P > 0.05).

#### DISCUSSION

In general, the combination of high nicotine and high PFA was not found to increase development time or reduce survival for hornworms fed in situ on field leaves for 7 days and then on field-grown leaves until the larvae reached the pre-pupal and adult stages. Significant differences in larval weights were observed at the mid-larval stage, but under both laboratory and field conditions these differences were no longer observed by the time the larvae reached the pre-pupal stage.

Neither insect species was negatively affected by the 4 different PFA transgenic lines, even though differences were observed at certain points in the hornworm life stages. Laboratory studies confirmed what was seen in the field and indicate that short-term bioassays provided appropriate predictions of the response of tobacco hornworms and aphids to recombinant proteins. Based on the current levels of PFA produced in the higher nicotine cultivars, little negative impact should be expected within 1 generation of tobacco hornworm or aphid exposed to GM tobacco production. Since nicotine and alkaloid levels can vary with tobacco leaf age (Zhang et al., 2018), insects in this study would be exposed to different levels depending on where they feed on the plant. For this reason, 3 or more leaves on each plant were selected to position the aphids, and in the case of the hornworms, 10 leaves per plant were used in order not to bias the exposure. In addition, all tobacco plants sampled within each trial were the same age, and leaves were removed, stacked on top of each other, and a core taken that would include a sample from all leaves combined. Duplicate samples were obtained from each plant for the alkaloid analyses.

The selection of non-target arthropods (NTAs) is critical to be able to test the hypothesis that GE crops, arthropod-active or not, do not cause adverse effects to valued NTAs under field exposures (Romeis et al., 2013). In this study the arthropods meet the selection criteria since tobacco hornworm and aphids are, based on their biology and mode of feeding, the most likely to ingest the PFA and will be exposed to high levels of the protein for an extended duration.

Tobacco hornworm can consume tobacco leaves in their entirety thus ensuring that any produced transgenic proteins would be ingested during feeding and processed within the insect's digestive tract. Any adverse effects from α-amylase recombinant protein ingestion could manifest themselves in the form of decreased survival, reduced larval weights or increased development times to pupal and adult stages, but this was not observed with any of the transgenic tobacco tested. Fecundity of tobacco hornworm could also be affected but would require a long term, multi-generational study to determine if such effects were present. Fecundity of the tobacco aphid was assessed on the transgenic and non-transgenic tobacco based on the number of aphid nymphs produced, and no differences were observed. However, aphids feed by piercing the leaf epidermis to access the phloem, and the amount of protein present relative to the amount sequestered in cells is not known. Longer term, multigenerational studies could also be done with this species to determine if there are any chronic effects of lower PFA exposure.

Concerns over the limited sample size used in this study are addressed by the fact that hundreds of tobacco aphids and hornworms were exposed to the transgenic plants in the course of the laboratory and field trials. The initial aphid numbers placed on plants were 10 per leaf, but as aphids reproduce quickly, the populations on all leaves and plants increased by 2- to 3-fold within 5 to 7 days. Overall, 720 aphids were tested under field conditions. This breaks down to at least 30 aphids/plant and 3 plants/treatment. Fewer tobacco hornworm could be tested as growing larvae consumed large amounts of leaf tissue and would require more tobacco plants than time and resources would allow for greenhouse trials. Twenty to thirty larvae were tested per greenhouse-grown tobacco cultivar for a total of 160. The number of larvae tested under field conditions was also 160.
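The aphid numbers quoted above can be cross-checked with simple arithmetic. The sketch below assumes exactly three leaves per plant (the text states "3 or more", so this is a lower bound) and back-calculates the number of treatment groups from the total. That last figure is an inference from the quoted numbers, not a value reported in the study.

```python
# Bookkeeping for the aphid field exposure, using figures quoted in the text.
APHIDS_PER_LEAF = 10        # initial number placed on each leaf
LEAVES_PER_PLANT = 3        # assumption: lower bound ("3 or more leaves")
PLANTS_PER_TREATMENT = 3
TOTAL_APHIDS = 720

# Minimum number of aphids started on each plant and in each treatment.
aphids_per_plant = APHIDS_PER_LEAF * LEAVES_PER_PLANT
aphids_per_treatment = aphids_per_plant * PLANTS_PER_TREATMENT

# Inferred, not reported: how many treatment groups the total implies.
implied_treatment_groups = TOTAL_APHIDS // aphids_per_treatment

print(aphids_per_plant, aphids_per_treatment, implied_treatment_groups)
```

Because populations grew 2- to 3-fold within a week, the number of aphids actually exposed would have been larger than these starting counts.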

A great deal of the literature on the non-target effects of transgenic plants focuses on B.t. insecticidal proteins in various crops (Han et al., 2016) and on the impact of herbicide-tolerant crops on arthropods (Imura et al., 2010). In the latter case, the impact of GM plants was assessed from the perspective of how the plants and their new cropping practices affect the ecosystem of the field versus that of a conventional cropping system. A 2-year field study determined that the GM plant itself was not responsible for changes in the arthropods of the soybean fields, but the modified weed management strategies were likely responsible by reducing the types of arthropods present. In the present study, little attention was paid to the diversity of arthropods (insects and others) found on the wild-type versus PFA-producing tobacco. Only a few species were collected from within the field plots during the study, and these would not support a meaningful analysis of non-target effects or impacts on biodiversity. Typically, the tobacco crop would have been treated with insecticide sprays in order to manage tobacco pests such as the hornworm and aphid, but this was not done in order to complete the feeding trials. Based on these results it can be estimated that the recombinant protein does not pose a risk to the insects, but neither are the transgenic plants more susceptible to the insects. Since the amount of leaf area consumed was not calculated, the only evidence was the level of insect growth and time to complete larval development, and these were not significantly different by the end of the larval stage. However, the significantly higher growth during the mid-lifecycle of the larvae fed on the PFA plants may indicate that greater amounts of leaf material are consumed relative to the wild-type plants.
A PFA producing plant may be a more nutritious food source for the hornworms, leading to faster growth of the insects, and greater feeding damage on the PFA plants. As was mentioned previously, the conversion of higher levels of starch in PFA plants would provide greater amounts of sucrose, a compound consistently shown to be one of the most stimulating of sugars to insects (Hervé et al., 2014).

The evaluation of non-target effects from transgenic plants often employs indicator species, but this can also include non-target species that are common within the local area where the crops are grown (Andow and Zwahlen, 2006). A more judicious selection process for choosing an indicator or non-target species to assess GM plant risks would include a survey of published literature on the attributes of each non-target invertebrate and the transgenic plant of interest (Todd et al., 2008). The most pertinent outcome of the screening process is the increased confidence in the risk analysis, knowing that all species in the ecosystem have been considered. Such a screening method might include assessment of effects on beneficial insects that do not feed directly on the GM plant, but would feed on herbivores that do consume the plant (Kalushkov and Nedved, 2005). The use of meta-analyses to measure direct toxic effects on non-target organisms is also recommended, as long as the studies are designed to separate direct and indirect effects (Schmidt et al., 2009). Unfortunately, meta-analysis for transgenic PFA-producing tobacco is not possible since this was a preliminary evaluation of this protein in plants. Therefore, evidence can only come from lab and field studies such as those conducted in the present study. For example, an investigation compared 4 transgenic potato lines that produce cyanophycin, a polymer of aspartic acid and arginine, at between 0.8 and 7.5% dry weight in the tubers with a non-transgenic potato variety (Emmerling et al., 2012). Similar to the present study, the residues containing cyanophycin had no effect on subtle behavior, growth or reproduction of earthworms based on endpoints such as loss of potato residue at the soil surface, earthworm biomass, cocoon production and earthworm hatching numbers over an 80-day period.
The majority of studies have similarly indicated that non-target species were not affected by transgenic crops; however, the consensus is that each GM plant should be assessed separately based on the gene expressed by the plant.

An opposing view is that the risks associated with the development of transgenic crops expressing recombinant proteins may be assessed with sufficient confidence without most of the ecotoxicological studies required for pesticidal transgenic crops. In fact many of the risks could be evaluated by using information from the scientific literature on the mode of action, taxonomic distribution and environmental fate of the recombinant proteins (Raybould et al., 2010). However, there is potential concern regarding any pleiotropic effects from recombinant traits, for example with the un-intended effects of protease inhibitor gene-expressing crops (Schluter et al., 2010). Un-intended effects within the modified plant would negate the sole use of previous scientific/toxicity data approach and suggest that an empirical assessment on a case-by-case basis is still required.

#### CONCLUSION

Thus far, at the current level of the PFA produced by tobacco, the field trials indicate a low risk to non-target insects. However, future projects should assess the impact of these proteins to the third trophic level, or to predators and parasitoids of the insects that feed on tobacco. Those organisms may not have the same ability to metabolize or detoxify exogenous proteins that are encountered when preying on the herbivores that ingest the proteins.

#### INFORMED CONSENT

IS has provided his informed consent for the identifiable image shown in **Supplementary Figure S1C**.

# AUTHOR CONTRIBUTIONS

IS, LR, and RM designed the research. IS, HZ, KS, and AF conducted the research. IS, HZ, and KS analyzed the data. IS, HZ, and RM wrote the manuscript with contributions from all authors. All authors read and approved the final version of the manuscript.

## FUNDING

Funding was provided by the Agriculture and Agri-Food Canada A-Base Project # 115.

## ACKNOWLEDGMENTS

We appreciate the contributions and technical assistance of Mohammad Hossain, Angelo Kaldis, Dale McArthur, Brian McGarvey, Alex Molnar, Hooman Namin, and Bob Pocs to the research. We would also like to acknowledge the extensive contribution of our late co-author, colleague, and friend, L. Bruce Reynolds. Without his extensive experience and advice this field project would not have been possible.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2019.00878/full#supplementary-material



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Scott, Zhu, Schieck, Follick, Reynolds and Menassa. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Plant-Made Nervous Necrosis Virus-Like Particles Protect Fish Against Disease

Johanna Marsian<sup>1</sup>, Daniel L. Hurdiss<sup>2†</sup>, Neil A. Ranson<sup>2</sup>, Anneli Ritala<sup>3</sup>, Richard Paley<sup>4</sup>, Irene Cano<sup>4</sup> and George P. Lomonossoff<sup>1\*</sup>

<sup>1</sup> John Innes Centre, Norwich Research Park, Norwich, United Kingdom, <sup>2</sup> Astbury Centre for Structural Molecular Biology, School of Molecular and Cellular Biology, University of Leeds, Leeds, United Kingdom, <sup>3</sup> VTT Technical Research Centre of Finland, Espoo, Finland, <sup>4</sup> Centre for Environment Fisheries and Aquaculture Science (Cefas), Weymouth Laboratory, Weymouth, United Kingdom

#### Edited by:

Flavia Guzzo, University of Verona, Italy

#### Reviewed by:

Tomas Macek, University of Chemistry and Technology, Prague, Czechia Audrey Yi-Hui Teh, St George's, University of London, United Kingdom

#### \*Correspondence:

George P. Lomonossoff george.lomonossoff@jic.ac.uk

#### †Present address:

Daniel L. Hurdiss, Department of Infectious Diseases and Immunology, Virology Division, Faculty of Veterinary Medicine, Utrecht University, Utrecht, Netherlands

#### Specialty section:

This article was submitted to Plant Metabolism and Chemodiversity, a section of the journal Frontiers in Plant Science

Received: 27 February 2019 Accepted: 20 June 2019 Published: 09 July 2019

#### Citation:

Marsian J, Hurdiss DL, Ranson NA, Ritala A, Paley R, Cano I and Lomonossoff GP (2019) Plant-Made Nervous Necrosis Virus-Like Particles Protect Fish Against Disease. Front. Plant Sci. 10:880. doi: 10.3389/fpls.2019.00880

Virus-like particles (VLPs) of the fish virus, Atlantic Cod Nervous necrosis virus (ACNNV), were successfully produced by transient expression of the coat protein in Nicotiana benthamiana plants. VLPs could also be produced in transgenic tobacco BY-2 cells. The protein extracted from plants self-assembled into T = 3 particles that appeared morphologically similar to previously analyzed NNV VLPs when examined by high-resolution cryo-electron microscopy. Administration of the plant-produced VLPs to sea bass (Dicentrarchus labrax) showed that they could protect the fish against subsequent virus challenge, indicating that plant-produced vaccines may have a substantial future role in aquaculture.

Keywords: Atlantic cod nervous necrosis virus, virus-like particles, transient expression, sea bass, protective immunity

#### INTRODUCTION

Nervous necrosis virus (NNV), the causative agent of viral nervous necrosis (VNN), is a major viral pathogen of fish known to affect over 40 cultured marine fish species worldwide (Tan et al., 2001; Walker and Winton, 2010). Nodavirus infection leads to extensive economic losses to the aquaculture industry each year (Lin et al., 2007). The virus replicates in cells of the brain, spinal cord, and retina causing lesions leading to viral encephalopathy and retinopathy (VER), followed by skin darkening, abnormal swimming behavior and massive mortality (Grotmol et al., 1997; Bovo et al., 1999; Thiery et al., 1999). VNN often causes mortality rates up to 100% in larvae and early juvenile stages and can result in rapid loss of the hatchery; adult fish are also susceptible, but mortality rates are lower (Lin et al., 2007). Thus, the development of a cost-effective and viable vaccine is of great interest to the aquaculture industry.

Nervous necrosis virus is a member of the family Nodaviridae that contains two genera: Alphanodavirus, which usually infects insects, and Betanodavirus, also known as piscine nodavirus (Mori et al., 1992). Betanodavirus is currently classified into four genotypes: striped jack NNV (SJNNV), tiger puffer NNV (TPNNV), barfin flounder NNV (BFNNV) and redspotted grouper NNV (RGNNV) (Nishizawa et al., 1997) with a fifth, turbot NNV (TNNV), proposed (Johansen et al., 2004) and a further three unclassified viruses known (Sahul Hameed et al., 2019). The four genotypes fall into three serotypes with RGNNV and BFNNV sharing serotype C. Each of these genotypes infects a number of different fish species but all share similar features (World Organisation for Animal Health [OIE], 2018).

Virions of the Nodaviridae, including NNV, are nonenveloped, 25–30 nm in diameter and contain 180 copies of a single type of coat protein subunit arranged with T = 3 symmetry. The particles contain two molecules of positive-sense RNA, RNA1, and RNA2. RNA1 (3.1 kb) encodes the RNA-dependent RNA polymerase (102 kDa), critical for RNA transcription and replication, and the non-structural protein B (10 kDa), expressed from a subgenomic RNA (Friesen and Rueckert, 1982; Guarino et al., 1984; Saunders and Kaesberg, 1985) that is involved in inhibition of host RNA interference (Mori et al., 1992; Guo et al., 2003; Chao et al., 2005). RNA2 (1.4 kb) encodes the coat protein precursor alpha (43 kDa) and has an additional function in regulating the synthesis of RNA3 (Dasgupta et al., 1984; Friesen and Rueckert, 1984; Mori et al., 1992). The coat protein precursor self-assembles into T = 3 provirions (Friesen and Rueckert, 1981; Gallagher and Rueckert, 1988; Cheng et al., 1994). Provirions mature by spontaneous autocatalytic cleavage of the coat protein alpha post-assembly, producing proteins beta (38 kDa) and gamma (5 kDa) which remain part of the mature virion (Hosur et al., 1987).

Virus-like particles (VLPs), which resemble authentic virions but lack the infectious genome, have proven to be effective immunogens against a variety of diseases (Grgacic and Anderson, 2006; Bachmann and Jennings, 2010). Several potential vaccines against NNV based on VLPs have been investigated but none has been marketed yet. For example, Lin et al. (2001) used Sf21 insect cells and a recombinant baculovirus vector to express the capsid protein from Malabar grouper (Epinephelus malabaricus) NNV (MGNNV) and showed that the protein assembled into VLPs. These had a similar size and geometry to the native virus and, as well as the coat protein, were found to contain random host-derived RNA of different sizes (Lin et al., 2001). Using Escherichia coli-based expression, Lu et al. (2003) and Liu et al. (2006) produced VLPs for Dragon grouper (Epinephelus lanceolatus) nervous necrosis virus (DGNNV) and showed that the particles were able to block attachment of native virus to the surface of fish nerve cells in culture and to raise antibodies in vaccinated fish. Lai et al. (2014) also showed that E. coli-produced orange-spotted grouper nervous necrosis virus (GNNV) VLPs could stimulate the production of high titre antibodies in fish and Chen et al. (2015) determined the crystallographic structures of the E. coli-expressed VLPs to 3.6 Å resolution. Most recently, Xie et al. (2016) solved the structure of GNNV VLPs to 3.9 Å by cryo-electron microscopy and identified sites on the assembled capsids for insertion of foreign peptides; they also showed that the VLPs could elicit an antibody response in Asian sea bass. Moreover, E. coli-expressed VLPs have been shown to elicit a strong cellular and innate immune response in orange-spotted grouper, in particular the complement system, an important component of the innate immune system and anti-viral immunity in teleosts (Lai et al., 2014).

In the last 20 years, plant-based expression systems have become serious competitors to bacteria, insect cells, yeast or mammalian cells as production systems for pharmaceutical materials (Lomonossoff and D'Aoust, 2016). They have the potential to produce vaccines cheaply, a major consideration if they are to be deployed in aquaculture. Plants also appear to be particularly suitable for the production of VLPs (Thuenemann et al., 2013a; Marsian and Lomonossoff, 2016), some of which have been shown to be capable of providing protective immunity (Thuenemann et al., 2013b; Marsian et al., 2017). We therefore examined whether plant systems could be used to produce a candidate vaccine for deployment in aquaculture. Here, we show that expression of the coat protein of Atlantic cod nervous necrosis virus (ACNNV), either transiently in Nicotiana benthamiana leaves or in transgenic tobacco BY-2 cells, leads to the production of VLPs. Cryo-EM analysis of the particles produced in N. benthamiana leaves resulted in a 3.7 Å resolution reconstruction, which confirmed that ACNNV-LPs have a similar structure to those previously reported for GNNV. When administered either intraperitoneally or intramuscularly to sea bass, the VLPs were shown to confer partial protection against subsequent challenge with ACNNV, indicating that plants can be used to produce effective vaccines for use in aquaculture.

# MATERIALS AND METHODS

#### Plasmid Constructs

The gene sequence for the ACNNV coat protein (GenBank Accession No. EF617326.1) was codon-optimized for N. benthamiana and ordered for synthesis from GeneArt (Life Technologies) with flanking AgeI and XhoI sites. The gene was cloned into an AgeI/XhoI-digested pEAQ-HT (Sainsbury et al., 2009) to produce pEAQ-HT-NNV. Agrobacterium tumefaciens LBA4404 were transformed with the construct by electroporation.

# Transient Expression in N. benthamiana

Agrobacterium tumefaciens containing pEAQ-HT-NNV was grown to stable phase at 28 °C in Luria–Bertani medium supplemented with 50 µg/ml kanamycin and 50 µg/ml rifampicin. The culture was then pelleted by centrifugation at 2500 × g and re-suspended in MMA buffer (10 mM MES, pH 5.6, 10 mM MgCl2, 100 µM acetosyringone) to an OD<sub>600</sub> of 0.4. The bacteria were left at room temperature for 0.5–3 h prior to infiltration. The suspensions were pressure infiltrated into the leaves of 3-week-old N. benthamiana plants as described by Thuenemann et al. (2013b).
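Adjusting the resuspended culture to an OD600 of 0.4 is a simple C1V1 = C2V2 calculation. The following is a minimal sketch, assuming that optical density scales linearly with cell density over the range used (which holds only for reasonably dilute suspensions). The function name and the example culture values are hypothetical and are not taken from the study.

```python
def resuspension_volume_ml(culture_od600, culture_volume_ml, target_od600=0.4):
    """Volume of MMA buffer in which to resuspend the pellet so that the
    suspension reaches the target OD600 (C1 * V1 = C2 * V2).

    Assumes OD600 is proportional to cell density, which is only an
    approximation at high optical densities.
    """
    if culture_od600 <= 0 or culture_volume_ml <= 0 or target_od600 <= 0:
        raise ValueError("OD600 values and volumes must be positive")
    return culture_od600 * culture_volume_ml / target_od600


# Hypothetical example: 10 ml of culture measured at OD600 = 2.0.
final_volume = resuspension_volume_ml(2.0, 10.0)
print(f"Resuspend pellet in about {final_volume:.0f} ml of MMA buffer")
```

In practice the OD of the final suspension would be measured again and adjusted, since pelleting and resuspension rarely recover every cell.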

# Extraction and Purification of ACNNV VLPs From Leaves

Infiltrated leaf tissue was weighed and homogenized using a Waring (Torrington, CT) blender with 3× volume of extraction buffer (0.1 M sodium phosphate, pH 7.0) plus added protease inhibitor according to the manufacturer's instructions (Roche, Welwyn Garden City, United Kingdom) and then filtered through Miracloth (Calbiochem). The crude extract was centrifuged at 9,500 × g for 15 min at 4 °C and the supernatant was then centrifuged through a sucrose cushion (1 ml 70% (w/v) and 5 ml 25% (w/v)) at 167,000 × g for 2.5 h at 4 °C and the lower fraction retrieved. Following removal of residual sucrose by dialysis, the sample was further purified by centrifugation through a sucrose gradient (20–60% (w/v)) at 167,000 × g for 3 h at 4 °C. VLPs were collected by piercing the bottom of the tube with a needle and retrieving each fraction. After dialysis to remove the sucrose, the VLPs were concentrated by pelleting in the TH641 ultracentrifuge swing-out rotor (Sorvall) for 1.5 h at 197,819 × g at 4 °C. The pellets were resuspended in a small volume of either PBS (140 mM NaCl, 15 mM KH2PO4, 80 mM Na2HPO4, 27 mM KCl, pH 7.2) or TBS (50 mM Tris-Cl, 0.3 M NaCl, 1 mM EDTA, pH 8.5).

# Stable Transformation of Tobacco BY-2 Cells

Two- to three-day-old BY-2 cell cultures were incubated with 0.25 mM acetosyringone and mixed with a suspension of A. tumefaciens carrying pEAQ-HT-NNV to an OD600 of 1.0. The mixture was left to incubate at 28 °C for 2–3 days. Afterwards the BY-2 cells were transferred onto solid Murashige and Skoog (MS) basal media with appropriate antibiotics (25 ppm kanamycin, 500 ppm carbenicillin and 500 ppm vancomycin) and left in the darkness at 28 °C. Fifteen calli positive for the NNV coat protein gene were identified by PCR and further analyzed for protein expression by western blot analysis using an anti-NNV antibody (Abcam ab26812). A 500 ml volume of MS media was inoculated with BY-2 cell line 16 and after 3.5 days the cells were harvested. Cells were homogenized using a mortar and pestle and the VLPs purified as described above.

# SDS-PAGE and Western Blot Analysis

Protein extracts were analyzed by electrophoresis on 4–12% (w/v) NuPAGE Bis-Tris gels (Life Technologies). Western Blot analyses were performed using a monoclonal primary antibody against the NNV coat protein (Abcam ab26812) followed by detection with a goat anti-rabbit secondary antibody conjugated to horseradish peroxidase and developed using the chemiluminescent substrate Immobilon Western (Millipore).

#### Transmission Electron Microscopy

Virus-like particles were adsorbed onto plastic and carbon-coated copper grids, washed with several drops of water and then stained with 2% (w/v) uranyl acetate for 15–30 s. Grids were imaged using a FEI Tecnai G2 20 Twin TEM with a bottom-mounted digital camera.

# Cryo-EM Data Collection

Cryo-EM grids were prepared by placing 3 µl of 0.9 mg/ml ACNNV-LPs onto 200 mesh grids with 2-µm holes (Quantifoil R2/2, Quantifoil Micro Tools, GmbH, Germany). Grids were glow-discharged for 30 s prior to plunge-freezing in liquid ethane cooled by liquid nitrogen, using a Leica-EM GP at 85% relative humidity. Data were collected on a FEI Titan Krios (eBIC, Oxford, United Kingdom) transmission electron microscope at 300 kV, with a total electron dose of ∼45 e<sup>−</sup> per Å<sup>2</sup> and a final object sampling of 1.06 Å per pixel. A total of 2359 exposures were recorded using the EPU automated acquisition software on a Gatan K2 Summit energy-filtered direct detector (Gatan, Inc.). Each exposure movie had a total exposure of 2 s and contained 20 images.
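The acquisition parameters above fix a few derived quantities that are useful when judging the data set. The short calculation below is a sketch that assumes the dose was spread evenly over the 20 frames and uses the approximate total dose quoted in the text. The 3.7 Å figure is the reconstruction resolution reported in the Introduction.

```python
# Derived cryo-EM acquisition quantities from the values quoted in the text.
TOTAL_DOSE_E_PER_A2 = 45.0    # approximate total dose, e-/A^2
EXPOSURE_TIME_S = 2.0         # total exposure per movie
FRAMES_PER_MOVIE = 20
PIXEL_SIZE_A = 1.06           # object sampling, A per pixel
REPORTED_RESOLUTION_A = 3.7   # resolution of the final reconstruction

# Assumption: the dose is distributed evenly across frames.
dose_per_frame = TOTAL_DOSE_E_PER_A2 / FRAMES_PER_MOVIE
frame_time_s = EXPOSURE_TIME_S / FRAMES_PER_MOVIE

# The finest resolvable spacing (Nyquist limit) is twice the pixel size.
nyquist_limit_A = 2 * PIXEL_SIZE_A

print(f"Dose per frame: {dose_per_frame:.2f} e-/A^2")
print(f"Frame time:     {frame_time_s:.2f} s")
print(f"Nyquist limit:  {nyquist_limit_A:.2f} A")
```

Since the Nyquist limit is finer than the reported resolution, the pixel sampling would not have been the limiting factor for the reconstruction.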

## Image Processing and 3D Reconstruction

Drift-corrected averages of each movie were created using MOTIONCOR2 (Zheng et al., 2017) and the contrast transfer function of each micrograph was determined using Gctf (Zhang, 2016). Automated particle picking was performed using Gautomatch<sup>1</sup>, before several rounds of reference-free 2D classification were carried out in the RELION2.0 pipeline (Scheres, 2012; Kimanius et al., 2016). After each round, the best classes were taken to the next step of classification. Icosahedral symmetry was imposed during 3D auto-refinement and post-processing was employed to appropriately mask the model, and to estimate and correct for the B-factor of the maps. The final resolutions were determined using the "gold standard" Fourier shell correlation criterion (FSC = 0.143). Local resolution was estimated in RELION2.0, which also generated a map filtered by local resolution. The ACNNV-LP cryo-EM map was deposited in the Electron Microscopy Data Bank under ID code EMD-4899.

# Atomic Model Building and Refinement

As a starting point for model building, a single asymmetric unit, comprising only the shell domain, of the GNNV VLP X-ray structure (PDB 4WIZ; Chen et al., 2015), was fitted into the ACNNV-LP EM density map using UCSF Chimera (Pettersen et al., 2004). In coot, the protein sequence was mutated to account for sequence differences between ACNNV and GNNV (**Supplementary Figure S2**) and fitted using the "real space refinement tool" (Emsley et al., 2010). The resulting model was then symmetrized in UCSF Chimera to generate the capsid and subject to refinement in Phenix (Headd et al., 2012). Iterative rounds of manual fitting in coot and refinement in phenix were carried out to improve non-ideal rotamers, bond angles and Ramachandran outliers (**Supplementary Table S1**). The coordinates for the ACNNV-LP asymmetric unit were deposited in the Protein Data Bank under the ID code PDB 6RJ0. Figures were generated using UCSF Chimera and PyMOL (The PyMOL Molecular Graphics System, Version 2.0 Schrödinger, LLC).

#### Immunization of Fish and Subsequent Challenge

Unvaccinated sea bass (Dicentrarchus labrax) fry were obtained from a commercial hatchery (Gravelines, France) with no history of NNV and grown on to an average size of 30.5 g (12.09 cm) in UV-treated seawater. Fish were fed commercial pellet diet (Gemma Diamond 1.8, Skretting). Duplicate experimental tanks of 420 l volume each containing 140 fish were established and fish acclimated for 2 weeks before vaccination. Ten stock fish were terminated prior to the start of the study using an approved method; blood was sampled and the serum used to assess baseline immune status by enzyme linked immunosorbent assay (ELISA). Approximately 2 ml of purified plant-expressed ACNNV VLPs diluted in phosphate-buffered saline (PBS) were used for immunization. Protein concentration was determined by UV spectrophotometry at 280 nm (Nanodrop) and bicinchoninic acid (BCA) assay (Pierce) as per manufacturer's instructions.

<sup>1</sup>http://www.mrc-lmb.cam.ac.uk/kzhang/

Fish were starved for 24 h prior to vaccination, anesthetized with 0.01% tricaine methanesulphonate (MS222) then injected with 50 µl of VLP diluted in PBS (5 µg/fish) or 50 µl PBS (controls). In each tank, 35 fish were vaccinated by intramuscular (IM) injection with VLPs, 35 by intraperitoneal (IP) injection with VLPs, 35 were IM injected with diluent (PBS) and 35 were IP injected with diluent (PBS). Fish were held at 25 °C for 4 weeks (700 degree-days) to allow development of any potential immune response. At this time, five fish from each group in each tank were terminated using an approved method and blood samples taken through the caudal vein. Sera were used to assess development of a specific immune response by ELISA to detect anti-betanodavirus antibodies in serum. The ELISA test was performed as previously described by Taylor et al. (2010), modified with gradient-purified RGNNV 378/102 as coating antigen and the mouse anti-European sea bass IgM monoclonal antibody (Aquatic Diagnostics, Stirling, United Kingdom).

Betanodavirus isolate 378/102 (RGNNV genotype) was propagated in the SSN-1 cell line (ECACC, 96082808) in Liebovitz's L15 media supplemented with 2 mM glutamine, 10% fetal bovine serum (FBS), 100 units penicillin/ml and 100 µg streptomycin/ml with incubation at 25 °C. Clarified cell culture supernatant containing virus was harvested and the titre determined by tissue culture infectious dose limiting dilution in SSN-1 cells. Virus harvest was diluted to 5 × 10<sup>5</sup> TCID50/ml for injection challenge. At 35 days post-vaccination (875 degree-days) fish were challenged by IM injection of 0.1 ml of cell culture supernatant containing betanodavirus (isolate 378/102) at 5 × 10<sup>5</sup> TCID50/ml and held for a further 4 weeks of observation (**Supplementary Figure S4**). Fish showing clinical signs consistent with VER caused by nodavirus infection (darkening, separation from shoal, inability to maintain station, corkscrew swimming) were removed and terminated by an approved method when moribund. Whole brain and blood samples were taken from moribund terminated fish for subsequent serum analysis and confirmation of specific mortality (presence of nodavirus). Brain samples were collected throughout and stored at −80 °C until analysis. Blood samples were allowed to clot overnight at 4 °C, and the serum was removed and stored at −20 °C until analysis. For virus isolation and confirmation of specific mortality, whole brains were thawed and homogenized in 10 volumes (w/v) cell culture medium in microfuge tubes with glass beads using a FastPrep 24 benchtop homogeniser (MPBio). Homogenates were clarified by centrifugation at 3000 × g for 5 min and supernatants inoculated onto SSN-1 cells at 1:100 and 1:1000 dilution in 24 well cell culture plates. Cultures were incubated at 25 °C for 7 days and observed for cytopathic effect by phase contrast light microscopy.
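The thermal-time figures given in parentheses for the vaccination and challenge schedule (700 and 875) and the virus dose received by each fish follow directly from the stated holding temperature, durations, titre and injection volume. The sketch below reproduces that arithmetic, assuming a constant 25 °C holding temperature throughout. The function name is illustrative only.

```python
def degree_days(temperature_c, days):
    """Thermal time accumulated at a constant holding temperature."""
    return temperature_c * days


HOLDING_TEMP_C = 25
sampling_degree_days = degree_days(HOLDING_TEMP_C, 4 * 7)   # 4 weeks post-vaccination
challenge_degree_days = degree_days(HOLDING_TEMP_C, 35)     # day of challenge

# Challenge dose per fish: injected volume multiplied by virus titre.
TITRE_TCID50_PER_ML = 5e5
INJECTION_VOLUME_ML = 0.1
dose_per_fish_tcid50 = TITRE_TCID50_PER_ML * INJECTION_VOLUME_ML

print(sampling_degree_days, challenge_degree_days, dose_per_fish_tcid50)
```

Each fish therefore received on the order of 5 × 10<sup>4</sup> TCID50 at challenge.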

Survival probability of the different vaccinated and mock vaccinated groups during experimental challenge was represented using a Kaplan–Meier survival plot with 95% confidence intervals. Survival distribution was compared with a log-rank test to determine if statistical differences existed between the groups. A pairwise log-rank comparison was then performed between groups. Survival analysis was performed on replicate and then combined tank data per group.
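To make the survival analysis concrete, the following is a minimal, standard-library sketch of a Kaplan–Meier estimator and a two-group log-rank test. It is not the software used in the study, it omits the confidence intervals, and the fish survival times in the example are invented purely for illustration. Fish alive at the end of observation are treated as censored (event = 0).

```python
import math


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one death.

    times  -- follow-up time for each fish
    events -- 1 if the fish died at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        while i < len(data) and data[i][0] == t:   # group tied times
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square, p-value) with 1 d.f."""
    times_a, times_b = list(times_a), list(times_b)
    events_a, events_b = list(events_a), list(events_b)
    death_times = sorted({t for t, e in zip(times_a + times_b,
                                            events_a + events_b) if e})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in death_times:
        n_a = sum(1 for x in times_a if x >= t)    # at risk in group A
        n_b = sum(1 for x in times_b if x >= t)    # at risk in group B
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        raise ValueError("no informative death times to compare")
    chi2 = obs_minus_exp ** 2 / variance
    # Survival function of chi-square with 1 d.f.: erfc(sqrt(x / 2)).
    return chi2, math.erfc(math.sqrt(chi2 / 2))


# Invented data: days to death (event = 1) or end of observation (event = 0).
vaccinated = ([10, 28, 28, 28, 28], [1, 0, 0, 0, 0])
control = ([5, 6, 7, 28, 28], [1, 1, 1, 0, 0])

print(kaplan_meier(*control))
print(logrank_test(*vaccinated, *control))
```

A pairwise comparison between several groups, as described above, amounts to calling the two-group test on each pair; whether and how the resulting p-values were adjusted for multiple comparisons is not stated in the text.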

# RESULTS

## Expression of NNV VLPs in Whole Plants

The leaves of N. benthamiana plants infiltrated with pEAQ-HT-NNV were harvested 6 days post-infiltration (dpi) by which time they started to show slight signs of chlorosis. After homogenization of the tissue and initial purification, the plant extract was centrifuged through a 20–60% (w/v) sucrose step gradient and fractions analyzed by SDS-PAGE and western blot analysis. This revealed that the majority of the protein with the expected size of the uncleaved alpha peptide (43 kDa) occurred in the 30 and 40% sucrose fractions (**Figure 1A**) and that this material specifically reacted with anti-ACNNV antibodies (**Figure 1B**). The stained gel also revealed that these fractions contained significant amounts of additional host-derived proteins.

To further purify the putative NNV VLPs, the 30 and 40% sucrose fractions were combined, dialysed and any VLPs pelleted by ultracentrifugation. The resulting pellets were resuspended in either PBS buffer or TBS buffer to examine the stability of the VLPs under different conditions. SDS-PAGE and western blot analysis suggested that both samples were very pure with few, if any, contaminating proteins present (**Figure 1A**). A slight difference between the samples resuspended in the different buffers could be observed in the western blot (**Figure 1B**), where an additional lower molecular mass band could be seen in the sample resuspended in PBS (lane 1). Most likely this protein, which was detected by the anti-NNV antibody, is the processed, beta, form of NNV coat protein (Hosur et al., 1987; Schneemann et al., 1992). As this band is not seen in the samples resuspended in TBS (**Figure 1B**, lane 2), it is likely that the TBS buffer, with its higher pH, stabilizes the VLPs and/or hinders maturation. TEM visualization of the sample of VLPs resuspended in TBS showed the presence of abundant particles of the size (25–30 nm diameter) expected for ACNNV VLPs (**Figure 2**). A yield of 10 mg/kg fresh weight of the highly purified VLPs was calculated based on the protein content of the sample.

# Expression of NNV VLPs in Tobacco BY-2 Cells

Since previous studies had shown that the pEAQ vectors were suitable for expression in transgenic tobacco BY-2 cells (Sun et al., 2011), we investigated the use of such cells as an alternative to expression in whole N. benthamiana plants. Stable transformation of tobacco BY-2 cells resulted in the production of several calli expressing the ACNNV coat protein as shown by western blot analysis using anti-NNV antibody. BY-2 cell line 16 displayed the highest concentration

FIGURE 1 | Purification of NNV VLPs. Clarified plant extract containing ACNNV VLPs was run through a sucrose gradient followed by pelleting. (A) Instant blue-stained SDS-PAGE gel and (B) western blot probed with anti-NNV (ab26812) antibody of the various stages of purification. The % sucrose in each fraction is indicated. S/N represents the sample before application to the sucrose gradient. The 30 and 40% sucrose fractions harboring the NNV VLPs were combined, dialyzed and then pelleted. Two different buffers were used for resuspension. Lane 1 = PBS buffer. Lane 2 = TBS buffer. M = SeeBlue Plus 2 with molecular weights indicated. Arrows indicate ACNNV coat protein alpha (α) or mature protein beta (β) in the western blot shown in (B) and the red boxes indicate the position of the ACNN.

of NNV coat protein and thus was chosen for larger scale VLP production. To this end, a 500 ml culture of line 16 was produced and VLPs were isolated from the cells and purified through a discontinuous sucrose cushion. Western blot analysis of the sucrose gradient fractions suggested that the coat protein assembled into VLPs and there was evidence of the cleavage of the alpha to the beta protein (**Figure 3A**). TEM analysis of the purified material indicated that VLPs had been formed (**Figure 3B**) but these were only present in small amounts compared with the levels produced in whole plants.

# Cryo-EM Analysis of Plant-Produced ACNNV VLPs

For a detailed structural analysis of the plant-produced ACNNV VLPs, cryo-EM was performed on the sample prepared from N. benthamiana leaves. Unstained, frozen-hydrated ACNNV VLPs were imaged using an electron microscope fitted with a direct electron detecting camera (**Supplementary Figure S1A**). Single-particle image processing was carried out in Relion, which allowed us to determine the solution structure of the VLP to a global resolution of 3.7 Å (**Supplementary Figure S1B**). The ACNNV VLP is an isometric particle with T = 3 symmetry (**Figures 4A,B**), consistent with previously reported structures of the grouper nervous necrosis virus (GNNV) VLPs, which share ∼85% sequence identity (Cheng et al., 1994; Chen et al., 2015). The shell domain displays the highest local resolution (**Supplementary Figure S1C**), which allowed the building of an atomic model of the capsid shell (**Figures 4C,D**). The viral asymmetric unit is composed of three quasi-equivalent copies of the coat protein, which have a characteristic jelly roll topology, with clear side-chain density (**Figure 4E**). Three P-domains of the capsid protein interact and form spikes on the particle surface (**Figure 5A**). The P-domains are attached to the rest of the capsid protein by a linker region (**Figure 5B**) and display lower local resolution (**Supplementary Figure S1D**), presumably as a result of flexibility. Overall, the cryo-EM analysis confirmed that the structure of the plant-produced VLPs is very similar to that of material produced in other expression systems.

## Plant-Made NNV VLPs Protect Sea Bass Against Viral Challenge

To investigate the immunogenicity of the plant-produced ACNNV VLPs, sea bass (D. labrax) were vaccinated with purified VLP preparations. Serum recovered from mock-vaccinated control and ACNNV VLP-vaccinated fish injected by either the IP or IM route showed no statistically significant levels of specific anti-betanodavirus antibodies by ELISA (**Supplementary Figure S3**). Despite the lack of a detectable humoral response in vaccinated fish, the challenge was undertaken based on the potential for protection as a result of cellular (i.e., antibody-dependent cell-mediated cytotoxicity) immunity and the possibility that the ELISA using heterologous capture antigen (RGNNV) was not informative for ACNNV VLP-vaccinated fish despite these genogroups sharing serogroup C.

The first clinical signs of inappetence and reduced activity were observed on the 5th day post-exposure in a small number of individuals in both tanks. This was followed a day later by loss of equilibrium and spiral swimming leading to mortality or moribundity, and removal from the experiment of a total of 5 and 11 fish from each tank, respectively, all of which were sham-vaccinated controls. Henceforth the term mortality will be used to refer to actual mortalities and humanely terminated moribund animals, the latter of which represent the majority of removed animals.

Survival probabilities by group are represented with the Kaplan–Meier survival plot (**Figure 6**). Pairwise comparison demonstrated no significant difference between replicate tanks (data not shown), thus survival analysis is shown in **Figure 6** utilizing combined data. Mortality in the VLP vaccinated groups was significantly lower (p < 0.05) than in the sham vaccinated controls. Pairwise comparison showed no significant difference in vaccination route.
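The product-limit (Kaplan–Meier) estimate that underlies such a survival plot can be sketched as follows. This is a minimal illustration only: the function name and the toy observations are hypothetical and are not the challenge data from this study, and fish removed without an event are assumed to be treated as censored.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each time with at least one event.

    times  : day of death/removal or of last observation for each fish
    events : 1 if the fish died (or was humanely terminated), 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect all fish that share the same observation time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


# Hypothetical group of five fish: three events, two censored at day 10.
for day, s in kaplan_meier([5, 6, 6, 10, 10], [1, 1, 1, 0, 0]):
    print(day, round(s, 2))
```

Comparing such curves between groups would additionally require a test such as the log-rank test, which is not shown here.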

Cumulative mortality in the IP and IM sham vaccinated controls reached between 57.1 and 60.8% (**Table 1**). First challenge mortality in vaccinated fish was delayed by approximately 2 or 4 days for IP-vaccinated and IM-vaccinated fish, respectively. In contrast to the controls, cumulative mortality in the IP VLP vaccinated groups reached only 20.7 and 20.8% in the duplicate tanks, and cumulative mortality in the IM VLP vaccinated groups reached only 7.7 and 13.0% in the duplicate tanks, respectively (**Table 1**). This represents a relative percent survival at the end of the study (RPSend; Amend, 1981) for vaccinated fish ranging from 63.6 to 86.5%, with the IM route being the more effective (**Table 1**). The data indicate that the plant-made VLPs, even in the absence of immune promoting adjuvant, can significantly protect sea bass against NNV.
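As a worked check of the relative percent survival values quoted above, the sketch below assumes the conventional definition RPS = (1 − mortality in vaccinated fish / mortality in controls) × 100. The function name is hypothetical, and pairing the vaccinated groups with the 57.1% control value is an assumption made for illustration, since only the ranges are given in the text.

```python
def relative_percent_survival(vaccinated_mortality_pct, control_mortality_pct):
    """RPS = (1 - vaccinated mortality / control mortality) * 100."""
    if control_mortality_pct <= 0:
        raise ValueError("control mortality must be positive")
    return (1.0 - vaccinated_mortality_pct / control_mortality_pct) * 100.0


# Assumed pairing: lowest IM and highest IP cumulative mortality
# against the 57.1 % sham-vaccinated control value.
print(round(relative_percent_survival(7.7, 57.1), 1))   # 86.5
print(round(relative_percent_survival(20.8, 57.1), 1))  # 63.6
```

Under this assumed pairing, the two results reproduce the upper and lower ends of the reported RPSend range.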

## DISCUSSION

Several vaccine candidates against NNV have been developed but none has been commercialized yet. However, the expanding world population and overfishing of the oceans have led to a growing demand for farmed fish, making it a very high value industry. NNV causes mortality rates up to 99%, particularly in juvenile stages, and is a significant impediment to expanding aquaculture and food security, particularly in the Mediterranean region; thus the interest in a prophylactic treatment, such as a vaccine, is high (Lin et al., 2007). The aim of this study was to investigate the feasibility of making a plant-produced NNV VLP vaccine. Initial experiments involved transiently expressing ACNNV VLPs in N. benthamiana plants and optimizing their expression level, purification and stability. The transiently expressed NNV VLPs accumulate to relatively high levels in plants within 4–6 dpi and yields of 10 mg purified VLPs/kg fresh weight leaf tissue could be obtained. As an alternative, the use of tobacco BY-2 cells in culture as a production system was also investigated. The results showed

TABLE 1 | Cumulative mortality (%) and relative percent survival (RPSend) for IP and IM administered VLP vaccine.


that it is possible to produce lines of cells expressing the ACNNV coat protein and that the protein assembled into VLPs. However, as the yield was relatively low, subsequent characterisation was carried out on the VLPs purified from N. benthamiana leaves.

The relatively simple situation, in which a single protein self-assembles into an icosahedral structure of 180 copies, makes NNV VLPs an attractive model for studies of VLP production in plants. The high-resolution structural studies revealed that the plant-made ACNNV VLPs are of isometric shape with T = 3 symmetry, consistent with previously reported structures and authentic-looking when compared to the wild-type NNV. These data confirm the previous conclusion that plants are an effective way of producing VLPs (Marsian and Lomonossoff, 2016). A protection study against NNV was carried out in sea bass with a view to determining whether the plant-produced VLPs could serve as a candidate vaccine in target animals. Despite the lack of a detectable humoral response, the plant-produced VLPs, in the absence of immune stimulating adjuvant, were shown to confer moderate to strong protection in virus-challenged fish. Hodgins et al. (2017) demonstrated that a plant-expressed VLP vaccine protected mice from highly virulent influenza H1N1 despite an absence of humoral responses. The authors showed that cellular immune responses contributed to protection in the vaccinated animals, suggesting that the mechanism of protection may be antibody-dependent cell-mediated cytotoxicity. In the present study, the level of protection afforded is consistent with that observed in other experimental vaccines for NNV (Yamashita et al., 2005; Thiery et al., 2006) and established vaccines for other viral diseases limiting aquaculture (Gudding et al., 2014; Munang'andu and Evensen, 2019). This is encouraging and suggests that plant-based vaccines have a future for deployment in aquaculture; however, more investigations are needed to understand the mechanisms of protection.

The results presented here were achieved through parenteral administration of the candidate vaccine. However, when working with a large number of fish and at juvenile stages immersion or oral vaccination would have significant advantages. Possibilities of incorporating plant material harboring NNV VLPs into feed pellets and vaccine efficacy thereof should be investigated. Such an approach would be simple, inexpensive and rapid as it does not require purification of the VLPs. The use of cells, such as BY-2, in culture may be particularly suited to such an approach though it would be essential to optimize yields.

# DATA AVAILABILITY

The datasets generated for this study can be found in wwPDB, EMD-4899, PDB ID 6RJ0.

# ETHICS STATEMENT

All experimental animal use was performed in accordance with UK Home Office Regulations under the Animals (Scientific Procedures) Act 1986, with scrutiny and approval by the Cefas Weymouth Laboratory local Animal Welfare Ethical Review Body (AWERB).

# AUTHOR CONTRIBUTIONS

GL, AR, RP, and NR conceived and planned the study. JM, AR, and DH performed the VLP expression and characterisation experiments, and analyzed the data. RP and IC planned and performed the experiments and sample analysis for immunization and challenge in fish. All authors contributed key ideas, analyzed the data, and wrote the manuscript.

# FUNDING

At the John Innes Centre (JIC) this work was supported by the UK Biotechnological and Biological Sciences Research Council (BBSRC) grant BB/M027856/1, Institute Strategic Programme Grants "Understanding and Exploiting Plant and Microbial Secondary Metabolism" (BB/J004596/1) and "Molecules from Nature – Enhanced Research Capacity" (BBS/E/J/000PR9794) and the John Innes Foundation. At VTT the work was supported by Ministry of Agriculture and Forestry (2145/312/2014) and VTT. We thank the Wellcome Trust for Ph.D. Studentship support to DLH (102572/B/13/Z).

# ACKNOWLEDGMENTS

The very skillful technical assistance of Jaana Rikkinen at VTT is highly appreciated. We thank Drs Daniel Clare and Alistair Siebert at the Electron Bioimaging Centre at the Diamond Light Source where the cryo-EM data set was collected.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2019.00880/full#supplementary-material

FIGURE S1 | Cryo-EM analysis of ACNNV-LPs. (A) Typical micrograph from the ACNNV-LP dataset (scale bar = 50 nm). (B) The plot of the Fourier shell correlation (FSC). Based on the 0.143 criterion for the gold standard comparison of two independent data sets, the resolution of the reconstruction is 3.7 Å. (C) An isosurface representation (3σ) of the 3.7 Å ACNNV-LP structure viewed down an icosahedral two-fold axis and colored according to local resolution. (D) An isosurface representation (1.2σ) of the 3.7 Å ACNNV-LP structure viewed down an icosahedral two-fold axis and colored according to local resolution. The local resolution coloring scheme is shown in angstroms.

FIGURE S2 | Alignment of the amino acid sequences of the ACNNV and GNNV coat proteins.



FIGURE S3 | ELISA results: Box and whisker plots showing log transformed average optical density (n = 10) of serum samples from preimmune, sham control (PBS) and nodavirus VLP vaccinated fish by IM and IP administration routes. Positive controls are known positive sera from previous challenges.

FIGURE S4 | Schematic diagram showing vaccination and challenge schedule. d: days; DD: degree days; IP: intraperitoneal; IM: intramuscular.

TABLE S1 | Data-collection parameters and model statistics.




**Conflict of Interest Statement:** AR is employed by VTT, a non-profit liability company. GL declares that he is a named inventor on granted patent WO 29087391 A1 which describes the transient expression system used in this manuscript to express the ACNNV VLPs.

The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Marsian, Hurdiss, Ranson, Ritala, Paley, Cano and Lomonossoff. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Seasonal Weather Changes Affect the Yield and Quality of Recombinant Proteins Produced in Transgenic Tobacco Plants in a Greenhouse Setting

#### *Matthias Knödler1,2, Clemens Rühl1, Jessica Emonts1 and Johannes Felix Buyel1,2\**

#### *Edited by:*

*Anneli Ritala, VTT Technical Research Centre of Finland Ltd, Finland*

#### *Reviewed by:*

*Rima Menassa, Agriculture and Agri-Food Canada, Canada Qiansi Chen, Zhengzhou Tobacco Research Institute of CNTC, China*

#### *\*Correspondence:*

*Johannes Felix Buyel johannes.buyel@rwth-aachen.de; johannes.buyel@ime.fraunhofer.de*

#### *Specialty section:*

*This article was submitted to Plant Biotechnology, a section of the journal Frontiers in Plant Science*

*Received: 27 May 2019 Accepted: 06 September 2019 Published: 08 October 2019*

#### *Citation:*

*Knödler M, Rühl C, Emonts J and Buyel JF (2019) Seasonal Weather Changes Affect the Yield and Quality of Recombinant Proteins Produced in Transgenic Tobacco Plants in a Greenhouse Setting. Front. Plant Sci. 10:1245. doi: 10.3389/fpls.2019.01245*

*1 Bioprocess Engineering, Fraunhofer Institute for Molecular Biology and Applied Ecology IME, Aachen, Germany, 2 Institute for Molecular Biotechnology, RWTH Aachen University, Aachen, Germany*

Transgenic plants have the potential to produce recombinant proteins on an agricultural scale, with yields of several tons per year. The cost-effectiveness of transgenic plants increases if simple cultivation facilities such as greenhouses can be used for production. In such a setting, we expressed a novel affinity ligand based on the fluorescent protein DsRed, which we used as a carrier for the linear epitope ELDKWA from the HIV-neutralizing antibody 2F5. The DsRed-2F5-epitope (DFE) fusion protein was produced in 12 consecutive batches of transgenic tobacco (*Nicotiana tabacum*) plants over the course of 2 years and was purified using a combination of blanching and immobilized metal-ion affinity chromatography (IMAC). The average purity after IMAC was 57 ± 26% (n = 24) in terms of total soluble protein, but the average yield of pure DFE (12 mg kg−1) showed substantial variation (± 97 mg kg−1, n = 24) which correlated with seasonal changes. Specifically, we found that temperature peaks (>28°C) and intense illuminance (>45 klx h−1) were associated with lower DFE yields after purification, reflecting the loss of the epitope-containing C-terminus in up to 90% of the product. Whereas the weather factors were of limited use to predict product yields of individual harvests conducted for each batch (spaced by 1 week), the average batch yields were well approximated by simple linear regression models using two independent variables for prediction (illuminance and plant age). Interestingly, accumulation levels determined by fluorescence analysis were not affected by weather conditions but positively correlated with plant age, suggesting that the product was still expressed at high levels, but the extreme conditions affected its stability, albeit still preserving the fluorophore function. The efficient production of intact recombinant proteins in plants may therefore require adequate climate control and shading in greenhouses or even cultivation in fully controlled indoor farms.

Keywords: batch reproducibility, environmental correlation, fluorescent protein carrier, greenhouse cultivation, plant molecular farming, protease activity

# HIGHLIGHTS


#### INTRODUCTION

Plants have been developed as expression systems for the production of recombinant proteins including biopharmaceuticals (Hiatt et al., 1989), some of which are entering clinical trials (Ma et al., 2015), and a few are already on the market (Mor, 2015). Plant-based expression systems offer pharmaceutical companies several advantages compared to traditional mammalian cell culture platforms, including lower upstream production costs, better intrinsic safety, and greater scalability (Buyel et al., 2015; Sack et al., 2015; Spiegel et al., 2018). The scalability of plants is especially appealing if agricultural infrastructure can be used because this would provide sufficient capacity to produce several tons of purified protein per year (Stoger et al., 2002; Buyel et al., 2017). However, the risk of contamination is elevated by the abundance of pathogens, animals, and agrichemicals in the open field, so fully contained facilities have been designed that allow the controlled cultivation of plants on a medium to large scale (Wirz et al., 2012; Holtz et al., 2015). Such facilities require substantial upfront investment and operate under complex and thus error-prone process control systems, which may offset some of the cost savings achieved by switching from mammalian cells to plants. Greenhouses offer an attractive compromise because they achieve sufficient containment with only moderate infrastructure costs, as shown by the use of greenhouse facilities to cultivate plants expressing a monoclonal antibody that was purified and used in phase I clinical trials (Ma et al., 2015; Sack et al., 2015).

One drawback of greenhouse cultivation is the incomplete control of environmental conditions such as temperature and light, but the effects of these parameters on recombinant protein expression have not been considered in detail. Here, we describe the results of a long-term study in which transgenic tobacco (*Nicotiana tabacum* cv. Petit Havana SR1) plants expressing a recombinant fusion protein were cultivated in 12 consecutive batches in a greenhouse setting over the course of 2 years. The fusion protein comprised the fluorescent marker protein DsRed (Baird et al., 2000) with a C-terminal extension featuring a linear epitope (ELDKWA in the one-letter amino acid code) from the HIV-neutralizing antibody 2F5 (Muster et al., 1993; Parker et al., 2001), a His6 affinity tag, and a KDEL tag for retrieval of the protein to the endoplasmic reticulum. The product was named DFE, for **D**sRed-2**F**5-**E**pitope (**Figure 1A**) (Rühl et al., 2018). We monitored the accumulation of DFE in 12 batches at several growth stages and also recorded the absolute product yield, the recovery after purification, and seasonally dependent protease activity reflecting the changing climate in the greenhouse. We discuss the impact of these environmental parameters on the production of recombinant proteins in plants cultivated in a greenhouse setting.

#### MATERIALS AND METHODS

#### Plant Material and Cultivation

Transgenic tobacco plants (*N. tabacum* cv. Petit Havana SR1) expressing DFE were generated as previously described (Buyel et al., 2014) and bred to the fifth generation (T5) by self-pollination to produce seeds for further experiments. The T5 plants were cultivated in a greenhouse at the Fraunhofer Institute for Molecular Biology and Applied Ecology IME, Aachen, Germany (50°47′07.1″N 6°03′00.5″E) from 21 December 2015 to 6 February 2018. The plants were grown in soil and were irrigated with 0.1% (m/v) Ferty 2 MEGA (Gärtnereibedarf Kammlott GmbH, Erfurt, Germany). The temperature in the greenhouse was set to a maximum of 27°C and a minimum of 22°C (day) and 20°C (night) with artificial auxiliary lighting (400 W) provided by MASTER HPI-T Plus quartz metal halide and MASTER Agro high-pressure sodium lamps (Koninklijke Philips, Amsterdam, Netherlands) distributed so that 0.75 lamps of each type were provided per m² of cultivation area (9.5 klx, corresponding to ~180 μmol s−1 m−2 according to the manufacturer's data; λ = 400–700 nm). The lights were activated if illuminance fell below 50.0 klx during the 16-h photoperiod. The relative humidity was set to 50% with a control range of 30–70% and measured with digital hygrometers inside and outside of the greenhouse. Additionally, outside wind velocity (m s−1) and daily precipitation (mm) were measured using a digital anemometer and a digital hyetometer, respectively, both positioned outside at the greenhouse gable. The plants were cultivated for 36–63 days depending on their size before harvest. A single harvest of 10 plants was conducted for the first six batches, and their leaves were subjected to extraction. For batches 7–12, additional harvests of up to 10 plants were conducted 1 week before and 1 week after the main harvest, increasing the total number of plants up to 30 per batch.

#### Protein Extraction and Clarification

Total soluble protein (TSP) was extracted from 3 to 10 kg of leaves after blanching (Rühl et al., 2018) by maceration in a blade-based homogenizer containing 3 L of extraction buffer (50 mM sodium phosphate, 500 mM sodium chloride, 10 mM sodium bisulfite, pH 8.0) per kilogram of wet biomass, as previously described (Buyel and Fischer, 2014a). The extract was clarified by passage through a series of bag, depth, and sterile filters (Buyel and Fischer, 2014).

**Abbreviations:** IMAC, immobilized metal-ion affinity chromatography; TSP, total soluble protein.

FIGURE 1 | Climate data and plant growth during the production of DFE. (A) Three-dimensional structure of a DFE tetramer. The DsRed part is shown in red, and the 2F5 epitope of the fusion part consisting of the 2F5 epitope, His6 tag, and KDEL tag is highlighted in gold. (B) Temperature and illuminance data for the duration of this study. Batch durations are marked with double-T lines and numbers (blue = winter, red = summer). Vertical lines indicate the transition between seasons (blue = winter, red = summer). Dashed horizontal lines indicate temperature control limits. (C) Correlation between integrated temperature >28°C and integrated illuminance >45 klx approximated by a two-parameter exponential model. (D) Heat map of ~20,000 temperature–illuminance data points recorded with a frequency of one measurement per hour during the 12 batches shown in (B). The dotted triangle marks data well within the temperature control range whereas the dashed triangle indicates a region with measurements outside the control limits. (E) Plant height according to plant age observed during cultivation in winter or summer for transgenic plants expressing DFE as well as wild-type controls. Error bars indicate the standard deviation (n ≥ 10); individual lines correspond to different batches. (F) Average plant biomass at the time of harvest according to the age of transgenic plants expressing DFE. Six data points were obtained from a first set of six batches, whereas 18 data points were collected from a second set of six batches for which plants were harvested at three time points each. Lines indicate linear regression models fitted to the data for winter (n = 9, r = 0.89) and summer (n = 15, r = 0.91) harvests.

#### Purification of DFE by Chromatography

DFE was purified on an ÄKTApure system (GE Healthcare, Uppsala, Sweden) fitted with an XK-26 column containing 53 ml of chelating Sepharose fast-flow-immobilized metal-ion affinity chromatography (IMAC) resin loaded with nickel ions and pre-conditioned with extraction buffer lacking sodium bisulfite. After loading the clarified extract, unbound proteins were washed through with 10 column volumes of the same buffer, and bound proteins were then eluted in the same buffer supplemented with 300 mM imidazole at a flow rate of 50 cm h−1. The protein and nucleic acid concentrations were monitored at 280 and 260 nm, respectively.

#### Protein and Product Quantitation

The TSP concentration was determined using a microtiter plate version of the Bradford method (Buyel and Fischer, 2014), and the protein composition was analyzed by lithium dodecylsulfate (LDS) polyacrylamide gel electrophoresis (PAGE) followed by gel staining with Coomassie Brilliant Blue (Menzel et al., 2016). DFE was quantified by fluorescence spectroscopy against DsRed standards (Buyel and Fischer, 2012). Briefly, DsRed fluorescence in the clarified extracts was measured using a Synergy HT microplate reader (BioTek Instruments, Winooski, Vermont, USA) fitted with 530/25 (excitation) and 590/35 (emission) nm filter sets. A standard curve was prepared with DsRed dilutions in the range 0–225 mg L−1, and the protein accumulation level per gram wet biomass was calculated as described elsewhere (Gengenbach et al., 2018). The presence of the C-terminal 2F5-epitope (ELDKWA) and His6 tag was confirmed by immunoblotting using an in-house preparation of the human monoclonal antibody 2F5 (Rühl et al., 2018) and a monoclonal rabbit anti-His6 antibody (BioVision, Milpitas, California, USA), respectively. These primary antibodies were detected using polyclonal goat-anti-human and goat-anti-rabbit immunoglobulin secondary antibodies, respectively, each conjugated to alkaline phosphatase (Jackson ImmunoResearch, Cambridge, UK).
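Quantification against a standard curve can be illustrated with the following sketch. It assumes a linear relationship between fluorescence and DsRed concentration over the 0–225 mg L−1 range; the function names and all fluorescence readings are hypothetical, and the calculation actually used is the one described in the cited reference.

```python
def fit_standard_curve(concentrations, signals):
    """Least-squares line: signal = intercept + slope * concentration."""
    n = len(concentrations)
    mean_c = sum(concentrations) / n
    mean_s = sum(signals) / n
    sxy = sum((c - mean_c) * (s - mean_s) for c, s in zip(concentrations, signals))
    sxx = sum((c - mean_c) ** 2 for c in concentrations)
    slope = sxy / sxx
    intercept = mean_s - slope * mean_c
    return slope, intercept


def concentration_from_signal(signal, slope, intercept):
    """Invert the standard curve to obtain a concentration in mg per litre."""
    return (signal - intercept) / slope


# Hypothetical DsRed standards (mg/L) and fluorescence readings (arbitrary units).
standards = [0.0, 75.0, 150.0, 225.0]
readings = [50.0, 800.0, 1550.0, 2300.0]
slope, intercept = fit_standard_curve(standards, readings)
print(concentration_from_signal(1050.0, slope, intercept))
```

Samples reading above the highest standard would need to be diluted and measured again rather than extrapolated.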

#### Measurement of Protease Activity

The protease activity in plant extracts after blanching heat treatment and in untreated controls was determined using a colorimetric protease assay (Thermo Fisher Scientific, Waltham, Massachusetts, USA) according to the manufacturer's instructions. Samples were diluted in the range 1:5–1:40 to adjust the TSP concentrations to the same order of magnitude and were then measured in triplicate in 96-well plates. Six trypsin standards with concentrations of 0.005–500 mg L−1 were used to build duplicate standard curves. For each sample and standard, 100 µl of succinylated casein solution was pipetted into one well and 100 µl of working buffer into another as a blank, before transferring 50 µl of the sample to both. The plates were incubated for 20 min at 22°C before adding 50 µl of working solution to each well and were then incubated for an additional 20 min at 22°C.

The absorbance in each well was measured twice at 450 nm using an EnSpire multimode plate reader, and the data were exported in EnSpire Manager v4.13 (Perkin Elmer, Waltham, Massachusetts, USA).

#### Data Analysis

Key figures (*Kwf*) were calculated for each weather factor for the entire growth period as well as for individual growth phases [germination 0–14 days post-seeding (dps), sprouting 15–24 dps, growth 25–38 dps, and maturation 38 dps to harvest] as shown in Equation 1, where *m* is the number of data points (one value for every hour during cultivation), *wfact* is the actual value of the weather factor at time point *j*, and *wfset* is the set point or critical value of the weather factor. The indicator function **1**(·) equals 1 if its condition is met and 0 otherwise, so that only values exceeding the set point contribute to the sum.

$$K_{wf} = \sum_{j=1}^{m} \left( wf_{\text{act},j} - wf_{\text{set}} \right) \cdot \mathbf{1}\left( wf_{\text{act},j} > wf_{\text{set}} \right) \tag{1}$$

Averages of the weather factors and their key figures were calculated for the total cultivation period and for the maturation phase.
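The key figure of Equation 1 can be sketched in a few lines, assuming that the indicator function gates the whole difference so that only exceedances above the set point are summed. The function name and the example temperatures are hypothetical and are not measurements from this study.

```python
def key_figure(hourly_values, set_point):
    """Sum of exceedances above set_point over all hourly readings (Equation 1)."""
    return sum(v - set_point for v in hourly_values if v > set_point)


# Hypothetical hourly greenhouse temperatures in degrees Celsius.
temperatures = [24.0, 27.5, 29.0, 31.5, 28.0, 26.0]
print(key_figure(temperatures, 28.0))  # 4.5
```

Restricting the input list to the hours of one growth phase gives the corresponding phase-specific key figure.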

The sample Pearson correlation coefficient *rXY* (Equation 2) was used to describe the correlation between two variables, where {(*x*1, *y*1), (*x*2, *y*2), …, (*x*n, *y*n)} are *n* given data pairs, and *x̄* and *ȳ* are the sample averages.

$$r_{XY} = \frac{\sum_{i=1}^{n} \left(x_i - \overline{x}\right) \left(y_i - \overline{y}\right)}{\sqrt{\sum_{i=1}^{n} \left(x_i - \overline{x}\right)^2} \sqrt{\sum_{i=1}^{n} \left(y_i - \overline{y}\right)^2}}, \quad \overline{x} = \frac{1}{n} \sum_{i=1}^{n} x_i, \quad \overline{y} = \frac{1}{n} \sum_{i=1}^{n} y_i \tag{2}$$

The significance of the null-hypothesis ρ*XY* = 0 was tested based on variable *t*, which is characterized by Student's *t*-distribution with *n* − 2 degrees of freedom (Equation 3).

$$t = \frac{r_{XY} \cdot \sqrt{n - 2}}{\sqrt{1 - r_{XY}^2}} \tag{3}$$

Thus, Student's *t*-distribution provides the probability (*p*-value) to observe a value for *t* that is at least as large as a critical *t* value for a specific significance level α (here, α = 0.05). This is equivalent to the probability of finding a correlation coefficient *rXY* at least as large as that used for the underlying calculation of *t* given that the true correlation is zero. We considered *p*-values <5% as interesting and *p*-values <1% as significant.
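Equations 2 and 3 can be sketched as follows. The function names and the example data are hypothetical; converting *t* into a *p*-value additionally requires the cumulative distribution function of Student's *t*-distribution with *n* − 2 degrees of freedom, which is assumed to come from a statistics package and is not shown.

```python
import math


def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient (Equation 2)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """Test statistic for the null hypothesis of zero correlation (Equation 3).

    Undefined for |r| = 1, where the denominator becomes zero.
    """
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


# Hypothetical plant ages (days) and accumulation levels (arbitrary units).
plant_age = [36, 43, 50, 57, 63]
accumulation = [10.0, 14.0, 15.0, 21.0, 24.0]
r = pearson_r(plant_age, accumulation)
print(r, t_statistic(r, len(plant_age)))
```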

Partial correlations between two variables *X* and *Y* without the effect of a third variable *U* were found by first establishing a linear regression between *X* or *Y* and *U* and then calculating the correlation between the residuals (Equation 4).

$$r_{XY \cdot U} = \frac{r_{XY} - r_{XU} \cdot r_{YU}}{\sqrt{1 - r_{XU}^2} \sqrt{1 - r_{YU}^2}} \tag{4}$$

We calculated the significance of the partial correlation using the test statistic *t\** (Equation 5), which is also characterized by Student's *t*-distribution, but this time with n − 3 degrees of freedom.

$$t^{*} = \frac{r_{XY \cdot U} \cdot \sqrt{n - 3}}{\sqrt{1 - r_{XY \cdot U}^2}} \tag{5}$$
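A minimal sketch of Equations 4 and 5 is given below, working directly from the three pairwise correlation coefficients. The function names and the input values are hypothetical and chosen only for illustration.

```python
import math


def partial_correlation(r_xy, r_xu, r_yu):
    """Correlation of X and Y with the linear effect of U removed (Equation 4)."""
    return (r_xy - r_xu * r_yu) / (
        math.sqrt(1 - r_xu ** 2) * math.sqrt(1 - r_yu ** 2)
    )


def partial_t_statistic(r_partial, n):
    """Test statistic with n - 3 degrees of freedom (Equation 5)."""
    return r_partial * math.sqrt(n - 3) / math.sqrt(1 - r_partial ** 2)


# Hypothetical pairwise correlations from n = 12 observations.
r_partial = partial_correlation(0.8, 0.6, 0.5)
print(r_partial, partial_t_statistic(r_partial, 12))
```

If *U* is uncorrelated with both *X* and *Y*, the partial correlation reduces to the ordinary correlation coefficient, which provides a quick sanity check.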

A multiple linear model was used to describe dependent variable *Y* by *q* independent variables *X*1, *X*2, …, *X*q by means of a linear function with disturbance ϵ (Equation 6).

$$Y = \beta_0 + \beta_1 X_1 + \dots + \beta_q X_q + \epsilon \tag{6}$$

The model coefficients were estimated by minimizing the sum of squared differences between the predicted and observed values (ordinary least squares, Equation 7) with {(*x*11, …, *x*q1, *y*1), (*x*12, …, *x*q2, *y*2), …,(*x*1n, …, *x*qn, *y*n)} *n* data-tuples.

$$\sum_{i=1}^{n} \left( y_i - \left( \beta_0 + \beta_1 x_{1i} + \dots + \beta_q x_{qi} \right) \right)^2 \to \min_{\beta_0, \beta_1, \dots, \beta_q} \tag{7}$$

The coefficient of determination and the adjusted coefficient of determination for the (multiple) linear regression models were calculated using Equations 8 and 9, respectively.

$$R^2 = 1 - \frac{\sum\_{i=1}^{n} \left(\boldsymbol{\wp}\_i - \boldsymbol{\hat{\wp}}\_i\right)^2}{\sum\_{i=1}^{n} \left(\boldsymbol{\wp}\_i - \overline{\boldsymbol{\wp}}\right)},\tag{8}$$

with $y\_1, \dots, y\_n$ observed and $\hat{y}\_1, \dots, \hat{y}\_n$ predicted values, and $\overline{y} = \frac{1}{n} \sum\_{i=1}^{n} y\_i$.

$$\text{adj. } R^2 = 1 - \left(1 - R^2\right) \times \frac{n-1}{n-q-1}\tag{9}$$
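The two goodness-of-fit measures can be computed as in the following minimal sketch. It uses the standard definition of the adjusted coefficient of determination, adj. R² = 1 − (1 − R²)(n − 1)/(n − q − 1), with *n* data points and *q* independent variables. The function names are hypothetical.

```python
def r_squared(y, y_hat):
    """Coefficient of determination from observed and predicted values."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1 - ss_res / ss_tot

def adjusted_r_squared(r2, n, q):
    """Adjusted R^2 for n data points and q independent variables."""
    return 1 - (1 - r2) * (n - 1) / (n - q - 1)

# Illustrative values only (not data from this study)
r2_demo = r_squared([1, 2, 3, 4], [1.1, 1.9, 3.2, 3.8])
adj_demo = adjusted_r_squared(r2_demo, n=4, q=1)
```

The adjustment penalizes additional independent variables, so the adjusted value never exceeds R² and the gap widens when *n* is small.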

The slopes of two linear regression functions were compared using the statistic *t*\*\* (two-sided test, α = 0.05) (Equation 10), where *b*1 and *b*2 are the function slopes, s.e.(*b*1 − *b*2) is the standard error of the difference between the slopes, *S*xx is the sum of squared differences between the independent variable and its mean value, SSE is the sum of squared errors, and *s*² is the pooled estimator of variance.

$$t^{\*\*} = \frac{b\_1 - b\_2}{s.e.(b\_1 - b\_2)},\quad s.e.(b\_1 - b\_2) = \sqrt{s^2 \times \left[\frac{1}{S\_{xx,1}} + \frac{1}{S\_{xx,2}}\right]},$$

$$s^2 = \frac{\text{SSE}\_1 + \text{SSE}\_2}{n\_1 + n\_2 - 4},\quad \text{SSE} = \sum\_{i=1}^n \left(y\_i - \hat{y}\_i\right)^2,\quad S\_{xx} = \sum\_{i=1}^n \left(x\_i - \overline{x}\right)^2 \tag{10}$$
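A minimal Python sketch of the slope comparison in Equation 10 is given below. Each data set is fitted by simple linear regression and the pooled variance uses n1 + n2 − 4 degrees of freedom. The function names and the example data are hypothetical.

```python
import math

def fit_line(x, y):
    """Simple linear regression; returns slope, Sxx, SSE and n."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return slope, sxx, sse, n

def slope_comparison(x1, y1, x2, y2):
    """Returns the statistic t** and its degrees of freedom."""
    b1, sxx1, sse1, n1 = fit_line(x1, y1)
    b2, sxx2, sse2, n2 = fit_line(x2, y2)
    df = n1 + n2 - 4
    s2 = (sse1 + sse2) / df                      # pooled variance estimator
    se = math.sqrt(s2 * (1 / sxx1 + 1 / sxx2))   # s.e.(b1 - b2)
    return (b1 - b2) / se, df

# Illustrative data only (not data from this study)
days = [0, 1, 2, 3]
t_demo, df_demo = slope_comparison(days, [0.1, 1.9, 4.1, 5.9],
                                   days, [0.1, 0.9, 2.1, 2.9])
```

The resulting value is compared with the two-sided critical value of Student's *t*-distribution for the returned degrees of freedom.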

The relative yield of an individual harvest from one batch was calculated using Equation 11, where *ry*i is the relative harvest yield with *i* denoting the harvest time (1 = first harvest, 2 = second harvest, 3 = third [final] harvest), *y*i is the DFE yield of the harvest in mg kg−1 biomass, *n* is the number of harvests, and *ȳ* is the average yield of all harvests of one batch.

$$ry\_i = \frac{y\_i}{\frac{1}{n}\sum\_{j=1}^{n} y\_j} = \frac{y\_i}{\overline{y}} \tag{11}$$
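As a short worked example of the relative yield, the following sketch normalizes the harvest yields of one batch to the batch average. The function name and the input values are hypothetical and serve only as illustration.

```python
def relative_yields(harvest_yields):
    """Divide each harvest yield by the mean yield of the batch."""
    mean_yield = sum(harvest_yields) / len(harvest_yields)
    return [y / mean_yield for y in harvest_yields]

# Illustrative yields in mg per kg biomass for three harvests of one batch
ry_demo = relative_yields([5.0, 10.0, 15.0])
```

By construction, the relative yields of one batch average to 1, which makes harvests comparable across batches with different absolute yields.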

#### RESULTS AND DISCUSSION

#### Greenhouse Climate Control Can Be Insufficient to Maintain Homogeneous Plant Growth During Changing Seasons

Transgenic plants expressing the 28.4-kDa recombinant protein DFE (**Figure 1A**) were cultivated in an initial set of six batches over a period of 12 months covering all seasons of the year. Between April and September (hereafter termed "summer"), intense illuminance and high temperatures externally (average outdoor temperature 15.4°C) resulted in average values of 25.0°C and 8.4 klx inside the greenhouse, compared to 22.7°C and 5.9 klx between October and March (hereafter termed "winter," average outdoor temperature 6.1°C) (**Figure 1B**). Within the temperature control range of 20 ± 2°C during the night and 25 ± 3°C during the 16-h photoperiod, the temperature correlated with the illuminance (**Figures 1C**, **D** and **S1A**; adj. R2 = 0.89). There were days during the summer when climate control was insufficient to maintain the temperature within the specification limits (**Figures 1B**, **D**). These out-of-specification temperatures of more than 28°C were associated with an illuminance of >45 klx in 72% of instances (**Figures S1B, C**), indicating that intense sunlight resulted in a greenhouse effect, increasing temperatures in the cultivation area beyond the capabilities of the climate control system.

Plant growth was accelerated under the warm summer conditions reaching a threshold height of 500 mm as early as 40 dps, compared to 47 dps for batches cultivated during the winter (**Figure 1E**). We did not observe any statistically significant differences in growth between the transgenic plants expressing DFE and corresponding *N. tabacum* wild-type controls based on a slope comparison for plant height development [α = 0.05; n = 24 (winter), or n = 15 (summer)]. The biomass yield was positively correlated with plant age (**Figure 1F**), and the slope of this correlation was significantly higher for summer compared to winter batches according to a slope comparison. Our results agreed well with previous reports claiming that the optimal growth temperature for tobacco is in the range 18.5–28.5°C (Parups and Nielsen, 1960; Yamori et al., 2010; Yang et al., 2018).

#### Product Recovery by IMAC Purification Is Reduced in Summer Batches

We did not observe any correlation between biomass and recombinant protein yields, but we found that the recovery and yield of intact full-length DFE after IMAC purification (measured by detecting the presence of the C-terminal epitope and His6-tag) varied substantially—for example, falling within the range 3.9–30.0 mg kg−1 in the course of 1 year (**Figures 2** and **S1B–D**). In contrast, the yield of a monoclonal antibody expressed in transgenic tobacco plants was previously shown to increase by 25–150% during the summer season (Sack et al., 2015). Interestingly, the specific recovery during the IMAC capture step decreased from ~35% for winter batches to <10% for summer batches (**Figure S2G**), and we observed substantial fluorescence in the flow-through fraction (**Figure S2E**). One possible explanation was that the DFE conformation was altered in the summer batches such that the His6 tag was no longer accessible by the IMAC resin, as reported for other proteins such as erythropoietin (Debeljak et al., 2006). But given that we were unable to detect either the C-terminal 2F5 epitope or His6 tag in the IMAC flow-through fraction by western blot analysis even when the samples were denatured (which would expose any linear epitopes hidden by conformational changes), proteolytic degradation potentially triggered by illumination and/or heat stress appeared a more likely explanation (Jutras et al., 2018). Also, compared to the short but intense heating during blanching which causes permanent protease inactivation (Menzel et al., 2018), the heat stress during cultivation was a moderate but long-term (several days) effect which can result in endogenous protease expression. We therefore screened the DFE sequence, especially in the linker region of the fusion protein, for known protease cleavage sites. 
Using PROSPER prediction software (Song et al., 2012), we found two cleavage sites for cysteine protease cathepsin K close to the C-terminus of the protein at positions 243 and 244 of the 267-amino-acid sequence (N-fragment size = 30.8 kDa) with scores of 1.20 and 1.06, respectively (scores > 0.8 are considered interesting). Proteases of this class have previously been associated with the degradation of recombinant proteins in plants (Niemer et al., 2016). Sites for other proteases were identified in the central region of the protein but cleavage would also abolish the fluorescence of the DsRed parent protein. The detection of fluorescence notwithstanding the loss of the C-terminal epitopes indicated that these sites were not cleaved in our plants.

FIGURE 2 | DFE recovery and yield over the changing seasons. (A) DFE recovery following IMAC purification. The recovery is defined as the ratio of DFE in the elution fraction (after purification) and the DFE amount in the load (before purification). (B) Overall DFE yield after purification per kilogram of fresh plant biomass. Labels in B (numbers) indicate the allocation to summer (red) or winter (blue) batches, and the black vertical dotted lines separate the first (left) and second (right) set of batches. Horizontal lines indicate the average batch recovery (A) and yield (B) calculated based on all harvests of one batch, whereas colored point-scatter plots correspond to the individual harvest-specific recovery (A) and yield (B) values in the second batch set. Vertical colored dotted lines mark the transitions between the growth phases in each batch (left line = germination to sprouting, middle line = sprouting to growth, and right line = growth to maturation) analyzed in Table 2. Light green areas mark the inside temperature in a greenhouse whereas gray columns correspond to the integrated illuminance.

#### Product Yield Is Negatively Correlated With Light Intensity Above 45 klx and Temperature Above 28°C

We then analyzed a second set of six batches based on weather [temperature in °C, illuminance in klx, daily precipitation (rain) in mm, outside wind velocity in m s−1, and relative humidity in %], cultivation (plant age and biomass) factors (**Figure S3**), as well as biochemical responses (protease activity, TSP, DFE accumulation, recovery, and yield) from plant samples harvested at three different time points for each batch (**Figures S4** and **S5**). In a first step, we treated the 18 data points (six batches with three harvests each) for each response as independent results, not grouping them by batch.

The protease assay was sensitive to most endoprotease types including serine, acidic, sulfhydryl, and metalloproteases, but we did not observe any correlation between their activity and DFE accumulation levels or yield (**Figure S5**, first column, fourth row). One potential explanation is that the assay sensitivity varies for different protease types, as stated in the manufacturer's notes, and therefore a change in the activity of one class of proteases might remain unnoticed. Plant age and biomass showed a moderate positive correlation with DFE yield (r = 0.51 and 0.43, respectively) (**Figure S4**, first column, fourth and fifth rows). There was no apparent correlation between internal air humidity and DFE yield, and only a moderate correlation with wind (**Table 1**). A substantial drop in product recovery and yield was observed for increasing temperature and illuminance.

Specifically, the DFE yield was on average 50% lower in the summer (6.9 ± 4.2 mg kg−1, n = 12) compared to winter batches (14.1 ± 6.6 mg kg−1, n = 6), which was a significant difference based on a two-sided two-sample *t*-test (n = 18, α = 0.05, p = 0.045; **Figure 2B**). Interestingly, rain and external humidity were positively correlated with DFE yield to a similar extent as the negative correlation with illuminance. Our interpretation is that these parameters describe the cultivation area shading as a common underlying phenomenon, for example due to clouds.
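The comparison of summer and winter yields can be retraced from the summary statistics with a short sketch. Two assumptions are made here that the text does not state: the ± values are taken as standard deviations, and the unequal-variance (Welch) form of the two-sample *t*-test is used. The function name is hypothetical.

```python
import math

def welch_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and approximate degrees of freedom."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Summer versus winter DFE yield (mg per kg), values as given in the text
t_sw, df_sw = welch_t_from_summary(6.9, 4.2, 12, 14.1, 6.6, 6)
```

Under these assumptions the sketch gives t ≈ −2.44 with about 7.1 degrees of freedom. The absolute value slightly exceeds the two-sided critical value of 2.365 for 7 degrees of freedom at α = 0.05, which would be in line with a *p*-value just below 0.05.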

Exploratory data analysis revealed that the DFE yield decreased in line with more frequent, longer, or stronger deviations of illuminance or temperature from the target climate setting. We therefore integrated the temperature above the control threshold of 28°C (Temp≥28) and the illuminance above a threshold of 45 klx (Ill≥45), which correlated with temperatures above the control threshold (**Figure 1C**) to derive an additional set of key figures. The integration considered either the entire cultivation or different phases, and we correlated the integrated weather factors with the DFE yield (**Table 2**). In all cases, high temperatures and intense light were correlated with a lower DFE yield, and this correlation was significant in some cases—for example, illuminance during germination and sprouting. The correlation of factor averages (normalized for the duration of a phase) was higher than the mere integral. The strongest correlations with DFE yield were found for the average internal temperature over the entire cultivation period (average total, r = −0.709, **Table 1**) and the integrated illuminance >45 klx during sprouting (Ill≥45,S, r = −0.716, **Table 2**). The latter implied that events early during cultivation can ultimately have a strong impact on the product yield and thus process performance. Others have identified anthesis as the most heat-sensitive phase in the plant life cycle (Zinn et al., 2010; Zhou et al., 2017; Opole et al., 2018), but this is of limited relevance in molecular farming because plants are typically harvested and processed before flower development (Ma et al., 2015; Sack et al., 2015).
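One way to derive such key figures from logged climate data is sketched below. The sketch rests on assumptions not stated in the text: readings are taken at a fixed sampling interval, the excess above the threshold (rather than the absolute value) is integrated with a rectangle rule, and the average is normalized to the full duration of the phase. All function names and readings are hypothetical.

```python
def integrated_excess(values, threshold, dt_hours):
    """Rectangle-rule integral of the excess above a threshold."""
    return sum(max(v - threshold, 0.0) for v in values) * dt_hours

def average_excess(values, threshold, dt_hours):
    """Integrated excess normalized for the duration of the phase."""
    duration = len(values) * dt_hours
    return integrated_excess(values, threshold, dt_hours) / duration

# Illustrative hourly illuminance readings in klx (not study data)
illuminance = [40.0, 50.0, 60.0, 45.0]
ill_int = integrated_excess(illuminance, threshold=45.0, dt_hours=1.0)
ill_avg = average_excess(illuminance, threshold=45.0, dt_hours=1.0)
```

The same functions apply to temperature with a threshold of 28°C. Normalizing by the phase duration makes phases of different length comparable.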


*<sup>a</sup>Daily time between 06:00:00 and 22:00:00; <sup>b</sup>daily time between 22:00:00 and 06:00:00.*

TABLE 2 | Correlation between integrated light intensity ≥45 klx or integrated temperature ≥28°C and DFE yield during different growth phases. Average values have been normalized for the duration of the interval between phases.


It was not possible to definitively link the DFE yield to either illuminance or temperature due to the high intercorrelation of the two environmental parameters (**Figure 3**). Controlled environments or modified greenhouse settings may help to resolve this collinearity between factors in the future. We extracted additional key figures from the weather factors, such as parameter extrema, number of days outside control ranges, and threshold and mean deviation from the specifications. However, no significant correlations remained after subtracting the effect of Ill≥45,S or Temp≥28,Av.tot. by calculating partial correlations between the weather key figures and DFE yield (**Table S1**).

#### Increasing Levels of TSP and Protease Activity Are Observed Under Warm Conditions

We also analyzed the key figures described above in the context of DFE accumulation (before purification), DFE recovery, TSP and the protease level, as well as correlations among these parameters (**Figure S5**). DFE recovery showed the same correlations as DFE yield (reaching a maximum of 0.48 kg kg−1), but DFE accumulation was not significantly correlated to any of the weather key figures. Instead, DFE accumulation (after blanching) was positively correlated with plant age and biomass, and positive correlations were observed between TSP and both illuminance and temperature, increasing the TSP in homogenates to 16 g kg−1 biomass. These observations were consistent with a previous study which reported higher protein expression in summer batches based on plant maturation up to a certain point (Sack et al., 2015). Others have reported the effects of light regimes (Goulet et al., 2019) and that short-term heat exposure (37°C) can boost transient protein expression (Norkunas et al., 2018).

Correlations (|r| > 0.70) were also observed between protease activity and internal relative humidity (negative correlation), as well as Temp≥28 and Ill≥45,Av.tot. (positive correlations) (**Table S2**), which is consistent with the reported heat-induced increase in plant protease activity (Bita and Gerats, 2013). These results suggested that, like the DFE yield, TSP was affected by the weather throughout cultivation but especially during sprouting, whereas protease activity was more sensitive to weather influences at the end of the cultivation and shortly before harvest. Interestingly, the amount of TSP varied without observable order and without correlation (r = −0.03) to DFE yield (**Figure S5**).

#### Weather Data Cannot Explain Intra-batch Differences in Yield

The biochemical responses including DFE yield varied not only between batches but also across the three harvest times within one batch that were spaced by 1 week at the end of each cultivation period (**Figure 2B**, **Table S3**, **Figure S2**). We therefore analyzed the responses, such as DFE yield, for each batch individually, focusing on the final cultivation stage. We calculated the average values for all of the weather factors on the last day, over the last 2 days, over the last week, and over the last 2 weeks before harvest. For each batch, the DFE accumulation before and after blanching increased with subsequent harvests with only one exception (**Figure 2B**). We also correlated the weather factors with the relative yield at each harvest time. We defined this relative yield as the quotient of the yield at a given harvest time and the average yield of all harvests of the associated batch in order to normalize the yields across the different batches (second set of six batches, Equation 11). In three batches, the yield increased with each successive harvest point. For two batches, the second harvest was the most prolific, whereas most DFE was obtained from the first harvest in the remaining batch, and the second and third harvests yielded nearly the same quantities of product. These data suggest that the yield may increase up to an optimal harvest time and then decline (**Figure S6**); a similar pattern was also observed for recombinant monoclonal antibody 2G12 produced in transgenic tobacco plants (Sack et al., 2015). However, none of the weather factors correlated with this behavior around the time of harvest. We concluded that the differences in DFE yield between harvest times were not only caused by abiotic weather factors but probably also by biotic changes such as the onset of anthesis or senescence. Accordingly, the variable DFE yield across different harvests probably reflected our limited ability to select a cultivation schedule that could compensate for seasonal effects on plant development in a greenhouse setting (**Figures 1E**, **F**) by adjusting the harvest time in the 37–63 dps range.

#### A Linear Model Can Predict Average Batch Yields

We were interested in the generalizability of our findings and built a set of regression models linking the weather (e.g., Ill≥45) and cultivation (e.g., biomass) factors to DFE yield. Because scatterplots between weather and cultivation factors and yield showed a linear dependency, we selected a linear model that we limited to a maximum of two independent variables due to the small number of data points. A multilinear model based on Ill≥45 and plant age performed best (**Table 3**). The low values of the determination coefficients (<0.60) reflected the substantial scattering between different harvest times within batches, as discussed above.

TABLE 3 | Multilinear regression models for DFE yield trained on all data or batch averages of the second set of six batches.

We therefore calculated the average DFE yields for each batch and found a strong correlation between the weather factors and the average DFE yield per batch (for example, r = −0.96 for Ill≥45,S). As expected, the corresponding *p*-values increased (reduced significance) compared to the model with individual harvests due to the lower number of data points (6 instead of 18) (**Table 1**). When the regression models were updated using the batch average DFE yield, they gave notably higher R2 and adjusted R2 values (**Table 3**). We used the best model trained with the data from the second set of six batches to calculate the DFE yield for the first set of six batches, which were not included in model training, but this resulted in a poor prediction (R2 = 0.11). However, the correlation between the predicted and actual values of DFE yield was high (r = 0.87). The average yield of the first six batches was 1.69-fold higher than the average of the second set of six batches (**Figure S1D**). Using this value as a correction factor, we obtained a substantially higher coefficient of determination (R2 = 0.65) (**Figure 3C**). We assume that the offset between the two sets of batches was a sample treatment artifact because we froze the plants in the second set of batches allowing us to process them at the same time, whereas in the first set of batches, the plants were processed without freezing. Therefore, a simple linear model based on Ill≥45 and plant age can facilitate an *a priori* prediction of DFE yield from transgenic tobacco in a greenhouse setting.
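A high correlation combined with a low coefficient of determination is typical of a systematic offset between predicted and observed values, and the following sketch illustrates how a multiplicative correction factor can resolve it. The numbers are hypothetical and chosen for illustration only, with the exception of the factor 1.69 taken from the text. The function names are likewise hypothetical.

```python
import math

def r2_of_prediction(observed, predicted):
    """R^2 of external predictions: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot

def pearson_r(x, y):
    """Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

factor = 1.69                                  # ratio of set averages (from text)
predicted = [4.0, 8.0, 12.0, 16.0]             # hypothetical model output
observed = [factor * p for p in predicted]     # hypothetical, offset by the factor

r_demo = pearson_r(predicted, observed)
r2_raw = r2_of_prediction(observed, predicted)
r2_corrected = r2_of_prediction(observed, [factor * p for p in predicted])
```

In this idealized case the correlation is perfect while the uncorrected R² is close to zero, and the correction restores R² to 1. Real data contain scatter in addition to the offset, so the corrected R² stays below 1.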

#### CONCLUSION

We observed a substantial effect of seasonal weather changes on the yield of the fusion protein DFE (based on the detection of an intact C-terminus) whereas its accumulation (based on fluorescence) was not affected by the greenhouse climate. Hence, care should be taken when assessing the suitability of growth conditions for the production of recombinant proteins in plants. In the future, controlled cultivation environments such as vertical farms (Wirz et al., 2012; Holtz et al., 2015) may help to reduce such seasonal effects and ensure consistent yields across batches. We found that high illuminance and/or high temperature, especially during the sprouting phase, reduced the yield of DFE, and this could also apply to other recombinant proteins. It was possible to predict the yield based on a simple model using illuminance and plant age. However, this model was not sufficient to calculate the effect of the harvest time on product yield and should thus be augmented to improve its predictive power—for example, by including additional factors that can describe the physiological development status of the plants.

#### DATA AVAILABILITY STATEMENT

The datasets generated for this study are available on request to the corresponding author.

#### AUTHOR CONTRIBUTIONS

MK and CR conducted the experiments and collected the data. JE analyzed the data and wrote the manuscript. JB devised the experiments, conducted the data analysis and wrote the manuscript.

#### FUNDING

This work was funded in part by the Fraunhofer-Gesellschaft Internal Programs under Grant No. Attract 125-600164 and the state of North-Rhine-Westphalia under the Leistungszentrum grant no. 423 "Networked, adaptive production." This work was supported by the Deutsche Forschungsgemeinschaft (DFG) in the framework of the Research Training Group "Tumor-targeted Drug Delivery" grant 331065168 and the European Research Council Advanced Grant "Future-Pharma," proposal number 269110.

#### ACKNOWLEDGMENTS

The authors acknowledge Ibrahim Al Amedi for cultivating the plants used in this investigation and Dr. Thomas Rademacher for providing the pTRA vector. We are grateful to Markus Sack for fruitful discussions on the DFE ligand structure. The authors have no conflict of interest to declare.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2019.01245/full#supplementary-material

TABLE S1 | Correlation between climate factors during plant growth and DFE yield (as in Table 1) corrected for the influence of illuminance and temperature.

TABLE S2 | Correlation coefficients between protease activity and weather key figures.

TABLE S3 | Average DFE yields and standard deviations for the second set of six batches.

FIGURE S1 | Climate and batch accumulation data for DFE. (A) Representative 10-day period during the cultivation of transgenic tobacco in a greenhouse setting (September 2016) illustrating the collinear course of illuminance and temperature. (B) Temperatures ≥28°C (z-axis) plotted against batch duration (x-axis) and DFE yield at the final harvest (y-axis). (C) Plot as in B but temperature replaced with illuminance ≥45 klx. (D) Box-plot of DFE yields in the first (2016) and second (2017) set of batches. Small open boxes indicate the set average. Boxes indicate the 25th and 75th percentiles, and whiskers mark the 5th and 95th percentiles.

FIGURE S2 | Temperature (green) and illuminance (gray) curves recorded during the course of this study overlain with batch durations and dependent biochemical and product parameter results. The recovery is defined as the ratio of DFE in the elution fraction (after purification) and the DFE amount in the load (before purification). Horizontal lines indicate the average parameter value for that batch calculated based on all harvests of one batch, whereas colored point-scatter plots correspond to the individual harvest-specific values in the second batch set. Vertical colored dotted lines mark the transitions between the growth phases in each batch (left line = germination to sprouting, middle line = sprouting to growth, and right line = growth to maturation).

FIGURE S3 | Correlations and cross-correlations between independent cultivation parameters observed for a second set of six batches (2.1–2.6). Three harvests spaced 1 week apart were conducted per batch resulting in a total of 18 data points (dots). Dots are colored according to the DFE yield after purification. Lines represent linear regression models for the parameters in the corresponding row and column and are colored according to their *p*-value: green = *p* < 0.01, orange = 0.01 < *p* < 0.05 and gray = *p* ≥ 0.05. Histograms in the diagonal of panels represent the distribution of the parameter defined by the corresponding row/column.

FIGURE S4 | Correlation between selected independent cultivation parameters and dependent biochemical and product parameters observed for a second set of six batches (2.1–2.6). Three harvests spaced 1 week apart were conducted per batch resulting in a total of 18 data points (dots). Dots are colored according to the DFE yield after purification. Lines represent linear regression models for the parameters in the corresponding row and column and are colored according to their *p*-value: green = *p* < 0.01, orange = 0.01 < *p* < 0.05 and gray = *p* ≥ 0.05.

FIGURE S5 | Correlation between dependent biochemical and product parameters observed for a second set of six batches (2.1–2.6). Three harvests spaced 1 week apart were conducted per batch resulting in a total of 18 data points (dots). Dots are colored according to the DFE yield after purification. Lines represent linear regression models for the parameters in the corresponding row and column and are colored according to their *p*-value: green = *p* < 0.01, orange = 0.01 < *p* < 0.05 and gray = *p* ≥ 0.05. Histograms in the diagonal of panels represent the distribution of the parameter defined by the corresponding row/column.

FIGURE S6 | Relative yield as a function of harvest time. (A) Relative yield (ry) of DFE calculated using Equation 11 for each harvest of the second set of plant batches. (B) Relative yield of DFE with the harvest time adjusted so that the maximum yield was at time zero. Interestingly, no "U"-shaped sequence of yields was observed. The plot may be used to identify optimal cultivation times depending on the season and weather conditions. For example, longer cultivation may result in higher DFE yields for batch 2.2 (> 55 dps), 2.4 (> 52 dps) and 2.5 (> 63 dps), whereas an optimal harvest was identified for 2.1 (50 dps) and 2.5 (56 dps).

#### REFERENCES


assimilation rate in tobacco leaves. *Plant Cell Environ.* 33, 332–343. doi: 10.1111/j.1365-3040.2009.02067.x


**Conflict of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Knödler, Rühl, Emonts and Buyel. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Plant-Made Bet v 1 for Molecular Diagnosis

*Mattia Santoni1, Maria Antonietta Ciardiello2, Roberta Zampieri1, Mario Pezzotti1, Ivana Giangrieco2,3, Chiara Rafaiani4, Michela Ciancamerla4, Adriano Mari3,4\* and Linda Avesani1\**

*1 Department of Biotechnology, University of Verona, Verona, Italy, 2 Institute of Bioscience and BioResources, CNR, Naples, Italy, 3 ADL (Allergy Data Laboratories) S.r.l., Latina, Italy, 4 Associated Centre for Molecular Allergology, Rome, Italy*

Allergic disease diagnosis is currently experiencing a breakthrough due to the use of allergenic molecules in serum-based assays rather than allergen extracts in skin tests. The former methodology is considered a very innovative technology compared with the latter, since it is characterized by flexibility and adaptability to the patient's clinical history and to microtechnology, allowing multiplex analysis. Molecular-based analysis requires pure allergens to detect IgE sensitization, and a major goal, to keep the diagnosis cost-effective, is to limit their production costs. In addition, for the production of recombinant eukaryotic proteins similar to natural ones, plant-based protein production is preferred to bacterial-based systems due to its ability to perform most of the post-translational modifications of eukaryotic molecules. In this framework, Plant Molecular Farming (PMF) may be useful, being a production platform able to produce complex recombinant proteins in short time-frames at low cost. As a proof of concept, PMF has been exploited for the production of Bet v 1a, a major allergen associated with birch (*Betula verrucosa*) pollen allergy. Bet v 1a has been produced using two different transient expression systems in *Nicotiana benthamiana* plants, purified and used in a new-generation multiplex allergy diagnosis system, the patient-Friendly Allergen nano-BEad Array (FABER). Plant-made Bet v 1a is immunoreactive, binding IgE and inhibiting IgE-binding to the *Escherichia coli*-expressed allergen currently available in the FABER test, thus suggesting an overall similar though non-overlapping immune activity compared with the *E. coli*-expressed form.

#### Keywords: allergen, molecular farming, transient expression, IgE, structure homology

# INTRODUCTION

Allergic diseases, defined as abnormal responses of the human body when in contact with an allergen, have become a common health problem worldwide, with a rising incidence both in adults and children (Lau et al., 2018). It is rather difficult to obtain exact epidemiology data on this topic but it has been estimated that 25% of the general population suffer from these diseases (Sampson, 2004).

The primary diagnostic tool currently used for allergic disease diagnosis is skin prick testing (SPT), which evaluates the presence and degree of cutaneous reactivity against a surrogate marker of sensitization, composed of protein extracts from allergenic sources. However, this method is hampered by several problems. For instance, it cannot be used in patients who have extensive eczema, dermographism or urticaria or who are taking antihistamines and/or other medications.

#### *Edited by:*

*Jussi Joonas Joensuu, VTT Technical Research Centre of Finland Ltd, Finland*

#### *Reviewed by:*

*Mario Sergio Palma, Paulista State University Júlio de Mesquita Filho Rio Claro, Brazil; Anna Gieras, University Medical Center Hamburg-Eppendorf, Germany*

#### *\*Correspondence:*

*Adriano Mari, adriano.mari@caam-allergy.com; Linda Avesani, linda.avesani@univr.it*

#### *Specialty section:*

*This article was submitted to Plant Metabolism and Chemodiversity, a section of the journal Frontiers in Plant Science*

*Received: 04 February 2019; Accepted: 12 September 2019; Published: 10 October 2019*

#### *Citation:*

*Santoni M, Ciardiello MA, Zampieri R, Pezzotti M, Giangrieco I, Rafaiani C, Ciancamerla M, Mari A and Avesani L (2019) Plant-Made Bet v 1 for Molecular Diagnosis. Front. Plant Sci. 10:1273. doi: 10.3389/fpls.2019.01273*


More importantly, it frequently fails to detect specific IgE because the composition of the surrogate markers can be very variable (Giangrieco et al., 2012). This high variability makes it impossible to have standardized extracts with a constant allergen composition, since this may be influenced by several factors, such as differences among plant cultivars, post-harvest treatments, specific phenological stages, and extraction protocols (Ciardiello et al., 2009; Pasquariello et al., 2012). Molecule-based diagnosis may overcome all these drawbacks, given the possibility of detecting the specific IgE against individual allergenic molecules (Giangrieco et al., 2012; Ciardiello et al., 2013).

Molecular-based diagnosis, furthermore, may be used to predict biopharmaceuticals with a potential use in therapy for tolerance induction to allergens in a drug-companion diagnostic strategy.

The allergens used for molecular diagnosis may be extracted from the original sources or may be produced recombinantly in different platforms. A plant-based platform, in this framework, represents both a cost-effective and speedy system for producing allergens and, in particular, for the expression of plant allergens. In fact, it provides an environment useful to obtain features, such as post-translational modifications and protein folding similar to those present in the molecules of the authentic source.

In this study, we have transiently transformed *Nicotiana benthamiana* plants for the production of one of the major allergens associated with birch pollen allergy, Bet v 1a (UniProtKB accession number P15494) (Radauer et al., 2008). Bet v 1a is a 17-kDa protein which shares epitopes with the major pollen allergens of trees belonging to the Fagales order and with some plant-derived foods (Niederberger et al., 1998). Bet v 1 represents a target for IgE antibodies of more than 95% of patients allergic to birch pollen, and almost 60% of them are exclusively sensitized to Bet v 1 (Jarolim et al., 1989).

Here we report the setup of a plant-based system for allergen production that was tested with the expression of recombinant Bet v 1a. For this purpose, two transient expression systems that differ in yield and in the timeframe required for upstream processing were used, and the results were compared. In addition, the features of the recombinant product were characterized. In particular, the folding of the plant-made Bet v 1a (pBet v 1a) was investigated by circular dichroism measurements, whereas its immunological reactivity (IgE binding) was analyzed with the FABER multiplex system by direct testing and by IgE binding inhibition experiments.

#### MATERIALS AND METHODS

#### Vectors and Plant Transformation

The DNA sequence encoding the allergen Bet v 1a was designed with the following modifications: the codon usage was optimized for *N. benthamiana* and a poly-Histidine tag, a Flag-tag, and a linker (GPGP) were added at the N-terminus. The synthetic gene (Invitrogen GeneArt Gene Synthesis) was then inserted into the pENTR™/D-TOPO vector, following the manufacturer's instructions, and sequenced to confirm the absence of errors. The resulting vector was recombined using the Gateway™ LR Clonase™ II Enzyme mix (ThermoFisher) into the two destination vectors pK7WG2 (Karimi et al., 2002) and pGPVXGATEWAY(A) (Avesani et al., 2007).

The two resulting vectors, pK7WG2.Betv1 and pGPVXGATEWAY(A).Betv1, were introduced by electroporation into the *Agrobacterium tumefaciens* strains EHA105 and GV3101, respectively.

#### pBet v 1a Transient Expression in *N. benthamiana*

*Nicotiana benthamiana* plants were grown from seeds and cultivated in a growth chamber at 25°C with a light/dark cycle of 16 h/8 h and a relative humidity of 20% to 40%.

*A. tumefaciens* cells, both EHA105 and GV3101, carrying pK7WG2.Betv1 and pGPVXGATEWAY(A).Betv1, respectively, were seeded into lysogeny broth (LB) medium containing 50 µg/ml of rifampicin, 300 µg/ml of streptomycin, and 100 µg/ml of spectinomycin for pK7WG2.Betv1, or 50 µg/ml of rifampicin, 50 µg/ml of kanamycin, and 5 µg/ml of tetracycline for pGPVXGATEWAY(A).Betv1. Empty vectors were used as negative controls. For syringe agroinfiltration, performed as described in Gecchele et al. (2015), overnight bacterial cultures were collected by centrifugation at 4500g and re-suspended in the infiltration buffer (10 mM MES pH 5.5, 10 mM MgSO4, and 100 µM acetosyringone) at an optical density of 0.8 at 600 nm. Following a 3-h incubation, the culture was used for the syringe infiltration of 4- to 5-week-old *N. benthamiana* plants.

After the infiltration, the plants carrying the pK7WG2.Betv1 vector were sampled from the third day post-inoculation (dpi) to the 14th dpi; the plants infiltrated with pGPVXGATEWAY(A).Betv1 were harvested after the appearance of symptoms, between 10 and 14 dpi.
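The re-suspension step above can be illustrated with a minimal sketch. It assumes that OD600 scales linearly with cell density and that the whole pellet is recovered after centrifugation; the function name and the example culture values are hypothetical and are not taken from the study.

```python
def resuspension_volume_ml(culture_volume_ml, culture_od600, target_od600=0.8):
    """Buffer volume (ml) that brings a pelleted culture to the target OD600.

    Assumption: OD600 is proportional to cell density and no cells are lost
    during centrifugation, so volume * OD is conserved.
    """
    if culture_volume_ml <= 0 or culture_od600 <= 0 or target_od600 <= 0:
        raise ValueError("volume and OD600 values must be positive")
    return culture_volume_ml * culture_od600 / target_od600


if __name__ == "__main__":
    # Hypothetical overnight culture: 50 ml measured at OD600 = 2.4
    volume = resuspension_volume_ml(50, 2.4)
    print(f"Re-suspend the pellet in {volume:.1f} ml of infiltration buffer")
```

In practice, the OD600 of the re-suspended culture would still be measured and adjusted, since dense cultures fall outside the linear range of most spectrophotometers.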

#### pBet v 1a Detection

Total soluble proteins (TSP) were extracted from the leaves by grinding the tissue sample to a fine powder under liquid nitrogen. The powder was re-suspended in three volumes of extraction buffer (1× phosphate-buffered saline [PBS], 0.1% Tween-20) supplemented with cOmplete™ EDTA-free protease inhibitor (COEDTAF-RO).

The homogenate was centrifuged at 30,000g for 20 min at 4°C. The protein concentration was determined using the Bradford reagent (Sigma B6916).

The presence of pBet v 1a in the homogenate was detected by Western blot analysis. Briefly, equal quantities of TSP were loaded onto a 14% reducing SDS-PAGE gel. After the electrophoretic separation, the proteins were transferred onto a nitrocellulose membrane by electroblotting and incubated with anti-polyhistidine and anti-FLAG® antibodies, diluted 1:5000 and 1:1000, respectively. The protein band recognized by the antibodies was detected using the ECL™ Select Western Blotting Detection Reagent (Amersham), and the chemiluminescent signal was captured with the Chemidoc™ (BioRad).

#### pBet v 1a Immobilized Metal Affinity Chromatography (IMAC) Purification

The purification of the allergen was carried out as described in Bortesi et al. (2009). Briefly, 10 to 30 g of leaf tissue were homogenized in four volumes of buffer (1× PBS, 10 mM ascorbic acid, 0.1% Tween-20, pH 6.0). To remove the insoluble material, the homogenate was centrifuged at 15,000g, at 4°C for 15 min. Next, the supernatant was collected, passed through filter paper, and adjusted to 500 mM NaCl, 5 mM imidazole, pH 8. After incubation with gentle shaking for 1 h on ice, a last centrifugation step at 4°C for 30 min at 30,000g produced a clear supernatant. The supernatant was loaded by gravity flow onto a disposable column packed with 1 ml Ni-NTA resin (QIAGEN). To remove non-specifically bound material, the column was washed with a PBS solution containing 500 mM NaCl, 0.1% Tween-20, 10 mM imidazole, pH 8. pBet v 1a was eluted with an increasing concentration of imidazole, ranging from 10 to 200 mM, using a gradient maker. The purity of the allergen preparation was analyzed by loading aliquots of the eluted fractions on reducing SDS-PAGE, followed by silver staining (Mortz et al., 2001). The pure fractions were pooled and dialyzed against 1× PBS.

#### Size Exclusion Chromatography

The purified Bet v 1a was subjected to size exclusion chromatography using a fast protein liquid chromatography (FPLC) system, model AKTA pure 25L- Gold seal (GE Healthcare Europe GmbH, Milan, Italy). The purified protein was loaded on a gel filtration column Superdex 75 HR10/30 (Amersham Biosciences, Uppsala, Sweden), equilibrated, and eluted with 10 mM Tris-HCl, pH 7.5, 0.25 M NaCl. Fractions of 0.5 ml were collected, and the absorbance at 280 nm was recorded.

#### RP-HPLC Chromatography

Aliquots of the protein fraction collected from the size exclusion chromatography were subjected to RP-HPLC separation. The protein was loaded on a Vydac (Deerfield, IL, USA) C8 column (4.6 × 250 mm), using a Beckman System Gold apparatus (Fullerton, CA, USA). Elution was performed by a multistep linear gradient of eluent B (0.08% TFA in acetonitrile) in eluent A (0.1% TFA) at a flow rate of 1 ml/min. The eluate was monitored at 220 and 280 nm.

#### Estimation of the Pure Protein Concentration

The concentration of the pure protein obtained from size exclusion chromatography was estimated on the basis of the molar extinction coefficient at 280 nm (11,920 M−1 cm−1), calculated for pBet v 1a (178 residues) using the ProtParam tool on the ExPASy server (www.expasy.org).
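The estimate described above follows the Beer-Lambert law, A = ε · c · l. The sketch below shows the arithmetic, using the extinction coefficient given in this section and the molecular weight of pBet v 1a reported in the Results (19,696 Da). The function name, the 1-cm default path length, and the example absorbance are illustrative assumptions.

```python
EXT_COEFF_280 = 11920.0  # M^-1 cm^-1, ProtParam value quoted in the text
MOLECULAR_WEIGHT = 19696.0  # Da (g/mol), value reported for pBet v 1a


def concentration_from_a280(a280, path_cm=1.0,
                            ext_coeff=EXT_COEFF_280, mw=MOLECULAR_WEIGHT):
    """Return (molar concentration in mol/L, mass concentration in mg/ml).

    Beer-Lambert law: A = ext_coeff * c * path, so c = A / (ext_coeff * path).
    A concentration in g/L is numerically equal to mg/ml.
    """
    if path_cm <= 0 or ext_coeff <= 0:
        raise ValueError("path length and extinction coefficient must be positive")
    molar = a280 / (ext_coeff * path_cm)
    return molar, molar * mw


if __name__ == "__main__":
    # Hypothetical baseline-corrected reading in a 1-cm cuvette
    molar, mg_per_ml = concentration_from_a280(0.1192)
    print(f"{molar * 1e6:.2f} µM, {mg_per_ml:.3f} mg/ml")
```

The calculation assumes that the absorbance at 280 nm comes from the protein alone, which is why it is applied to the fraction purified by size exclusion chromatography rather than to crude extracts.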

#### Mass Spectrometry Experiments

The protein sample (5 µg) deriving from the RP-HPLC elution was dried with a centrifugal vacuum concentrator (Savant Speedvac Plus SC110A, Ramsey, Minnesota, USA) and solubilized in 0.1 M ammonium bicarbonate (AMBIC). Next, a standard protocol of reduction, alkylation, and digestion with trypsin was applied. Briefly, the protein sample was reduced with 1 mM dithiothreitol in 100 mM AMBIC and alkylated with 5.5 mM iodoacetamide (IAA) in 10% acetonitrile and 10 mM AMBIC. The protein was then desalted using ZipTip C18 tips (Millipore, Billerica, MA) and digested overnight (20 h) at 37°C at a trypsin/substrate ratio of 1:50. The obtained peptides were separated using an Ultimate 3000 instrument (LC Packings, Sunnyvale, CA) 2D-nano-HPLC online interfaced with a QSTAR-Elite Hewlett Packard, Series 1100 HPLC with UV detector. The peptide sequence identification was obtained using the Mascot Server (Matrix Science, London, UK) on the website www.matrixscience.com.

#### Circular Dichroism (CD) Experiments

CD spectra were recorded on a JASCO J-810 spectropolarimeter (Easton, MD) as already reported (Offermann et al., 2015). A quartz cell of 0.1-cm path length was used to record the spectra over the wavelength range of 260 to 195 nm with a bandwidth of 1.0 nm and a time constant of 8.0 s. For the CD experiments, the protein solution was diluted in the appropriate buffer and left for 2 h at 25°C before the acquisition of the spectra. The measurements were performed in PBS containing 0.1% Tween at 25°C. The protein concentration was 0.10 mg/ml. Each spectrum was baseline corrected for the contribution of the solvent.

For the thermostability experiments, purified pBet v 1a (0.1 mg/ml) was incubated for 5 min in PBS containing 0.1% Tween, at different temperatures. After each incubation, the CD spectrum was recorded as described above.

#### Specific IgE Detection by the FABER Testing System

FABER (ADL S.r.l., Latina, Italy) is a multiplex *in vitro* serological test that allows the detection of IgE antibodies produced by allergic subjects, specifically recognizing the allergens spotted on the biochip (Alessandri et al., 2017; Tuppo et al., 2018). The FABER version used to perform the present study (FABER 244-122-122) bears 244 allergenic preparations, representing 122 purified molecular allergens and 122 multiple protein allergenic extracts. The data were obtained using 304 sera and a set of biochips also containing Bet v 1-like allergens, namely, Act c 11 from gold kiwi, Api g 1 from celery, Ara h 8 from peanut, Cor a 1 from hazel pollen, and Mal d 1 from apple, and including pBet v 1a, spotted for experimental purposes. Before the immobilization on the FABER biochip, pBet v 1a was coupled on nanobeads following the same procedure applied to all the other allergenic preparations. This multiplex diagnostic test allowed the detection of specific IgE to each of the allergenic preparations contained in the FABER biochip, including Bet v 1 from *Escherichia coli* and that from the plant, in a single run. To obtain information on shared epitopes on homologous proteins, a modified single point highest inhibition achievable assay (SPHIAa) was used (Bernardi et al., 2011). IgE binding inhibition on Bet v 1-like allergens was achieved by co-incubating pBet v 1a with 5 sera from Bet v 1-allergic subjects, all having Bet v 1-specific IgE antibodies. For control purposes, non-related IgE-positive results were used, and no inhibition was recorded (data not shown). The optimal pBet v 1a concentration for the inhibition experiments was found by preliminary experiments using a range of concentrations between 100 and 1.25 µg/ml in a buffer solution. All IgE detections, either for the inhibition assay or the direct measurements, were performed in a single replicate.
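The text does not state how the inhibition values were computed. As a hedged illustration, the sketch below uses the conventional definition of percent inhibition, that is, the relative decrease of the IgE signal when the serum is pre-incubated with the inhibitor. The function name, the clamping to the 0 to 100% range, and the example readings are assumptions, not the FABER procedure.

```python
def percent_inhibition(ige_uninhibited, ige_inhibited):
    """Percent IgE binding inhibition relative to the uninhibited serum.

    Assumed formula: (1 - inhibited / uninhibited) * 100, clamped to 0..100
    so that measurement noise cannot yield values outside that range.
    """
    if ige_uninhibited <= 0:
        raise ValueError("the uninhibited IgE value must be positive")
    if ige_inhibited < 0:
        raise ValueError("IgE values cannot be negative")
    inhibition = (1.0 - ige_inhibited / ige_uninhibited) * 100.0
    return max(0.0, min(100.0, inhibition))


if __name__ == "__main__":
    # Hypothetical readings (arbitrary units) for one serum on two allergen spots
    readings = {"spot A": (8.0, 0.0), "spot B": (8.0, 2.0)}
    for spot, (before, after) in readings.items():
        print(f"{spot}: {percent_inhibition(before, after):.1f}% inhibition")
```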

Based on the current regulation on spared serum samples from the diagnostic workup, considering the venous blood sampling as part of the routine clinical practice and the observational nature of the study carried out without any action on patients themselves, a formal approval by the ethical committee or signed informed consent was not required.

#### STATISTICS

Statistical evaluation of protein yields and IgE distribution was made by applying the t-test for paired values and a *p* < 0.05 was considered statistically significant (Graphpad Prism 5.0; Graphpad Software Inc., San Diego, CA).
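The analysis itself was run in GraphPad Prism. Purely as an illustration of what a t-test for paired values computes, the following standard-library sketch derives the t statistic and the degrees of freedom from paired measurements. The function name and the example values are hypothetical, and the p-value lookup against the t distribution is deliberately left out.

```python
import math
import statistics


def paired_t_statistic(first, second):
    """Return (t, degrees of freedom) for a paired t-test.

    t = mean(d) / (sd(d) / sqrt(n)), where d are the pairwise differences
    and sd is the sample standard deviation (n - 1 in the denominator).
    """
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    if len(first) < 2:
        raise ValueError("at least two pairs are required")
    diffs = [a - b for a, b in zip(first, second)]
    sd = statistics.stdev(diffs)
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = statistics.mean(diffs) / (sd / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


if __name__ == "__main__":
    # Hypothetical IgE values for the same four sera on two allergen spots
    spot_a = [1.0, 2.0, 3.0, 4.0]
    spot_b = [0.0, 1.0, 1.0, 2.0]
    t, df = paired_t_statistic(spot_a, spot_b)
    print(f"t = {t:.3f}, df = {df}")
```

The resulting t value would then be compared with the t distribution for the given degrees of freedom to obtain the p-value that is tested against the 0.05 threshold.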

#### RESULTS

#### pBet v 1a Expression

Two different transient expression systems for the expression of Bet v 1a in *N. benthamiana* plants were compared.

#### pK7WG2 Expression

We first conducted a time-course expression analysis for the plants transformed with the vector pK7WG2.Betv1 to determine the day of maximum expression (**Figure 1**). A suspension of *A. tumefaciens* carrying the vector was manually infiltrated into the leaves. Three leaves were sampled daily from 3 to 14 dpi. Western blot analysis of the leaf extracts revealed an expression peak at 3 dpi (**Figure 1**). The amount of pBet v 1a at the peak expression level was evaluated by densitometric analysis and corresponded to 5.1 ± 0.46 μg/g of fresh leaf weight (FLW).

#### pGPVXGATEWAY(A) Expression

For the infection with the Potato Virus X (PVX)-based vector, the *N. benthamiana* leaves were infiltrated with the *A. tumefaciens* suspension carrying pGPVXGATEWAY(A).Betv1. When the first viral symptoms appeared (10 dpi), we collected the infected leaves and analyzed them by Western blot. In particular, three biological replicates, each consisting of leaves pooled from three infected plants, were analyzed (**Figure 2**). On the basis of the densitometric analysis, the expression level of the recombinant protein Bet v 1 was estimated to correspond to 51.9 ± 2.51 μg/g of FLW.

#### pBet v 1a IMAC Purification and Yields

Small-scale purification experiments were set up, starting from plant material deriving from the two systems used for recombinant protein production in the *N. benthamiana* plants.

We independently extracted the TSP starting from 30 g of leaves infiltrated with pK7WG2.Betv1 and 10 g of leaves infected using pGPVXGATEWAY(A).Betv1. The extracts were subjected to affinity chromatography using a Ni-NTA column (**Figure 3**). The optimal imidazole concentration for pure protein elution was between 55 and 60 mM (Fractions 10-13, **Figure 3**), and leaves infiltrated with pK7WG2.Betv1 and pGPVXGATEWAY(A).Betv1 displayed the same pattern (data not shown). In the discarded fractions 1 to 9, we detected a protein of approximately 55 kDa, which proved to be a contaminant rather than an oligomer of pBet v 1a, since it was not detected in the anti-FLAG® Western blot analysis. The yields of the purified recombinant protein obtained using the two systems were estimated by Western blot followed by a densitometric analysis of the protein bands recognized with the specific antibodies in three independent purified pBet v 1a preparations.

The average yield of purified pBet v 1a obtained from pK7WG2.Betv1 and pGPVXGATEWAY(A).Betv1 was 3.7 ± 0.02 μg/g FLW and 23.4 ± 0.004 μg/g FLW, respectively (**Figure 4**). The standard deviation was calculated by comparing the results of three independent purification experiments for each expression system.
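The expression levels and purified yields reported above can be combined into two simple figures of merit: the fold difference between the two systems and the purification recovery, that is, the purified yield as a percentage of the expression level. The sketch below uses only the mean values given in the text; the dictionary keys and function names are illustrative, and the standard deviations are not propagated.

```python
# Mean values reported in the text, in µg per g of fresh leaf weight (FLW)
EXPRESSION = {"pK7WG2": 5.1, "PVX": 51.9}
PURIFIED = {"pK7WG2": 3.7, "PVX": 23.4}


def recovery_percent(purified, expressed):
    """Purified yield as a percentage of the amount detected in the extract."""
    if expressed <= 0:
        raise ValueError("the expression level must be positive")
    return 100.0 * purified / expressed


def fold_difference(value, reference):
    """Ratio of one system's value to the other's."""
    if reference <= 0:
        raise ValueError("the reference value must be positive")
    return value / reference


if __name__ == "__main__":
    for system in EXPRESSION:
        rec = recovery_percent(PURIFIED[system], EXPRESSION[system])
        print(f"{system}: recovery {rec:.1f}%")
    print(f"Expression, PVX vs pK7WG2: "
          f"{fold_difference(EXPRESSION['PVX'], EXPRESSION['pK7WG2']):.1f}-fold")
    print(f"Purified yield, PVX vs pK7WG2: "
          f"{fold_difference(PURIFIED['PVX'], PURIFIED['pK7WG2']):.1f}-fold")
```

With these means, the viral vector gives roughly ten times the expression but a lower recovery (about 45% versus about 73%), which matches the lower relative yield noted in the Discussion.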

corresponding loading controls (RuBisCO large subunit) stained with Coomassie Brilliant Blue (B), of pBet v 1a-containing protein extract from leaf samples in three biological replicates. Each lane was loaded with 2.5 µg of TSP; the Western blot was probed with anti-FLAG® antibody conjugated with horseradish peroxidase. Side numbers indicate molecular mass markers in kDa. p.c., positive control, 10 ng of a commercially available flagged protein; n.c., negative control, extract from leaves infiltrated solely with *A. tumefaciens* GV3101.

The bottom bar of the B image stands for the imidazole concentration for every gradient fraction.

All the subsequent experiments were performed using pBet v 1a produced with pGPVXGATEWAY(A).Betv1 because of the higher yields obtained.

#### Structural Characterization of pBet v 1a

#### Analysis of the pBet v 1a Preparation by Size Exclusion Chromatography and RP-HPLC

Following the affinity chromatography, pBet v 1a was subjected to size exclusion chromatography. **Figure 5** shows the elution profile of pBet v 1a compared with that of the allergen expressed in *E. coli*. Similar to the *E. coli*-derived molecule, pBet v 1a was eluted as a single peak, excluding the presence of aggregated forms. As expected, pBet v 1a is eluted at a slightly lower volume owing to its higher molecular weight (19,696 Da) compared with that of the molecule expressed in *E. coli* (17,570 Da). This is due to the presence of a tail (containing a poly-Histidine tag, a Flag-tag and a linker) of 19 amino acid residues at the N terminus of pBet v 1a. The fractions containing pBet v 1a, eluted from the gel filtration, were collected and pooled. The concentration of the pure protein obtained from size exclusion chromatography was estimated on the basis of the molar extinction coefficient at 280 nm.

An amount of 0.2 mg was further analyzed by RP-HPLC (**Figure 6**). pBet v 1a was eluted as a single peak that was collected and concentrated with a centrifugal vacuum concentrator (Savant Speedvac Plus SC110A).

#### Assessment of pBet v 1a Identity by Mass Spectrometry

An amount of Bet v 1a corresponding to 5 µg, deriving from RP-HPLC elution, was used for the assessment of its identity using the in-solution enzyme digestion method followed by shotgun proteomics (Shan et al., 2013). This procedure allowed the identification of several overlapping peptides covering 76% of the Bet v 1a protein sequence (see **Figure 7**). Except for a peptide derived from the trypsin used for the protein digestion, no peptides belonging to any other protein contained in the UniProt database were detected. In addition, no sequence heterogeneities were observed. Therefore, this experiment demonstrates that the plant-expressed pBet v 1a was exactly the expected protein and that the molecule had been purified to homogeneity.

In summary, the purity of pBet v 1a eluted from size exclusion chromatography was assessed using different methods. It provided a single band on SDS-PAGE, was eluted as a single peak from RP-HPLC and was identified as a single molecule from shotgun proteomics. No evidence of the presence of posttranslational modifications was recorded.

#### Structural Analysis by Circular Dichroism Experiments

**Figure 8** shows that the CD spectrum obtained for pBet v 1a is different from that of the *E. coli*–produced allergen. The CD curve of the bacterial-derived allergen is very similar to the one reported in literature for the natural Bet v 1 (Batard et al., 2005; Bollen et al., 2007), displaying a broad minimum around 218 nm. Conversely, the spectrum of pBet v 1a shows two minima around 208 and 216 nm, consistent with an α-helix content higher than that present in the structure of the molecule expressed in *E. coli*.

#### Thermal Stability

**Figure 9** shows the stability of Bet v 1a at different temperatures, reported as molar ellipticity registered at 222 nm. The protein maintains its structure up to 80°C, and unfolding begins only at 90°C.

#### Immunological Characterization

#### IgE-Binding Inhibition

As shown in **Figure 10A**, pBet v 1a used in solution to inhibit the IgE binding to the *E. coli*–produced Bet v 1 was able to provide an inhibition of up to 100%. The same was obtained for Cor a 1 and Mal d 1, whereas an almost complete inhibition was achieved with two other Bet v 1-like molecules, from peanut and celery. These results are in line with the different degree of primary structure similarity observed by comparing the sequence of Bet v 1a with those of the homologous allergens analyzed in this study (**Figure 7**). In fact, the sequence identity between Bet v 1a and the Bet v 1-like allergens Cor a 1, Mal d 1, Ara h 8 and Api g 1 is 72.5%, 55.6%, 46.2% and 40.0%, respectively. Therefore, a clear correlation between the level of IgE binding inhibition and the primary structure similarities is observed. IgE binding inhibition to kirola, Act c 11, was also observed, although the sequence identity with Bet v 1a is very low (12%). However, Act c 11 (as well as Act d 11) represents a particular case since it is not included in the Bet v 1-like protein family. In fact, it belongs to the Major Latex Protein/Ripening Related Protein (MLP/RRP) family, but it is immunologically correlated with Bet v 1-like allergens with which it displays IgE co-recognition (D'Avino et al., 2011; Chruszcz et al., 2013).
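The text does not specify how the sequence identities were calculated. As an illustration only, the sketch below computes percent identity from a pair of pre-aligned sequences, counting identical residues over the aligned positions where neither sequence has a gap. This denominator is one of several conventions in use and is an assumption here, as are the function name and the toy sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: positions where either sequence carries a gap are skipped,
    and identity is matches / compared positions * 100.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # gap column: not counted
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy aligned fragments, not real allergen sequences
    print(f"{percent_identity('ACDE-G', 'ACDFHG'):.1f}% identity")
```

Real comparisons would start from an alignment produced by a dedicated tool, and the reported percentage depends on that alignment and on the chosen denominator.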

#### Direct IgE Binding

When pBet v 1a was tested for direct comparison with the one produced in *E. coli* on an extended number of samples, the values were differently dispersed, with a lower mean IgE value for pBet v 1a (**Figure 10B**). Inspection of the 304 individual IgE values showed that some sera had almost overlapping reactivities, whereas others showed a markedly different behavior, recording a number of negative results. Comparing the IgE value distribution by using the t-test for paired values, it turned out that the series were statistically different (*p* < 0.0001).

homologues are shadowed in gray. The amino acid residues experimentally identified by mass spectrometry are in white and black shadows.

The best performing sera belonged to the Bet v 1 IgE-positive subset having IgE reactivities also on other Bet v 1-like molecules (data not shown).

#### DISCUSSION

The development of tools for a precise and definitive diagnosis of allergic diseases is considered strategic for the future, both in terms of developing vaccines for immunotherapy, exploiting a companion diagnostic strategy, and of studying the genuine allergic sensitization of patients to a particular allergen source. Determining the sensitization profile of an individual patient creates the opportunity to assess the individual risk of the severity of an allergic reaction and to predict the natural course (Van Gasse et al., 2015).

The progress in molecular farming exploiting different production platforms together with the current knowledge of allergen components and protein families (www.allergen.org; www.allergome.org) has boosted molecular allergy diagnosis both for food and for inhalant allergens. In this context, plant-based expression systems are considered advantageous for several reasons: (1) cost-effectiveness, (2) ease in scalability, and (3) authenticity with respect to plant-derived allergens.

In this study, we have investigated and compared the performance of two different strategies for the transient expression of allergenic proteins in plants, one based on the use of a modified plant virus and the other on a plant expression vector. The pipeline of the two different strategies is visualized in **Figure 11**. The virus-based vector mediated an expression level almost ten-fold higher than that of the plant expression vector, as previously demonstrated for other recombinant proteins (Salazar-González et al., 2014). This result may be explained in terms of the higher rate of transcription of viral RNA-dependent RNA polymerase and of the natural capacity of viruses to sequester the plant apparatus for their replication in the host. However, the highest expression levels obtained here were four times lower than those previously reported in *N. benthamiana* using a tobacco mosaic virus (TMV)-based system for the expression of the same protein (Krebitz et al., 2000). We speculate that this difference may be explained either by a lower efficiency of PVX than TMV in mediating foreign protein expression, or by the presence in our construct of a His-tag, which may influence the recombinant protein accumulation (Pinnola et al., 2015), or by a combination of the two factors.

pBet v 1a was purified for the first time to our knowledge from *N. benthamiana* infected and infiltrated leaves using a single-step purification protocol based on affinity chromatography. The recombinant protein was purified at higher absolute yields when using the viral vector in comparison to the use of the plant-specific expression vector. However, the former was characterized by lower relative yields, probably reflecting a lower efficiency of the affinity column in capturing the antigen, this being present in higher amounts in the extracted plant soup.

Bet v 1 is the major allergen of birch pollen. The expression of this protein was chosen for this project because it is an important and well-studied allergen for which reference information is available in the literature. In addition, the recombinant molecule from a prokaryotic expression system is available for comparative analysis. In particular, *E. coli*–made Bet v 1 is already used in *in vitro* allergy diagnosis, and it is available as an allergen reference standard from the European Directorate for the Quality of Medicines and Health Care, EDQM (Vieths et al., 2012).

The pBet v 1a produced in this study was recognized by the specific IgE of patients allergic to birch pollen and positive to *E. coli*–made Bet v 1. It efficiently bound specific IgE in solution, thus inhibiting their binding with Bet v 1 and its homologs that were immobilized on the FABER biochip. Nevertheless, in some experimental conditions, pBet v 1a showed a different behavior with respect to the molecule expressed in the prokaryotic system. For instance, lower IgE values were often obtained for pBet v 1a when it was spotted on the FABER biochip and tested simultaneously with *E. coli*–made Bet v 1. The results obtained from an investigation of the structural features of pBet v 1a suggest that its individual immunological features might be associated with a molecule conformation different from that of the prokaryotic Bet v 1 used for comparison. In fact, CD experiments showed a higher content of the helical structure in pBet v 1a than in the *E. coli*–made protein. This is not very surprising since it is well known that the Bet v 1 conformation is affected by various factors. Its structure is not stabilized with disulphide bridges (Radauer et al., 2008), as occurs for allergens that have a very compact structure, such as LTP and gibberellin-regulated proteins (Tuppo et al., 2014; Giangrieco et al., 2015). However, we cannot exclude the possibility that the presence of the histidine tail and flag, added at the N-terminus of pBet v 1a, could affect its conformation. An additional factor worth considering is the Bet v 1 ligand binding (Śliwiak et al., 2016). In fact, Bet v 1 and its homologs are described as promiscuous acceptors for a wide variety of hydrophobic ligands (Chruszcz et al., 2013; von Loetzen et al., 2013). In particular, the ligand deoxycholate has been reported to stabilize the conformational IgE epitopes of the protein. It has also been suggested that ligand binding affects the allergenicity of the protein and that humans are exposed to both ligand-bound and ligand-free Bet v 1 (Asam et al., 2014). Therefore, the absence or possible presence of a specific ligand bound to pBet v 1a and affecting its structure and immunological behavior is worth investigating in the near future. Sometimes, as reported also for the kiwifruit allergen Act d 5 (Bernardi et al., 2010), the availability of an allergen with different conformations is useful to obtain a broader panel of IgE epitopes, which can be used to reveal additional sets of IgE antibodies, thus improving allergy diagnosis.

Thermostability experiments suggest that pBet v 1a might be more stable than the molecule expressed in *E. coli* and lacking the 19-residue tail present at the N-terminus of the molecule described in this study. In fact, pBet v 1a is stable up to 80°C and only at 90°C starts to lose its secondary structure elements. In contrast, a melting point of 66°C was reported for Bet v 1 expressed in *E. coli* (Himly et al., 2009). Further studies are necessary to understand whether the 19-residue tail of pBet v 1a has a stabilizing effect on the structure of this molecule.

In conclusion, the results of this study highlight the valuable performance of the plant-based expression system tested with the major birch allergen Bet v 1a. This represents a feasible platform for the production of the major allergen of birch, being characterized by high yields and a simple, one-step protocol for recombinant protein purification. We, therefore, suggest that this platform may also be used successfully for other recombinant allergens whose production is not feasible in bacterial cells. In addition, this technique could be exploited for the development of immunotherapeutic strategies, also considering their potential for orally administered vaccines. In fact, as recently demonstrated for other allergens (Fukuda et al., 2018), the administration to humans of molecules deriving from edible plant organs can be a safer approach than using those obtained from non-edible sources.

#### AUTHOR CONTRIBUTIONS

MS performed the transient pBet v 1a expression in plants, its characterization and purification. IG performed the structural characterization of the expressed protein by CD, size exclusion chromatography, RP-HPLC and thermal stability experiments and conjugated pBet v 1a to the nanobeads for the FABER test. MAC contributed to manuscript writing. RZ supervised the upstream and downstream process set-up. MP contributed to the experimental design. CR, MC, and AM designed, performed, and supervised all the immunological experiments. LA designed the experiments, coordinated all the experimental activities, and wrote the manuscript.

#### ACKNOWLEDGMENTS

We acknowledge the Department of Biotechnology of the University of Verona for funding the bursary of Mattia Santoni. We thank Dr. Gabriella Pocsfalvi and Dr. Immacolata Fiume of the mass spectrometry and proteomics facility at IBBR (CNR) for the mass spectrometry experiments and Dr. Jon Cole for the scientific English revision of the manuscript.

#### REFERENCES

contain most of the IgE epitopes present in birch, alder, hornbeam, hazel, and oak pollen: a quantitative IgE inhibition study with sera from different population. *J. Allergy Clin. Immunol.* 102, 579–591. doi: 10.1016/S0091-6749(98)70273-8


**Conflict of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Santoni, Ciardiello, Zampieri, Pezzotti, Giangrieco, Rafaiani, Ciancamerla, Mari and Avesani. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Production and Immunogenicity of Soluble Plant-Produced HIV-1 Subtype C Envelope gp140 Immunogens

*Emmanuel Margolin1,2, Rosamund Chapman1, Ann E. Meyers2\*, Michiel T. van Diepen1, Phindile Ximba1,2, Tandile Hermanus4,6, Carol Crowther4,6, Brandon Weber5, Lynn Morris4,6, Anna-Lise Williamson1,3 and Edward P. Rybicki2,3*

#### *Edited by:*

*Heiko Rischer, VTT Technical Research Centre of Finland Ltd, Finland*

#### *Reviewed by:*

*Karen Ann McDonald, University of California, Davis, United States David Montefiori, Duke University, United States*

*\*Correspondence:*

*Ann E. Meyers ann.meyers@uct.ac.za*

#### *Specialty section:*

*This article was submitted to Plant Biotechnology, a section of the journal Frontiers in Plant Science*

*Received: 29 May 2019 Accepted: 07 October 2019 Published: 30 October 2019*

#### *Citation:*

*Margolin E, Chapman R, Meyers AE, van Diepen MT, Ximba P, Hermanus T, Crowther C, Weber B, Morris L, Williamson A-L and Rybicki EP (2019) Production and Immunogenicity of Soluble Plant-Produced HIV-1 Subtype C Envelope gp140 Immunogens. Front. Plant Sci. 10:1378. doi: 10.3389/fpls.2019.01378*

*1 Division of Medical Virology, Department of Pathology, Faculty of Health Sciences, University of Cape Town, Cape Town, South Africa, 2 Biopharming Research Unit, Department of Molecular and Cell Biology, University of Cape Town, Cape Town, South Africa, 3 Institute of Infectious Disease and Molecular Medicine, Faculty of Health Sciences, University of Cape Town, Cape Town, South Africa, 4 National Institute for Communicable Diseases of the National Health Laboratory Service, Sandringham, South Africa, 5 Structural Biology Research Unit, Division of Medical Biochemistry, Department of Integrative Biomedical Sciences, University of Cape Town, Cape Town, South Africa, 6 Faculty of Health Sciences, University of Witwatersrand, Johannesburg, South Africa*

The development of effective vaccines is urgently needed to curb the spread of human immunodeficiency virus type 1 (HIV-1). A major focal point of current HIV vaccine research is the production of soluble envelope (Env) glycoproteins which reproduce the structure of the native gp160 trimer. These antigens are produced in mammalian cells, which requires a sophisticated infrastructure for manufacture that is mostly absent in developing countries. The production of recombinant proteins in plants is an attractive alternative for the potentially cheap and scalable production of vaccine antigens, especially for developing countries. In this study, we developed a transient expression system in *Nicotiana benthamiana* for the production of soluble HIV Env gp140 antigens based on two rationally selected virus isolates (CAP256 SU and Du151). The scalability of the platform was demonstrated and both affinity and size exclusion chromatography (SEC) were explored for recovery of the recombinant antigens. Rabbits immunized with lectin affinity-purified antigens developed high titres of binding antibodies, including against the V1V2 loop region, and neutralizing antibodies against Tier 1 viruses. The removal of aggregated Env species by gel filtration resulted in the elicitation of superior binding and neutralizing antibodies. Furthermore, a heterologous prime-boost regimen employing a recombinant modified vaccinia Ankara (rMVA) vaccine, followed by boosts with the SEC-purified protein, significantly improved the immunogenicity. To our knowledge, this is the first study to assess the immunogenicity of a near-full length plant-derived Env vaccine immunogen.

Keywords: HIV, glycoprotein, plants, immunogenicity, modified vaccinia Ankara

# INTRODUCTION

Prophylactic vaccines are urgently needed to combat HIV-1, particularly in Sub-Saharan Africa, which remains disproportionately affected by the pandemic and accounts for the majority of new infections and AIDS-related deaths (Shao and Williamson, 2012). It is estimated that ~1.8 million new infections and ~1 million deaths from AIDS-related illnesses occurred in 2016 alone (www.unaids.org). Alarmingly, Sub-Saharan Africa accounted for 64% of new infections during this period. Despite the impact of improved access to antiretroviral therapy on disease incidence, new infections continue to occur; consequently, even a partially effective vaccine is expected to have a large impact on the epidemic (Medlock et al., 2017).

Although none of the candidate vaccines evaluated in clinical trials have achieved the level of efficacy required for licensure, recent insights into the development of broadly cross-neutralizing antibodies during natural infection and advances in the rational design of immunogens have renewed interest in the development of prophylactic vaccines (Stephenson et al., 2016; McCoy and McKnight, 2017; Sanders and Moore, 2017). Additionally, the RV144 trial in Thailand showed that vaccine-mediated protection against HIV acquisition is possible, and subsequent analyses have given insight into the kind of responses that a vaccine may need to elicit to achieve protection (Rerks-Ngarm et al., 2009; Haynes et al., 2012; Liao et al., 2013). It is also evident that non-neutralizing antibodies against Env contribute to protection against HIV infection by means of Fc-mediated effector functions (Rerks-Ngarm et al., 2009; Haynes et al., 2012; Barouch et al., 2015; Alter and Barouch, 2018).

During natural infection with HIV, a subset of infected individuals develop broadly cross-neutralizing antibody responses against the Env gp160 glycoprotein (Li et al., 2007; Doria-Rose et al., 2009; Sather et al., 2009; Simek et al., 2009; Doria-Rose et al., 2010; Euler et al., 2010; Gray et al., 2011; Hraber et al., 2014; Landais et al., 2016). However, these responses typically occur late during infection and do not usually confer any obvious clinical benefit (Blish et al., 2008; Euler et al., 2010; Gray et al., 2011). The passive transfer of Env-specific neutralizing monoclonal antibodies can protect against viral challenge in non-human primates, suggesting that they would be able to prevent human infection if present at the sites of exposure (Mascola et al., 1999; Parren et al., 2001; Hessell et al., 2009; Rosenberg et al., 2013; Shingai et al., 2014). Many of these broadly neutralizing antibodies preferentially recognize Env epitopes in the context of trimers, as the epitope may span more than one of the protomers of the spike (Walker et al., 2009; Walker et al., 2011; Julien et al., 2013; Doria-Rose et al., 2014; Sok et al., 2014; Longo et al., 2016). Thus, a major focal point of current HIV vaccine research is the production of rationally designed Env trimers which resemble the native, virion-bound glycoprotein spikes (Sanders and Moore, 2017).

These trimer mimetics are designed to both occlude immunodominant, non-neutralizing epitopes that are inaccessible in the native trimer, and preferentially present epitopes targeted by broadly neutralizing antibodies (Sanders et al., 2013). However, the production of these antigens in mammalian cell culture is expensive, and the requisite infrastructure to produce at any scale higher than laboratory-based cultures is largely absent in developing countries. Plant-based expression of heterologous proteins is an attractive alternative to cell culture-based techniques, especially for resource-limited regions, due to the potential for greatly reduced production costs, rapid scalability and less stringent infrastructure requirements (Hefferon 2013; Ma et al., 2013). HIV vaccine implementation will also require unprecedented scalability and the potential need for annually repeated immunizations, as already occurs with seasonal influenza vaccines, could eclipse all available manufacturing capacity (Mortimer et al., 2012). Recently, high yields and promising immunogenicity of plant-produced influenza virus haemagglutinin-derived antigens have been reported, several of which have advanced into clinical trials (D'Aoust et al., 2008; Shoji et al., 2008; Shoji et al., 2009a; Shoji et al., 2009b; Bosch and Schots 2010; Landry et al., 2010; Madhun et al., 2011; Chichester et al., 2012; Shoji et al., 2012; Cummings et al., 2014). Medicago Inc (USA) has also demonstrated the scalability of plant-based expression by producing 10 million doses of fully formulated influenza H1N1pdm vaccines in one month (Yusibov et al., 2015).

Given the recent successes of the expression of influenza immunogens in plants, and the structural similarities between the influenza and HIV glycoproteins, it is feasible that plants may be able to produce suitable Env glycoproteins (D'Aoust et al., 2008; Karlsson Hedestam et al., 2008; Shoji et al., 2008). A number of groups have successfully expressed variable regions of gp120 or portions of gp41, as fusions with either plant virus capsid proteins or using cholera toxin B as a carrier (Yusibov et al., 1997; Durrani et al., 1998; Marusic et al., 2001; Kim et al., 2004). Although these vaccine candidates were immunogenic, they do not faithfully reproduce the conformation of these regions. More recently, Gag-based VLPs presenting the membrane-proximal external region (MPER) of gp41 were produced in *N. benthamiana* plants (Kessans et al., 2016).

The most promising study to date was conducted by Rosenberg and colleagues, who expressed a truncated, soluble Env protein in *N. benthamiana* plants—but as a reagent for characterization of plant-made antibodies, rather than as a vaccine candidate. The protein was a soluble gp140—with the gp41 truncated by removal of both the cytoplasmic and transmembrane domains—that also had the cleavage site, fusion peptide, and immunodominant region of gp41(∆CFI) removed (Rosenberg et al., 2013). It reacted with several prototype monoclonal antibodies, including 2G12 which recognizes a glycan-dependent epitope on the outer domain of Env (Rosenberg et al., 2013). However, its immunogenicity was not reported and it remains unclear if the antigen was trimeric. A similarly modified consensus Env (Con-S ∆CFI) was expressed as a fusion with the influenza haemagglutinin transmembrane and cytoplasmic domains (D'Aoust et al., 2011). While expression of a SIV gp130 protein was described in transgenic maize seed, once again no immunogenicity was reported (Horn et al., 2003).

It has been shown that proteolytic cleavage at the interface of the gp120 and gp41 subunits is important for the proper native conformation (Ringe et al., 2013). Recently, however, native-like soluble Env trimer mimetics were produced, in the absence of cleavage, by replacing the cleavage motif with a flexible linker peptide (Georgiev et al., 2015; Sharma et al., 2015). This approach is attractive for heterologous expression systems, such as plants, where endogenous furin activity is lacking (Wilbers et al., 2016). Our group has been investigating the production of cleavage-independent HIV Env gp140 antigens in mammalian cells (van Diepen et al., 2018) and their suitability as a booster vaccine for prior priming by DNA and/or modified vaccinia Ankara vaccines encoding modified Gag and a gp150 Env (van Diepen et al., 2018). In this study, we report the development of an *Agrobacterium*-mediated transient expression system for the production of cognate soluble HIV-1 subtype C gp140 antigens in *N. benthamiana* plants, and immunological studies of these proteins in rabbits.

# MATERIALS AND METHODS

#### Antigen Design

Soluble cleavage-independent HIV Env gp140 antigens were designed as described by Sharma et al. (2015) (**Figure 1**), obviating the need for furin-mediated proteolytic cleavage, which does not occur naturally *in planta* (Sharma et al., 2015; Wilbers et al., 2016). The native HIV Env cleavage site was replaced with a 10 amino acid flexible linker comprising 2 repeats of the glycine-serine based (GGGGS) motif. The isoleucine at residue 559 in the N-terminal heptad repeat of gp41 was mutated to a proline and the coding sequence prematurely terminated by the introduction of a stop codon after amino acid residue 664. The coding sequence of the full length Env from the HIV CAP256 SU virus (clone 256.2.06.c7) was provided by Dr. Penny Moore (Centre for HIV and STIs, National Institute for Communicable Diseases, Johannesburg) and Daniel Sheward (HIV Diversity and Pathogenesis Research Group, University of Cape Town). The HIV-1 Du151 Env sequence was retrieved from GenBank (Accession number AF544008.1). The gene coding sequences were synthesized by GenScript, after optimization to reflect the preferred human codon usage, with the addition of synthetic Age1 and Xho1 restriction sites at the 5' and 3' terminal ends of the genes, respectively. A synthetic Not1 site was included prior to the stop codon, resulting in a run of three alanine residues at the C-terminal end of the protein. Lastly, the native signal sequence was replaced with the murine mAB24 heavy chain-derived LPH (leader peptide heavy chain) signal peptide, to direct translocation of the recombinant protein through the plant secretory pathway.
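The construct layout described above can be sketched in a few lines of Python. This is only an illustration of the order of the elements (signal peptide, gp120, flexible linker, truncated gp41, C-terminal alanines); the segment strings are hypothetical placeholders, not real Env sequence, and the function names are assumptions introduced for this example.

```python
# Illustrative sketch of the gp140 construct layout; all segment sequences
# passed in below are placeholders, not the CAP256 SU or Du151 sequences.
LINKER_MOTIF = "GGGGS"   # glycine-serine motif named in the text
LINKER_REPEATS = 2       # two repeats give the 10-residue linker


def build_linker(motif: str = LINKER_MOTIF, repeats: int = LINKER_REPEATS) -> str:
    """Return the flexible linker that replaces the native cleavage site."""
    return motif * repeats


def assemble_gp140(signal_peptide: str, gp120: str, gp41_ectodomain: str,
                   c_terminal_tag: str = "AAA") -> str:
    """Join the segments in N- to C-terminal order.

    The default tag models the three alanines contributed by the Not1 site.
    """
    return signal_peptide + gp120 + build_linker() + gp41_ectodomain + c_terminal_tag


linker = build_linker()
construct = assemble_gp140("M" + "X" * 5, "X" * 20, "X" * 15)
print(linker, len(linker))
print(construct)
```

Keeping the linker as a motif and a repeat count makes it easy to see that the stated length (10 residues) follows directly from 2 × 5.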

#### Assembly of pEAQ-HT Expression Vectors Encoding HIV-1 Env Antigens

The gp120 and gp41 regions of the Env antigens were synthesized separately and assembled in pUC57, before they were subcloned into pEAQ-*HT* (Margolin, 2017). The gp140 coding sequences were excised from their respective pUC57 vector backbones using Age1 and Xho1. Similarly, the pEAQ-*HT* expression vector was digested with Age1 and Xho1 to generate compatible sticky ends for cloning. The genetic integrity of the final clones was verified by restriction analysis and sequencing across the cloning junctions with vector-specific primers (5' TTCTTCTTCTTGCTGATTGG 3' and 5' CACAGAAAACCGCTCACC 3', respectively) which bind to the pEAQ-*HT* vector on either side of the multiple cloning site. The expression constructs were electroporated into *A. tumefaciens* AGL1 as previously described (Maclean et al., 2007). Transformants were selected on Luria-Bertani agar supplemented with 50 μg/ml kanamycin and 25 μg/ml carbenicillin. Putative transformants were screened by PCR of isolated plasmid DNA using the same primer pair.

#### Propagation of *N. benthamiana* Biomass

*N. benthamiana* seeds were germinated in flat trays filled with soil and incubated at 25°C (55% humidity), under a regulated 16-h light/8-h dark photocycle. After 3 weeks, individual seedlings were transplanted into pots containing a 2:1 mixture of peat to vermiculite. Plants were infiltrated with recombinant *A. tumefaciens* strains at 6–8 weeks and then returned to the greenhouse, under the same environmental conditions, for the duration of the experimental procedure.

#### Transient Expression of Recombinant HIV-1 Env Antigens in *N. benthamiana* Leaves

Glycerol stocks of recombinant *A. tumefaciens* AGL1 were revived in 10 ml LB media, supplemented with 25 µg/ml carbenicillin (Sigma-Aldrich) and 50 µg/ml kanamycin (Sigma-Aldrich). The cultures were sequentially scaled up to an appropriate volume in LB base medium [2.5 g/l tryptone, 12.5 g/l yeast extract, 5 g/l NaCl, 10 mM MES (pH 5.6)], with 20 µM acetosyringone supplemented during the final culture step. The bacterial suspension was then adjusted to an OD600 of 1.0, using freshly prepared resuspension medium [10 mM MgCl2, 10 mM MES (pH 5.6), 200 µM acetosyringone]. Whole plants were submerged, upside down, in a beaker of the bacterial culture placed inside a vacuum chamber. A vacuum of -80 kPa was applied to the chamber and the procedure repeated 2–3 times to ensure complete infiltration of the leaves. The agroinfiltrated plants were then returned to the greenhouse and incubated under the same environmental conditions until harvest.
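As a worked example of adjusting the bacterial suspension to an OD600 of 1.0, the sketch below applies the usual C1V1 = C2V2 dilution relation. It assumes that OD600 scales linearly with cell density over the range used and that the adjustment is a simple dilution with resuspension medium; the function name and the example readings are hypothetical and are not taken from the protocol.

```python
def dilution_volumes(measured_od600: float, target_od600: float,
                     final_volume_ml: float) -> tuple[float, float]:
    """Return (culture_ml, medium_ml) needed to reach the target OD600.

    Uses C1 * V1 = C2 * V2, assuming OD600 is proportional to cell density.
    """
    if measured_od600 < target_od600:
        raise ValueError("culture is more dilute than the target; concentrate it first")
    culture_ml = target_od600 * final_volume_ml / measured_od600
    return culture_ml, final_volume_ml - culture_ml


# Hypothetical reading: a suspension at OD600 = 2.5, 1 litre of inoculum wanted.
culture_ml, medium_ml = dilution_volumes(2.5, 1.0, 1000.0)
print(f"{culture_ml:.0f} ml culture + {medium_ml:.0f} ml resuspension medium")
```

With these example numbers the split is 400 ml of suspension and 600 ml of resuspension medium.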

#### Small Scale Extraction of Crude Soluble Protein

Clippings were harvested from agroinfiltrated leaves and finely ground in liquid nitrogen. The leaf material was resuspended in PBS (Lonza) (50 μl/clipping), supplemented with cOmplete™ EDTA-free protease inhibitor as per the manufacturer's instructions (Roche), and incubated at 4°C for 1 h, with shaking. The plant slurries were clarified by centrifugation at 14,000 rpm, for 15 min, and the supernatant stored at -20°C.

#### Affinity Purification of Recombinant HIV-1 Env Glycoproteins

The aerial parts of the plants were harvested 5 days post agroinfiltration and homogenized in two buffer volumes of PBS (Lonza), supplemented with cOmplete™ EDTA-free protease inhibitor (Roche). The crude homogenate was incubated for 1 h, at 4°C, with shaking and then filtered through four layers of Miracloth™ (Merck). The crude plant sap was then clarified by sequential centrifugation steps: twice at 15,344 × *g* for 20 min and then again at 17,000 × *g* for 20 min. The supernatant was vacuum-filtered through a 0.45 µm Stericup-GP device (Merck Millipore) and applied to a *Galanthus nivalis* lectin (GNL) column (Sigma) with a 0.5–1 ml/min flow rate. The column was sequentially washed with 100 ml of 0.5 M NaCl and then 100 ml of PBS (Lonza). The proteins were eluted in 1 M methyl α-D-mannopyranoside (MMP) (Sigma), buffer exchanged into PBS and then concentrated using a Vivaspin Protein Concentrator with a 30 kDa cut-off (GE Healthcare). The purified proteins were quantified using the DC™ Protein Assay (Bio-Rad).

#### SEC Purification of HIV Env Glycoprotein Antigens

Following elution from the GNL affinity resin, the recombinant protein was concentrated and buffer-exchanged into 5 ml of PBS [pH 7.4] (Lonza). The purified protein was then size-fractionated using a Superdex 200 HiLoad 16/600 column (GE Healthcare). Fractions corresponding to the chromatogram peaks were analyzed by non-denaturing BN-PAGE followed by Coomassie staining to confirm their oligomeric identity. The desired fractions were pooled and stored at -80°C for immunogenicity studies.

#### Electrophoretic Resolution of Proteins and Western Blotting

Protein samples were resolved under denaturing conditions by 10% SDS-PAGE. Alternatively, samples were resolved in their native state using NativePAGE™ Novex® 3–12% Bis-Tris Gels in accordance with the manufacturer's instructions. Following electrophoresis, proteins were either electrophoretically transferred onto Immun-blot® PVDF Membrane (Bio-Rad) or stained with Bio-Safe™ Coomassie stain (Bio-Rad). PVDF membranes were blocked for 2 h in 2% BSA and Env protein detected with a 1:1,000 dilution of goat anti-HIV-1 gp120 antibody (AbD Serotec). In turn, the primary antibody was detected with a 1:10,000 dilution of GT34 anti-sheep/goat secondary antibody (Sigma-Aldrich). Western blots were developed with 5 ml BCIP/NBT substrate (KPL) for 30 min.

#### Protein Identity Determination by Liquid Chromatography-Mass Spectrometry (LC-MS)

The identities of Coomassie-stained protein bands were independently determined by the Centre for Proteomic and Genomic Research (CPGR, Cape Town). Protein bands were recovered from the gel and fragmented by trypsin digestion, alongside a BSA reference standard. The resulting peptide solution was separated using the Dionex UltiMate 3000 nano-HPLC system (Thermo Fisher Scientific, USA) and then analyzed using a Q Exactive™ Hybrid Quadrupole-Orbitrap Mass Spectrometer (Thermo Fisher Scientific, USA). The spectra generated by LC-MS were analyzed with Byonic Software (Protein Metrics, USA) using publicly available sequences retrieved from UniProt (www.uniprot.org). Samples were interrogated against a merged database comprising the *N. benthamiana*, *N. tabacum*, *Agrobacterium*, and HIV proteomes.

#### Rabbit Immunizations

Rabbit immunizations and blood sampling were conducted at the University of Cape Town Research Animal Facility, in accordance with the guidelines and approval of a Faculty Animal Ethics Committee (AEC 014-30) and at the Animal Unit of the University of Stellenbosch in accordance with the guidelines and approval of the UCT Committee (AEC 015-05). Three-month-old New Zealand White rabbits (> 2 kg) were immunized with 50 µg of recombinant protein suspended in Alhydrogel® Adjuvant 2% (Invivogen) at a ratio of 1:1 (antigen: adjuvant), determined to be the optimal adjuvant in other work with mammalian cell-made HIV-1 gp140 in our lab (van Diepen et al., 2018). Groups of five rabbits were immunized intramuscularly into the quadriceps muscle of the hind leg at weeks 0, 4, 12, and 20. Animals in the last group were inoculated with 1 × 10^8 pfu rMVA at weeks 0 and 4, followed by immunization with 50 μg of the adjuvanted SEC-purified protein at weeks 12 and 20. Blood was drawn at weeks 0, 4, 8, 12, 14, 16, 20, 22, and 24 for analysis. The experiment was terminated after 24 weeks.

#### Quantification of Serum Antibody Binding Titres

The levels of serum binding antibodies were quantified by ELISA using gp140 produced in HEK293 cells as a coating antigen. The mammalian cell-derived protein was purified the same way as plant-produced protein. SEC-purified protein was used as a coating antigen for animals immunized with SEC-purified gp140, and lectin affinity purified antigen for the cognate plant product where SEC was not performed after affinity purification. The assay was conducted as previously described (van Diepen et al., 2018; van Diepen et al., 2019).

Serum-binding antibodies to the autologous V1V2 Env region were quantified by ELISA using a protein scaffold provided by Professor Penny Moore (Senior Medical Scientist, Centre for HIV and STIs, National Institute for Communicable Diseases, Johannesburg). Ninety-six-well MaxiSorp® microtitre plates (Nunc) were coated overnight with 450 ng of purified scaffold and the assay conducted as before. Binding antibody levels were determined as a fold-dilution derived from the fitted four-point linear regression curve using a threshold of the minimum + standard error of the minimum for each time point. Env-binding antibodies and V1V2 binding antibodies were both quantified 4 weeks after the 3rd (week 16) and 4th (week 24) immunization. Binding antibodies were quantified using SEC-purified CAP256 SU gp140 protein that was purified from HEK293 cells.

#### Serum Neutralization of Env-Pseudotyped Viruses

The serum from immunized animals was assessed for HIV-1 pseudovirus neutralization at the National Institute for Communicable Diseases (Sandringham, Johannesburg). Neutralizing activity was measured in TZM-bl cells as the ability of serum to reduce luciferase reporter gene expression after a single round of infection with replication-incompetent Env-pseudotyped viruses (Montefiori, 2005). Two highly sensitive Tier 1A viruses (MW965.26, MN.3), two moderately sensitive Tier 1B viruses (6644, 1107356) and the Tier 2 autologous viruses (CAP256 SU and Du151.2), which are more resistant to neutralization, were tested. Env-pseudotyped viruses were generated by transient co-transfection of HEK293T cells with pSG3∆Env, containing an HIV genome with a defective Env gene, and a complementary plasmid encoding the Env gene of interest (pcDNA 3.1D/V5-His-TOPO-Env). Serially diluted sera were incubated with pseudovirions and then overlaid on TZM-bl cells seeded in a 96-well flat-bottomed plate. The plates were incubated for 48 h before lysing the cells and assaying the lysate for luciferase activity. The ID50 was calculated as the reciprocal dilution required for a 50% reduction in relative luciferase units. Murine leukaemia virus (MuLV) was included as a negative control. Sera with no neutralizing activity were assigned an arbitrary value of 19 to allow for statistical analysis.
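The ID50 definition above can be illustrated with a short calculation. The sketch below is not the laboratory's analysis pipeline: it assumes background subtraction with a cell-only control, normalization to a virus-only control, and log-linear interpolation between the two dilutions that bracket 50% neutralization. The function names and the example readings are hypothetical; only the 50% criterion and the placeholder value of 19 for non-neutralizing sera come from the text.

```python
import math


def percent_neutralization(rlu: float, virus_control: float, cell_control: float) -> float:
    """Percent reduction in relative luciferase units versus the virus-only control."""
    return 100.0 * (1.0 - (rlu - cell_control) / (virus_control - cell_control))


def id50(reciprocal_dilutions, rlus, virus_control, cell_control,
         no_activity_value: float = 19.0) -> float:
    """Reciprocal serum dilution giving a 50% reduction in RLU (interpolated)."""
    points = sorted(zip(reciprocal_dilutions, rlus))
    neut = [(d, percent_neutralization(r, virus_control, cell_control)) for d, r in points]
    if neut[0][1] < 50.0:
        # Below 50% even at the most concentrated serum: no detectable activity.
        return no_activity_value
    for (d_lo, n_lo), (d_hi, n_hi) in zip(neut, neut[1:]):
        if n_lo >= 50.0 > n_hi:
            frac = (n_lo - 50.0) / (n_lo - n_hi)
            log_id50 = math.log10(d_lo) + frac * (math.log10(d_hi) - math.log10(d_lo))
            return 10 ** log_id50
    # Still at or above 50% at the highest dilution tested: report that dilution.
    return neut[-1][0]


# Hypothetical three-fold dilution series (90%, 70%, 30%, 5% neutralization).
titre = id50([20, 60, 180, 540], [11000, 31000, 71000, 96000],
             virus_control=101000, cell_control=1000)
print(round(titre, 1))
```

In this example the 50% point falls midway (on a log scale) between the 1:60 and 1:180 dilutions, giving a titre of roughly 104.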

#### Statistical Analysis

All statistical analyses were conducted using GraphPad Prism 5. Statistical comparisons between groups over time were determined using a two-way ANOVA test. Comparisons between two groups at a single time point were performed using a two-tailed Mann-Whitney unpaired test. In both cases, a P-value below 0.05 was considered to indicate a significant difference.
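To show how a two-tailed Mann-Whitney comparison behaves with groups of five animals, the sketch below computes an exact p-value by enumerating every way of splitting the pooled ranks between the two groups. It is an illustration only: the analyses in this study were run in GraphPad Prism, whose handling of ties and exact-versus-approximate p-values may differ, and the function names and example titres are hypothetical.

```python
from itertools import combinations


def midranks(values):
    """Rank the values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def mann_whitney_exact(group_a, group_b):
    """Return (U for group_a, exact two-sided p-value) by full enumeration."""
    ranks = midranks(list(group_a) + list(group_b))
    n_a, n_b = len(group_a), len(group_b)
    mean_u = n_a * n_b / 2

    def u_from_rank_sum(rank_sum):
        return rank_sum - n_a * (n_a + 1) / 2

    observed = u_from_rank_sum(sum(ranks[:n_a]))
    extreme = total = 0
    for chosen in combinations(range(n_a + n_b), n_a):
        u = u_from_rank_sum(sum(ranks[i] for i in chosen))
        total += 1
        # Count splits at least as far from the null mean as the observed one.
        if abs(u - mean_u) >= abs(observed - mean_u) - 1e-12:
            extreme += 1
    return observed, extreme / total


# Hypothetical titres for two groups of five animals with no overlap.
u_value, p_value = mann_whitney_exact([100, 200, 300, 400, 500],
                                      [600, 700, 800, 900, 1000])
print(u_value, round(p_value, 4))
```

With five animals per group there are only 252 possible splits, so complete separation of the two groups gives the smallest achievable two-sided p-value, 2/252 (about 0.008).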

# RESULTS

#### Selection of HIV Env

The vaccine antigens used were based on two rationally selected HIV-1 subtype C Env sequences. The first was the superinfecting virus from participant CAP256 in the CAPRISA 002 Acute Infection Cohort (CAP256 SU), who developed broadly cross-neutralizing antibodies targeting the V1V2 loop (Moore et al., 2011; Moore et al., 2013; Doria-Rose et al., 2014). The CAP256 SU Env glycoprotein also has documented sensitivity to many prototype broadly neutralizing antibodies (Moore et al., 2011; Doria-Rose et al., 2014; Bhiman et al., 2015). The second was HIV Du151 Env, isolated in 1998 from an individual within the first 2 months of infection. It was selected as the vaccine strain by the South African AIDS Vaccine Initiative (SAAVI) in 2003 due to its close similarity to a South African subtype C consensus sequence (Williamson et al., 2003).

#### Transient Expression of HIV-1 gp140 Glycoprotein Mimetics *In Planta*

Soluble gp140 antigens, reflecting the preferred human codon usage, were designed from the CAP256 SU and Du151 Env genes (**Figure 1A**). This codon usage was previously reported to result in higher levels of expression for the analogous influenza HA glycoprotein in plants (Mortimer et al., 2012), as well as for other antigens tested in the Biopharming Research Unit (human and bovine papillomavirus L1 proteins, pers. comm.) (Maclean et al., 2007). *N. benthamiana* plants were vacuum infiltrated with recombinant *A. tumefaciens* strains encoding the HIV-1 gp140 antigens from the pEAQ-*HT* expression plasmid. Expression of both proteins induced severe pathology in infiltrated plants, although the effect was more pronounced for plants expressing Du151 gp140, which displayed marked necrosis by 5 days post infiltration (dpi) (**Figure S1**). Accumulation of the recombinant antigens was monitored by western blotting of SDS-PAGE-separated crude leaf homogenate for 9 days. Expression of both antigens was detectable by 3 dpi and expression levels peaked after 5 days (data not shown). Western blots showed a product just below the 130 kDa molecular weight marker, as well as higher order molecular weight aggregates (>245 kDa) that were poorly resolved by SDS-PAGE (**Figure 1B**). The size of the main product is slightly smaller than the expected molecular weight of 140 kDa. There was no obvious improvement in the resolution of the aggregates when protein was extracted in buffers with different pH values, with detergent, or with an inhibitor of oxidation (data not shown).

#### Purification of Recombinant HIV-1 Env gp140

Protein production was scaled up by increasing the number of plants infiltrated, and the antigens were purified by lectin affinity chromatography. The mean recovery of the CAP256 SU and Du151 antigens was 6.2 mg/kg and 4.9 mg/kg of plant biomass, respectively (*n =* 3 independent infiltrations and purifications). Although the CAP256 SU gp140 demonstrated a trend towards higher expression, this was not statistically significant (unpaired *t* test, p > 0.05). Liquid chromatography mass spectrometry (LC-MS) analysis of Coomassie-stained bands following SDS-PAGE verified the purification of HIV Env and that the unresolved products were also Env. In addition, low levels of endogenous plant proteins were evident below the 80 kDa molecular weight marker (**Figure S2** and **Table S1**). The major contaminant of the purified CAP256 SU gp140 antigen was a homologue of *Agrobacterium fabrum* chaperone protein DnaK, although the coverage was poor, and this was most likely a region conserved amongst *Agrobacterium* strains. For Du151 gp140 the main contaminant was the tobacco luminal-binding protein 5, a homologue of the endoplasmic reticulum chaperone BiP.

We also tested a subsequent SEC step to remove plant protein contaminants and aggregated Env species from the purified CAP256 SU gp140 antigen. This protein was specifically selected for SEC due to its trend towards higher expression levels *in planta* compared to that of Du151 Env and the strong rationale underlying its selection for development as a vaccine immunogen. Size fractionation yielded two closely overlapping peaks followed by a small shoulder and another small peak (**Figure 1C**). Coomassie staining of pooled SEC peaks yielded products consistent with the sizes expected for aggregates (>720 kDa) and trimers (~720 kDa) following their resolution by BN-PAGE (Sharma et al., 2015) (**Figure 1D**). The small shoulder that eluted after the putative trimeric fraction was below the threshold of detection by Coomassie staining but is presumed to be monomeric protein. The identity of the small peak that eluted later is unknown. This approach resulted in the successful removal of the majority of contaminating aggregates and monomers, and the recovery of putative trimeric protein (**Figure 1D**). The yield of purified SEC-CAP256 SU gp140 protein was relatively low at ~1.94 mg recovered per kg of biomass.

#### Autologous Serum-Binding Antibodies

The immunogenicity of the affinity-purified and SEC-purified antigens was tested by immunizing rabbits four times with 50 µg of protein, formulated in Alhydrogel® adjuvant, at weeks 0, 4, 12, and 20. An additional group was included where animals were primed twice with rMVA encoding a subtype C mosaic Gag antigen (Jongwe et al., 2016) and a membrane-bound gp150 protein matched to our gp140 but truncated at amino acid 730 to remove most of the cytoplasmic domain (van Diepen et al., 2019), followed by two boosts with the SEC-purified plant-made Env (**Figure 2A**). The mosaic Gag was previously observed to form virus-like particles in cells transfected with DNA or infected with recombinant MVA encoding the protein. Furthermore, the antigen was highly immunogenic in both vaccine modalities (Jongwe et al., 2016; Chapman et al., 2017). Addition of gp150 to mosaic Gag DNA or MVA vaccines resulted in inclusion of Env in these Gag VLPs (van Diepen et al., 2019).

The vaccines were well tolerated and all groups of immunized animals rapidly developed binding antibodies which were detectable even after a single immunization (by week 4) (**Figure 2B**). Increased serum binding antibodies were observed over the course of the 24-week experiment, although no statistical difference was observed between groups over time or between individual time points (p > 0.05, two-way ANOVA).

The levels of SEC-purified Env-binding antibodies elicited by the different vaccination regimens were also quantified after the 3rd and 4th immunization as this probably represents a more authentic and vaccine-relevant form of the protein. Both groups of animals immunized with the SEC-purified protein had significantly higher levels of binding antibodies than animals immunized with the affinity-purified protein (P < 0.05, two-tailed Mann-Whitney unpaired test) (**Figures 2C**, **D**). No significant difference in binding antibody levels was observed between animals immunized with only SEC-purified protein and animals from the MVA prime-protein boost group.

Figure caption (fragment): unpaired Mann-Whitney test (\*p < 0.05). The levels of binding antibodies are indicated as a fold-dilution derived from the fitted four-point linear regression curve using a threshold of the minimum + standard error of the minimum for each time point.

The levels of V1V2 antibodies elicited by the CAP256 SU-derived vaccines were also quantified as the RV144 trial reported that reduced risk of HIV-1 acquisition was correlated with binding antibodies to this region (Haynes et al., 2012). Encouragingly, all vaccination regimens elicited autologous V1V2 scaffold binding antibodies which were detected at weeks 16 and 24 (**Figures 2E**, **F**). No statistically significant differences were observed between groups.

#### Serum Neutralizing Antibodies Against Env-Pseudotyped Virions

Serum samples from immunized animals were assessed for neutralization activity against Env-pseudotyped viruses representing a range of neutralization sensitivities (**Tables S2A**, **B**) (Seaman et al., 2010). Animals immunized with SEC-purified protein exhibited a trend towards higher neutralizing titres when compared to animals immunized with the affinity-purified protein (**Figure 3**). Unfortunately, due to differences in the time points at which the assays were conducted, this could not be compared statistically.

All groups of animals developed neutralizing antibodies against the HIV-1 subtype C Tier 1A virus (MW965.26), although in the two groups immunized with the SEC-purified Env, titres of over 1,000 were observed in some animals. The group primed with rMVA was determined to have significantly higher levels of neutralizing antibodies against this virus at week 22 compared to the group immunized with the SEC-purified protein only (two-tailed Mann-Whitney unpaired test, p < 0.05). Sporadic neutralization was observed against 6644 (Tier 1B) in animals immunized with the affinity-purified protein whereas all animals immunized with the SEC-purified Env developed consistent neutralizing antibodies against this virus. Waning neutralization against 1107356 was also observed in animals inoculated with the SEC-purified protein. In contrast, animals immunized with the affinity-purified protein did not develop any detectable neutralizing antibodies against this virus. Interestingly, more animals from the rMVA prime-SEC protein boost group (5/5) developed neutralizing antibodies against this virus than animals immunized with the SEC-purified Env only (2/5). None of the animals developed neutralizing antibodies against the autologous Tier 2 viruses (CAP256 SU and Du151) (**Tables S2A**, **B**). Peak neutralizing antibody titres were observed after the 3rd immunization and decayed to varying extents by the final bleed.

# DISCUSSION

In view of the sequence diversity and rapid mutation rate of HIV-1, the implementation of successful HIV vaccines will require rapid production scalability, and subunit vaccines especially may be a challenge to produce in sufficient quantities. This will be particularly problematic in the developing countries that are often disproportionately affected by the pandemic, and which mostly lack the infrastructure necessary for manufacturing their own vaccines (Shao and Williamson, 2012). Given the reduced infrastructure requirements and the potential for rapid scalability of plant production systems for recombinant proteins, we have investigated the feasibility of a transient expression platform in *N. benthamiana* plants for the production of an HIV Env-derived candidate vaccine antigen.

We designed two soluble HIV Env mimetics based on rationally selected viral isolates. The antigens were designed with a flexible linker peptide at the interface of gp120 and gp41 to circumvent the need for furin-mediated cleavage, which does not occur naturally in plants (Wilbers et al., 2016). This approach is currently under investigation by a number of groups attempting to produce native-like Env trimer immunogens in mammalian cells (Kovacs et al., 2014; Georgiev et al., 2015; Sharma et al., 2015). Both antigens yielded detectable levels of expression in crude leaf homogenates. The apparent peak in protein expression occurred 5 days post agroinfiltration and western blotting of crude extract separated using denaturing SDS-PAGE yielded a band of approximately 130 kDa, as well as higher molecular weight products that were poorly resolved by electrophoresis. LC-MS analysis of gel-purified bands confirmed the identity of the HIV Env protein and verified that the unresolved higher molecular weight products observed were also HIV Env. Rosenberg and colleagues observed similar unresolved products for their plant-produced HIV Env gp140 ∆CFI antigen, although they concluded that these corresponded to oligomeric Env species (Rosenberg et al., 2013). The slight disparity in the size of our recombinant Env antigens compared to the equivalent proteins produced in mammalian cells may be a reflection of lower levels of glycan site occupancy that has been reported for other heterologous proteins produced in plants (Sharma et al., 2015).

Figure caption (fragment): vaccination regimens over the course of the experiment. The neutralizing antibody titres (ID50) are reflected as the mean of the reciprocal dilution required to inhibit viral entry into a reporter cell line. The dotted line at 20 represents the threshold below which neutralization activity is considered background.

The expression of both antigens resulted in pathology to the host plants, and the pathology observed was more severe following expression of Du151 gp140 than for CAP256 SU gp140. Infiltration with recombinant *A. tumefaciens* containing an empty pEAQ-*HT* expression vector produced no obvious effects, suggesting that the pathology was a direct result of overexpressing the glycoproteins *in planta*. It is conceivable that expression of the target proteins exceeded the plant's capacity to accommodate folding, resulting in the accumulation of misfolded Env and the associated ER-stress response (Howell, 2013). The phenotype observed in this study is consistent with considerable ER-stress and the degradation of misfolded Env proteins as part of the ER quality control system, which would account for the low yields observed (Hamorsky et al., 2015). This is currently under investigation, and approaches are being explored to improve Env production in plants and reduce the associated levels of ER-stress.

The scalability of the system was demonstrated, and a successful purification strategy was devised for the recovery of the antigens after large scale expression. The glycoproteins were efficiently bound by *G. nivalis* lectin during affinity chromatography, enabling the recovery of milligrams of protein. The inclusion of a subsequent SEC step enabled the removal of aggregated and monomeric Env species resulting in the recovery of presumably trimeric CAP256 SU gp140 antigen, although further work is required to determine if the protein is "native-like" in conformation. This could be achieved by determining the reactivity of the purified antigens with monoclonal antibodies derived from human donors and visualization of the antigens by negative stain electron microscopy (Sanders et al., 2013). The properly folded form of the protein is expected to assume a compact propeller-like conformation that exposes epitopes that are targeted by broadly neutralizing antibodies and occludes epitopes that are recognized by non-neutralizing antibodies (Sanders and Moore, 2017).

The yields of both antigens were lower than desirable, with approximately 6 mg/kg fresh weight and 5 mg/kg fresh weight of purified protein recovered by affinity chromatography for the CAP256 SU gp140 and Du151 gp140 antigens, respectively. The implementation of a subsequent SEC step resulted in a recovery of <2 mg/kg of CAP256 SU gp140. The only other report in which HIV-1 gp140 antigens were expressed in plants described raw yields of ~80 mg/kg: this was not of a purified form, however (Rosenberg et al., 2013). It is noteworthy that different HIV isolates exhibit varying levels of Env expression in mammalian cell culture-based expression systems and vary in their propensity to form trimers (Bricault et al., 2015; Julien et al., 2015; Zambonelli et al., 2016). Although other viral isolates may potentially yield higher expression levels, a strong rationale underpinned the selection of these isolates for immunogenicity studies.

Rabbits were immunized four times with the recombinant antigens to evaluate the impact of removing both aggregated and monomeric Env species, and the influence of priming the immune response with rMVA encoding Gag and a cognate Env gp150 antigen. Although this study was aimed at eliciting neutralizing antibodies against the Env glycoprotein, the inclusion of Gag could also contribute to a combination vaccine by eliciting a polyfunctional T cell response capable of suppressing viraemia and preventing disease progression following infection. The Gag used here is particularly promising as it has been optimized *in silico* to improve the coverage of common CD4+ and CD8+ T cell epitopes (Fischer et al., 2007). The co-expression of Gag and Env gp150 is also expected to result in the presentation of the Env glycoprotein on the surface of Gag VLPs when co-expressed from MVA following immunization: our group has previously reported that this Gag mosaic antigen forms VLPs *in vitro* after transfection of cells with DNA or infection of cells with MVA vaccines encoding the protein (Jongwe et al., 2016). It has been argued that presenting Env antigens on the surface of VLPs may stabilize the protein in its natural lipid membrane context, thereby presenting the antigen to the immune system in a more authentic way than in the case of soluble protein (Crooks et al., 2011; Tong et al., 2014; Crooks et al., 2015; Crooks et al., 2017). The size of VLPs also promotes entry into the lymphatic system, enabling increased interaction with professional antigen presenting cells (Tong et al., 2012; Zabel et al., 2013). In contrast, small soluble proteins are poorly taken up by the lymphatic system and are therefore comparatively less immunogenic (Bachmann and Jennings, 2010; Zabel et al., 2013). 
Lastly, repeating arrays of antigen on the surface of VLPs enable cross-linking of B cell receptors, resulting in the induction of long-lived antibody responses (Bachmann et al., 1993; Bachmann and Jennings, 2010).

The Env produced in plants was well tolerated in rabbits and demonstrated promising immunogenicity. Binding antibodies were detected after the first immunization and increased in titre over the course of the experiment. The boosting effect became less pronounced after the first two immunizations and binding antibody levels were observed to decline to some extent over time. Similar levels of binding antibodies against the matched antigen were observed in all groups with no statistically significant differences observed. However, animals immunized with the SEC-purified protein developed significantly higher levels of binding antibodies after the 3rd and 4th immunization than animals immunized with the protein that had only been lectin affinity-purified. Encouragingly, all immunization regimens elicited binding antibodies directed at the V1V2 loop, which was shown to be a correlate of vaccine-mediated protection in the RV144 trial (Haynes et al., 2012). Furthermore, animals immunized with the SEC-purified protein exhibited a distinct trend towards higher neutralization titres against several Tier 1 viruses. It is assumed that contaminating aggregates and monomeric Env species present in the affinity-purified vaccines elicited "off-target" antibodies that could not engage the native Env glycoprotein, resulting in lower levels of neutralizing antibodies. Interestingly, peak neutralizing antibody titres in both groups were observed after the 3rd immunization, and subsequently declined despite an additional inoculation.

Priming rabbits with rMVA followed by boosting with the SEC-purified protein further improved the induction of neutralizing antibodies. Although this difference was only significant for MW965.26 at week 22, a trend was seen across the different pseudoviruses that were tested. Despite MVA not being replication-competent, its ability to infect cells and express heterologous proteins *in vivo* will lead to processing of the antigens *via* the intracellular proteasome and cross-presentation by MHC receptors, resulting in the induction of both CD4+ and CD8+ T cells (Sutter and Moss, 1992). The induction of improved CD4+ T cell help may have improved the elicitation of neutralizing antibodies following immunization with the SEC-purified protein (Crooks et al., 2007). In further support of this heterologous prime-boost approach, the only clinical trial to report efficacy against HIV acquisition employed a canarypox prime followed by a recombinant Env protein boost (Rerks-Ngarm et al., 2009; Robb et al., 2012).

This is to our knowledge the first study to report the successful production of apparently trimeric soluble HIV-1 Env protein, and to investigate its immunogenicity, and represents an important first step in the development of a candidate plant-produced HIV vaccine for clinical trial. Further work is ongoing to address limitations of yield and potential plant-specific differences in glycosylation and folding in order to improve the induction of neutralizing antibodies. The work described here highlights the importance of stringent purification of Env glycoproteins for use as vaccine immunogens and the benefit of priming with a recombinant poxvirus to improve humoral responses against the Env glycoprotein.

#### DATA AVAILABILITY STATEMENT

All datasets generated for this study are included in the article/**Supplementary Material**.

#### ETHICS STATEMENT

The animal study was reviewed and approved by University of Cape Town Health Sciences Faculty Animal Ethics Committee (AEC 014-30 and AEC 015-05).


#### AUTHOR CONTRIBUTIONS

EM conducted the plant expression, purification, and immunogenicity experiments and compiled the manuscript. RC designed the sequences of the HIV antigens and cloning strategies used to construct the plant and mammalian expression vectors and MVA. RC, AM, ER, and A-LW supervised the experimental work and contributed to experiment design. MD and PX produced the mammalian cell-derived Env protein and contributed to experiment design. BW assisted with the development of a purification strategy and assisted with protein purification experiments. TH performed and CC aided with the neutralization assays. LM supervised neutralization assay experiments and contributed to interpretation of the results. All authors provided feedback on the manuscript.

#### FUNDING

This work is based upon research supported by the South African Medical Research Council with funds received from the South African Department of Science and Technology and the South African Research Chairs Initiative of the Department of Science and Technology and National Research Foundation.

#### ACKNOWLEDGEMENTS

The authors would like to thank Dr George Lomonossoff (Biological Chemistry Department, John Innes Institute) for providing the pEAQ-HT expression plasmid used in this study, Valerie Bekker for her assistance with the pseudovirus neutralisation assays, Rodney Lucas, Inge Botes and Noel Markgraaff for performing immunisations and blood sampling, Shireen Galant for providing the MVA vaccines, and the Centre for Proteomic and Genomic Research for conducting the LC-MS analysis.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2019.01378/full#supplementary-material


**Conflict of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Margolin, Chapman, Meyers, van Diepen, Ximba, Hermanus, Crowther, Weber, Morris, Williamson and Rybicki. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Hairy Root Cultures—A Versatile Tool With Multiple Applications

Noemi Gutierrez-Valdes<sup>1</sup>, Suvi T. Häkkinen<sup>1</sup>, Camille Lemasson<sup>2</sup>, Marina Guillet<sup>2</sup>, Kirsi-Marja Oksman-Caldentey<sup>1</sup>, Anneli Ritala<sup>1†</sup> and Florian Cardon<sup>2\*†</sup>

<sup>1</sup> VTT Technical Research Centre of Finland Ltd., Espoo, Finland, <sup>2</sup> Samabriva SA, Amiens, France

#### Edited by:

Inge Broer, University of Rostock, Germany

#### Reviewed by:

Sumita Jha, University of Calcutta, India; Javier Palazon, University of Barcelona, Spain; Alejandra Beatriz Cardillo, University of Buenos Aires, Argentina; Rosa M. Cusido, University of Barcelona, Spain

\*Correspondence:

Florian Cardon, florian.cardon@samabriva.com

†These authors share last authorship

#### Specialty section:

This article was submitted to Plant Biotechnology, a section of the journal Frontiers in Plant Science

Received: 11 October 2019; Accepted: 13 January 2020; Published: 03 March 2020

#### Citation:

Gutierrez-Valdes N, Häkkinen ST, Lemasson C, Guillet M, Oksman-Caldentey K-M, Ritala A and Cardon F (2020) Hairy Root Cultures—A Versatile Tool With Multiple Applications. Front. Plant Sci. 11:33. doi: 10.3389/fpls.2020.00033

Hairy roots, derived from the infection of a plant by Rhizobium rhizogenes (previously referred to as Agrobacterium rhizogenes), can be obtained from a wide variety of plants and allow the production of highly diverse molecules. Hairy roots are able to produce and secrete complex active glycoproteins from a large spectrum of organisms. They are also suited to expressing the natural plant biosynthesis pathways required to produce specialized metabolites, and can benefit from the new genetic tools available to facilitate an optimized production of tailor-made molecules. This adaptability has positioned hairy root platforms as major biotechnological tools. Researchers and industries have contributed to their advancement, which offers new alternatives to classical systems for producing complex molecules. These expression systems are now ready to be used by industries such as the pharmaceutical, cosmetics, and food sectors, owing to the development of fully controlled large-scale bioreactors. This review aims to describe the evolution of hairy root generation and culture methods and to highlight the possibilities offered by hairy roots in terms of feasibility and perspectives.

Keywords: Rhizobium rhizogenes, Agrobacterium rhizogenes, hairy roots, recombinant proteins, specialized metabolites, molecular farming

## INTRODUCTION

Between the 1930s and the 1960s, hairy roots (HRs) were studied primarily as a sign of pathogen invasion in horticultural plants (Doran, 2013). It was not until the 1970s and 1980s that Agrobacterium rhizogenes was identified as the bacterial agent that, through the gene transfer of the bacterial Ri (root-inducing) plasmid, induces HR syndrome (Chilton et al., 1982; Gelvin, 2009). This important revelation prompted several studies that have helped to develop the hairy root culture (HRC) technology. We now know that HRs arise from the wounding site of plantlets when they are infected by A. rhizogenes, a symbiotic bacterium now taxonomically renamed Rhizobium rhizogenes. The infection takes place upon transfer of specific bacterial DNA fragments (T-DNA) from its root-inducing plasmid (pRi) into plant cells (Chilton et al., 1982). Even though the plant responds to bacterial infection by triggering the expression of several defense-related proteins to suppress the pathogen, R. rhizogenes has evolved counteracting mechanisms that take advantage of those plant defense proteins and consequently dismantle the plant defense pathways (Sevón and Oksman-Caldentey, 2002; Mauro et al., 2017).

To date, every system for recombinant protein production has its limitations, for instance the inability to produce and/or secrete functional complex proteins (e.g. bacterial systems), the risk of transmitting viruses and toxic molecules (e.g. bacterial systems, mammalian cells), or very high production costs (e.g. mammalian cells) (Cardon et al., 2019). Since the early 2000s, the production of therapeutic proteins using transgenic plants has appeared to offer a number of major advantages over other expression systems, including safety (no risk of human-threatening viral contamination), low upstream costs, and complex glycosylation. Plant cell suspensions and HRCs combine the intrinsic advantages of plants with confinement of production. In comparison to cell suspensions, HRCs present several advantages such as genotypic and phenotypic stability and possible extracellular secretion of expressed proteins (also referred to as rhizosecretion), offering a convenient method for target protein purification in a well-defined protein-deficient medium (Wang and Wu, 2013). HRCs are capable of producing complex compounds and are highly scalable (Häkkinen and Ritala, 2010; Stoger et al., 2014). In this context, the production of recombinant proteins has been considered a promising application of HRCs. It allows the expression of recombinant proteins by roots grown in a bioreactor and their secretion into the culture medium under controlled and confined conditions.

Likewise, R. rhizogenes-mediated transformation has allowed the successful production of different chemical compounds, also known as specialized metabolites or secondary metabolites, which are complex molecules naturally present in plants and display interesting features at the pharmacological, cosmetic, and nutraceutical levels. According to literature analysis, over the past three decades, R. rhizogenes transformation has also been used to elucidate physiological processes and biosynthetic pathways, to generate plant-derived molecules, to assist molecular breeding, to improve phytoremediation strategies, and to produce therapeutic recombinant proteins (Georgiev et al., 2012; Häkkinen and Oksman-Caldentey, 2018).

Due to their technological and economic advantages, HRCs have attracted increasing interest from academic research teams, biotechnology companies, and pharmaceutical industries. For example, according to the SCOPUS database, 607 articles published from 01/2012 to 11/2018 dealt with research using HRCs. According to the PubMed database, 767 scientific publications from 01/2012 to 02/2019 were identified with the keywords "hairy roots" or "hairy root" or "transformed root cultures" or "transformed root culture." When the terms HRCs and recombinant protein (RP) are queried, different subject areas are identified, of which the three most relevant are Biochemistry, Genetics and Molecular Biology (38.2%), Immunology and Microbiology (17%), and Agricultural and Biological Sciences (17%). In the SCOPUS database, 78 patents dealing with recombinant protein production using HRCs were identified. Most of these patents were published in 2017 and mainly describe methods to increase the production yield in plant material (Medina-Bolivar and Yang, 2017). Some are specifically dedicated to increasing the secretion of recombinant proteins (Jost et al., 2014). Overall, as illustrated in Figure 1, scientific research related to HRCs has been on the rise since the 1990s, with markedly increasing interest during the last 15 years (from the PubMed database).

FIGURE 1 | Evolution over time of the number of scientific publications dealing with hairy roots (HRs) (date of access: December 16th, 2019, with the keywords "hairy roots" or "hairy root").

## R. RHIZOGENES: A PARTICULAR BACTERIUM

R. rhizogenes is a gram-negative soil bacterium that lives near plant roots and ultimately causes the so-called "hairy root syndrome" in the infected plant host. This syndrome consists of a non-geotropic branching root overgrowth at the site of infection (Guillon et al., 2006). The molecular events involved in the formation of HRs are not yet entirely understood. However, the genetic transformation process can be divided into the following stages: (a) perception by Rhizobium of phenolic compounds (e.g. acetosyringone) released by the explant, usually after wounding, which activates attachment of the bacteria to explant/root cells; (b) processing of the T-DNA in bacterial cells and T complex formation (T strands and other associated proteins); (c) transfer of T complexes from the bacteria to the plant host genome; (d) T-DNA incorporation and expression in the plant genome; and (e) HRs emerging from the infection site (Georgiev et al., 2012).

The T-DNA of the Ri plasmid is randomly integrated into the plant genome and expressed as mRNA. Various loci such as the so-called vir region of the pRi, the T-DNA, and chromosomal virulence (chv) genes are vital for efficient transformation. These genes include virD1 and virD2, which encode proteins that attach to and cut DNA at the 25-bp T-DNA border repeat sequences (Georgiev et al., 2012). Proteins translated from the virE1 and virE2 genes are also significant, as they shield T strands from nuclease digestion and ease their integration into the plant chromosome. Even though some R. rhizogenes strains do not possess these genes, they still transfer T strands effectively because of the pRi GALLS gene, which encodes a protein with a nuclear localization signal and helicase activity (Gelvin, 2009). The T-DNA contains two independent sequences, namely the left and right regions, TL and TR, respectively. TL-DNA and TR-DNA are usually independently transferred and stably integrated into the genome of the host plant (Chandra, 2012). However, only the TL-DNA is vital and sufficient for HR induction. After sequencing of the TL-DNA, four open reading frames were found to be essential for HR induction (rolA, rolB, rolC, and rolD). The products of these rol genes have specific functions in HR formation; however, the rolB gene seems to be the most relevant for induction. The rol genes also strongly influence the phenotype of plants regenerated from HRCs (Sevón et al., 1997). In a loss-of-function study, it was discovered that knock-out of the rolB gene renders the plasmid avirulent (Mauro et al., 2017). Additionally, the rolB gene was shown to be involved in RNA silencing pathways through microRNA overexpression (Bulgakov et al., 2015). Finally, the rolB gene of R. rhizogenes is involved in the activation of the transcription factors of most specialized metabolites in HRCs as well as in the expression of chaperone-type proteins (Bulgakov et al., 2016; Bulgakov et al., 2018). However, the function of this activator is still poorly understood. Very recently, a critical role of rolA in long-term cultivation was also discovered, opening up a new research area to fully understand the function of the rol genes (Veremeichik et al., 2019). Overall, rol genes are described as modulators of plant growth and cell differentiation and could also mediate uncommon signal transduction pathways in plants (Bulgakov, 2008). Further research is still required to better understand all the molecular events involved in HR induction.

## HR PROCESS: GENERATION, CONSERVATION, AND CULTURE

The first comprehensive publication on state-of-the-art HRC technology (Hairy Roots: Culture and Applications) (Doran, 1997) provided information on laboratory protocols for initiation, culture, and genetic transformation of HRs. Additionally, the book described applications in plant propagation, alkaloid synthesis, and downstream processing considerations for large-scale HRCs. Today, the laboratory protocols for HRC remain highly comparable to those previously used.

To develop an optimal HRC system, the complexity of the molecules to be produced has to be considered very early in the process (molecular weight, potential toxicity, etc.), along with the intrinsic properties of the plant species selected for HR induction (growth capacity, ability to be transformed, etc.). Indeed, some species are more efficient than others in terms of productivity. This is particularly true for the production of specialized metabolites, where the choice of the plant to infect will influence the metabolites to be produced. For example, some hemp species are to be preferred depending on the cannabinoid compounds to be produced (De Meijer et al., 2003). The choice of plant species also needs to be considered for the production of recombinant proteins. As an example, one of the plant species most commonly used to produce recombinant proteins is Nicotiana tabacum. It has been demonstrated that this species is less efficient than Brassica rapa rapa for the production and secretion of at least eGFP under the same HRC conditions (Huet et al., 2014). Moreover, the stability over time of the production capacity of the HRs has to be considered (Huet et al., 2014; Häkkinen et al., 2016). Finally, when possible, the safety of the plant species has to be taken into account, in particular for pharmaceutical applications (e.g. edible plants). Overall, the two most important criteria when selecting the most appropriate plant species for HRC are its ability to produce and secrete high amounts of the molecule of interest and its biomass production capacity.

Usually, HRC induction involves cultivation of sterile wounded plant explants that are directly inoculated with R. rhizogenes (Figure 2A). When the goal is to produce recombinant proteins, the bacteria first have to be genetically engineered so as to carry the genes of interest to be later expressed by the HRs. Certain plant species, such as monocotyledonous species, are considered recalcitrant as they cannot be transformed using this method. For these recalcitrant species, a technique called sonication-assisted R. rhizogenes-mediated transformation (SAArT) has been demonstrated to be suitable for inducing HRCs (Georgiev et al., 2015). The explants are then treated with antibiotics to eradicate the bacteria. The resultant neoplastic HRs with multiple branches grow on hormone-free media. At this point, to confirm that these roots are indeed HRs and not adventitious roots, and that R. rhizogenes was efficiently eradicated, PCR is normally performed using primers that amplify rol and vir genes: rol genes are carried on the transferred T-DNA, whereas vir genes are not transferred to the plant, so a vir signal indicates residual bacteria. If a heterologous gene has been introduced into R. rhizogenes, a parallel PCR using transgene-specific primers is also used to confirm its integration in the HR genome.
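The PCR-based confirmation described above amounts to a small decision rule, which can be sketched as follows. This is only an illustrative sketch: the class name `PcrResult`, the function `interpret`, and the wording of the outcomes are hypothetical and not taken from any cited protocol. It assumes that a rol band indicates transferred T-DNA and that a vir band indicates bacteria that survived the antibiotic treatment.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PcrResult:
    """Band present (True) or absent (False) for one root line (hypothetical record)."""
    rol: bool                         # rol gene amplified
    vir: bool                         # vir gene amplified
    transgene: Optional[bool] = None  # None when no heterologous gene was used


def interpret(result: PcrResult) -> str:
    """Classify a root line from its PCR band pattern (simplified rule)."""
    if result.vir:
        # A vir signal is assumed to come from bacteria still present in the sample.
        return "residual bacteria: repeat antibiotic treatment"
    if not result.rol:
        # No rol signal: the root is assumed to be adventitious, not transformed.
        return "adventitious root: not transformed"
    if result.transgene is False:
        return "hairy root without transgene"
    return "confirmed hairy root"


if __name__ == "__main__":
    print(interpret(PcrResult(rol=True, vir=False, transgene=True)))
```

In practice, each band would be judged against positive and negative controls before being recorded as present or absent.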

Once developed and selected, the HR strains can undergo different maintenance procedures according to financial and practical constraints (Figure 2B). Currently, the most widely used preservation method is a monthly subculture of individualized HRs on solid and/or liquid media. This method is time-consuming and expensive, and may present high risks of contamination and eventual loss of the original strains. An alternative HR preservation method that avoids the abovementioned problems is cryopreservation of the HR clones (Georgiev et al., 2012). Cryopreservation refers to the long-term storage of a biological material (such as plant material) under specific conditions that avoid any genetic modification or alteration that may threaten the ability of the material to produce a well-characterized molecule of interest. To date, three different approaches have been proposed for the cryopreservation of plant material cultures in liquid nitrogen: (a) the slow cooling technique, (b) the vitrification method, and (c) the dehydration of immobilized cells within alginate. Few articles describe methods for the cryopreservation of plant materials, and especially HRs (Hirata et al., 1998; Xue et al., 2008; Lambert et al., 2009; Häkkinen et al., 2016). These methods have the advantage of being transferable to biobanks and of being in agreement with the usual definition of a Master and Working transgenic bank, thus facilitating the potential use of HRCs in GMP processes.

there are specialized metabolites and recombinant proteins. Transgenic HRs can also be used for phytoremediation or mechanism understanding.

One of the final goals of HRCs is to produce plant-expressed compounds of interest, ideally in large-scale bioreactors (Figure 2C). The use of bioreactors for HRCs has to be optimized according to the species and the molecules of interest. Some of the considerations for optimization are hairiness, thickness, length, and branching of the roots. Additionally, both metabolite production and cell growth are non-homogeneous according to the plant initially transformed and to the molecule to be produced, which further complicates bioreactor optimization (Doran, 2013; Lehmann et al., 2014). Moreover, monitoring root growth is challenging due to the inability to obtain homogeneous HR samples on a real-time basis. Furthermore, doubling time is also a crucial factor to be considered to ensure that HR growth occurs as expected. Despite all the above-mentioned constraints, when bioreactors can be developed, they are suitable for HRCs because of their self-containment with inflow and outflow systems for liquid and air, enabling controlled growth in sterile environments that contain only liquid nutrients. Generally, conditions such as pH, aeration, temperature, and dissolved gases can be controlled in bioreactors (Doran, 2013).
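Because doubling time is mentioned as a factor to monitor, a minimal sketch of how it could be estimated from two biomass measurements is given below. It assumes simple exponential growth between the two time points, which is an idealization for branching root tissue. The function names and the example values are hypothetical.

```python
import math


def specific_growth_rate(biomass_start: float, biomass_end: float, days: float) -> float:
    """Specific growth rate mu (per day), assuming exponential growth."""
    if biomass_start <= 0 or biomass_end <= biomass_start or days <= 0:
        raise ValueError("need 0 < biomass_start < biomass_end and days > 0")
    return math.log(biomass_end / biomass_start) / days


def doubling_time(biomass_start: float, biomass_end: float, days: float) -> float:
    """Doubling time in days: t_d = ln(2) / mu."""
    return math.log(2) / specific_growth_rate(biomass_start, biomass_end, days)


if __name__ == "__main__":
    # Hypothetical example: fresh weight increases fourfold over 10 days.
    print(round(doubling_time(1.0, 4.0, 10.0), 2))
```

Since homogeneous samples are hard to obtain in real time, the two measurements would often be the inoculum weight and the harvested weight at the end of a run.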

The bioreactors used for HRCs can be classified into gas-phase or liquid-phase reactors and are generally derived from classical bioreactors but adapted for the culture of plant tissues. Many studies have also been conducted using disposable bioreactors, e.g. wave-type bioreactors (Eibl and Eibl, 2008; Eibl et al., 2011; Georgiev et al., 2013). In brief, in gas-phase reactors, roots are exposed to air and other gas mixtures, and nutrients are commonly distributed to the roots as droplets of different sizes. Examples of this type of bioreactor are mist and trickle-bed bioreactors (Georgiev et al., 2013; Srikantan and Srivastava, 2018). Such bioreactors generally offer abundant oxygen supply. On the other hand, in liquid-phase reactors, roots are submerged in the medium (hence the name "submerged reactors"). Examples of liquid-phase reactors are stirred tank, airlift, and bubble column reactors (Doran, 2013). The advantages of this type of bioreactor are its well-known and simple design and construction, low risk of contamination, and low maintenance. Until now, most HRCs have been grown in liquid-phase bioreactors with volumes of up to 20–30 L (Lee et al., 1999; Sivakumar et al., 2005; Georgiev et al., 2013). However, for some HRCs, volumes of several hundreds of liters have been reached (Wilson et al., 1990; Samabriva's internal data). The emergence of companies able to produce HRs in optimized and perfectly controlled large-volume bioreactors offers interesting perspectives for an industrial use of this biological material in the future.

Finally, the classical downstream process of any biosynthetic compound takes place after production. The produced compound can be secreted into the medium, which facilitates downstream processing. Nevertheless, some products may remain within the cells, thus potentially complicating purification (Talano et al., 2012). It is interesting to note that non-natural or biosynthetically foreign compounds can sometimes be secreted into the culture medium more efficiently than compounds that are naturally produced by the HRs. This was shown earlier with tobacco HRs overexpressing hyoscyamine-6β-hydroxylase, which secreted up to 85% of the produced scopolamine into the culture medium, in contrast to Hyoscyamus HRs, where the majority of the product was retained in the intracellular space (Häkkinen et al., 2005). Other strategies that permeabilize HRs to optimize the release of the targeted metabolite or recombinant protein into the culture medium have also been used. Several approaches were tested, most of them involving treatment with organic solvents or surfactants (Chandra and Chandra, 2011).

The products being expressed by HRs can correspond to specialized metabolites, naturally produced by the plant, or recombinant, heterologous proteins. HRs can also serve as tools for the study of gene function, phytoremediation, molecular breeding, among others (Figure 2D).

## PRODUCTION OF RECOMBINANT PROTEINS

To our knowledge, the first proof of concept for HRCs was achieved by producing a mouse monoclonal antibody in HRs of tobacco plants (Wongsamuth and Doran, 1997). At that time, it was shown that this antibody was secreted and accumulated in the culture medium. Since then, other recombinant proteins have been produced and secreted by tobacco HRs, including for example the green fluorescent protein (GFP) (Medina-Bolívar and Cramer, 2004), human acetylcholinesterase (Woods et al., 2008), murine interleukin (Liu et al., 2009), thaumatin sweetener (Pham et al., 2012), human interferon alpha-2b (Luchakivskaia et al., 2012), recombinant human EPO (rhEPO) (Gurusamy et al., 2017), and recombinant alpha-L-iduronidase (Cardon et al., 2019). Even complex glycosylated proteins can be produced using HR platforms with highly homogeneous posttranslational profiles. As an example, Cardon et al. (2019) detailed the N- and O-glycosylation profiles of a human lysosomal enzyme produced from HRC and showed that all N-glycosylation sites of this protein were occupied by paucimannose profiles with a very high homogeneity. Such a paucimannosidic profile is of particular relevance for the production of proteins that could be used to treat patients with some lysosomal disorders (such as Gaucher disease or Fabry disease). Overall, numerous heterologous proteins have been produced using HR-based expression systems, including antigens, antibodies, enzymes, and immunomodulators, as described in Table 1.

The design of the construct used to express the recombinant protein is a critical step. Commonly, a construct contains a promoter, a signal peptide to direct the heterologous protein to the expected pathway (i.e. the secretory pathway), the gene encoding the protein of interest, and a polyadenylation sequence. Gene constructs are strategically designed to contain a strong promoter for high-level gene expression. The strong constitutive cauliflower mosaic virus 35S promoter (CaMV35S) or its enhanced version (deCaMV35S) has been most commonly used to drive transgene expression in HRCs (Georgiev et al., 2012). Besides these constitutive promoters, inducible promoters can also be used. For instance, glucocorticoid- or thermo-inducible promoters are used to drive controlled gene expression at the required times in HRC systems (Sun and Peebles, 2016). Moreover, the use of an enhancer such as the TMV omega enhancer could help to improve productivity. Endoplasmic reticulum (ER) retention signals such as the well-known KDEL or HDEL sequences can be incorporated in the construct in order to increase the intracellular content of the heterologous protein. This strategy can be relevant in some situations to prevent protein degradation in the culture medium, thus enhancing the overall production yield. For example, the addition of the KDEL sequence at the C-terminal end of the 14D9 antibody increases its accumulation level in tobacco HRCs (Chahardoli et al., 2018). On the other hand, when the goal is to target a foreign protein to the secretory pathway, a dedicated signal peptide such as the ER signal peptide cal (N-terminal calreticulin fusion sequence) can be incorporated in the construct. For example, this signal peptide proved essential for the secretion of human erythropoietin into the medium of N. tabacum HRCs (Gurusamy et al., 2017).
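The modular layout of such a construct can be illustrated with a short Python sketch. It is a simplified illustration rather than a design tool: the class `ExpressionConstruct`, its field names, and the rule used in `expected_location` are assumptions introduced here, and the rule only encodes the general tendencies described above (a signal peptide directs the protein to the secretory pathway, whereas a KDEL or HDEL signal retains it inside the cell).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ExpressionConstruct:
    """Hypothetical, simplified model of an expression cassette."""
    promoter: str                           # e.g. "CaMV35S"
    gene: str                               # coding sequence of interest
    terminator: str                         # polyadenylation sequence
    signal_peptide: Optional[str] = None    # N-terminal, e.g. "cal"
    retention_signal: Optional[str] = None  # C-terminal, e.g. "KDEL"
    enhancer: Optional[str] = None          # e.g. "TMV omega"

    def cassette(self) -> List[str]:
        """Return the elements that are present, in 5' to 3' order."""
        parts = [self.promoter, self.enhancer, self.signal_peptide,
                 self.gene, self.retention_signal, self.terminator]
        return [p for p in parts if p is not None]

    def expected_location(self) -> str:
        """Very coarse rule of thumb; real outcomes are protein-dependent."""
        if self.signal_peptide is None:
            # Without a signal peptide the protein does not enter the
            # secretory pathway, so a retention signal has no effect.
            return "cytosol"
        if self.retention_signal in ("KDEL", "HDEL"):
            return "retained in the cell"
        return "secreted into the culture medium"


# Illustrative constructs loosely modeled on the examples cited above.
epo = ExpressionConstruct("CaMV35S", "rhEPO", "polyA", signal_peptide="cal")
antibody = ExpressionConstruct("CaMV35S", "14D9", "polyA",
                               signal_peptide="cal", retention_signal="KDEL")
print(epo.cassette(), epo.expected_location())
print(antibody.cassette(), antibody.expected_location())
```

In practice, secretion also depends on protein size and physicochemical properties, so a rule of this kind can only serve as a first orientation when planning a construct.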

Once the construct has been designed, R. rhizogenes-mediated transformation can be applied to introduce the heterologous gene of interest into the recipient plant without changing the genetic architecture of the plant host, enabling the transgenic HR to produce the respective heterologous protein. The most common strategy consists of modifying the R. rhizogenes strain by introducing the heterologous gene of interest into the bacterium via classical molecular biology methods before infecting the plant. In this case, standard plant expression binary vectors harboring a specific T-DNA with the gene of interest are used. This strategy yields stable integration of the gene(s) of interest. Because insertion into the plant genome is random, it is necessary to identify the most robust clones among all those generated in terms of productivity and growth capacity, as illustrated by Huet et al. (2014). As described in Donini and Marusic (2018), a second strategy consists of directly infecting a transgenic plant that already expresses the heterologous protein with the wild-type strain of R. rhizogenes. Thus, no genetic engineering of the bacterial strain is required. In this respect, the availability of plant mutant banks offers a wide range of possibilities.
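The clone screening step mentioned above can be sketched as a simple filter-and-rank procedure. This is a minimal sketch under stated assumptions and not the procedure of Huet et al. (2014): the names `Clone` and `select_clones`, the growth threshold, and all numerical values are hypothetical placeholders chosen only to show the logic.

```python
from typing import List, NamedTuple


class Clone(NamedTuple):
    """Hypothetical record for one independent HR clone."""
    name: str
    titre_mg_per_l: float       # product titre (productivity)
    growth_g_per_l_day: float   # biomass increase (growth capacity)


def select_clones(clones: List[Clone], min_growth: float,
                  top_n: int) -> List[Clone]:
    """Keep clones that grow well enough, then rank them by titre."""
    viable = [c for c in clones if c.growth_g_per_l_day >= min_growth]
    ranked = sorted(viable, key=lambda c: c.titre_mg_per_l, reverse=True)
    return ranked[:top_n]


# Placeholder values for illustration only (not measured data).
candidates = [
    Clone("HR-01", titre_mg_per_l=12.0, growth_g_per_l_day=0.8),
    Clone("HR-02", titre_mg_per_l=30.0, growth_g_per_l_day=0.2),
    Clone("HR-03", titre_mg_per_l=18.0, growth_g_per_l_day=1.1),
]
for clone in select_clones(candidates, min_growth=0.5, top_n=2):
    print(clone.name)
```

Filtering on growth before ranking on titre reflects the point that a high-producing clone is of little use if it cannot be propagated; other weightings of the two criteria are equally conceivable.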

As mentioned above, the recombinant proteins can be either retained within the cells or secreted into the culture medium. Large proteins can be secreted into the culture medium of HRCs (Cardon et al., 2019), enabling the production of a large spectrum of proteins (Table 1). As an example, alpha-L-iduronidase, a complex human protein of 72 kDa, is well secreted into the culture medium once produced by HRCs (Cardon et al., 2019). Beyond 110 kDa, the secretion of the protein requires the use of particular strategies, such as cell wall permeabilization (e.g. using DMSO) if the protein remains bound to the biomass (Talano et al., 2012). Moreover, the physicochemical characteristics of the proteins, such as hydrophobicity and charge, can also be important determinants of secretion and/or retention. The use of protein-stabilizing agents in the plant culture medium also represents a promising method for maintaining stable productivity. Examples of protein stabilizers and protease inhibitors that can be employed are bovine serum albumin (BSA) (for protein-based stabilization), gelatine, PEG, PVP, and other polymers (to protect proteins from plant cell denaturing agents), and mannitol (to regulate the osmotic pressure of the medium, thus minimizing cell lysis) (Alvarez and Alvarez, 2014).

In summary, all steps of the process have to be carefully designed when producing recombinant proteins in HRCs, from the selection of the starting plant material to the molecular strategy, considering not only the particularities of the heterologous gene structure itself but also the features of the resulting protein according to the source from which it will be purified. When those criteria are met, the production of functional proteins at the expected productivity is more likely to be achieved.

#### PRODUCTION OF SPECIALIZED METABOLITES

For the production of specialized metabolites, HRCs are more appropriate than cell or callus cultures due to characteristics such as genetic stability, high biomass production, and efficient biosynthetic capacity. Moreover, HRCs are able to produce specialized metabolites for a long period of time (Peebles et al., 2009; Häkkinen et al., 2016). Specialized metabolites are usually produced using non-transgenic HRCs. In this case, the plants selected are those species which naturally produce the compound of interest. Some examples of specialized metabolites already produced in wild-type HRCs are azadirachtin (biopesticide) (Thakore and Srivastava, 2017), betalain (red pigments for the food industry) (Pavlov et al., 2005), camptothecin (antitumor agent used in the treatment of ovarian and colorectal cancers) (Wetterauer et al., 2018), and withanolide A (brain regenerative compound) (Shajahan et al., 2017). A transgene can also be introduced into wild-type HRs in order to increase the amount of a particular specialized metabolite (Jouhikainen et al., 1999; Moyano et al., 2003; Zhang et al., 2004; Ritala et al., 2014). As an example, gene silencing and knock-out strategies have been used either to avoid negative regulation or to modify a biosynthetic pathway, thereby improving the production of specific compounds (Mehrotra et al., 2015). Alternatively, the over- or co-expression of enzymes involved in the biosynthetic pathway may be instrumental in the production of high-value metabolites. Some examples of specialized metabolites produced using transgenic HRCs are tropane alkaloids such as scopolamine and hyoscyamine (Jouhikainen et al., 1999; Häkkinen et al., 2016; Guo et al., 2018; Khezerluo et al., 2018), catharanthine (Hu, 2006), ginsenosides (Woo et al., 2004), solanoside (Putalun et al., 2004), and vitamin C (Wevar Oller et al., 2009). Sirikantaramas and Taura (2017), utilizing transgenic HRCs and substrate feeding, were able to produce cannabinoids in N. tabacum, even though this species is normally not able to produce such specialized metabolites. Reviews describe some examples of valuable metabolites produced in transgenic HRs and their associated strategies, e.g. the concept of phytoremediation (Doran, 2009; Vaghari et al., 2017).

Regardless of the strategy (i.e. use of transgenic or non-transgenic HRCs), the production of specialized metabolites has been successfully improved by optimizing the growth conditions (e.g. carbon source, aeration, pH, light/dark, culture medium composition), by selecting the clone (Sevón et al., 1998), and/or by paying particular attention to the selection of adapted elicitors (Wang and Wu, 2013; Perassolo et al., 2017; Halder et al., 2018). The production of specialized metabolites in HRCs, as in all plant-based production systems, requires the identification of the most appropriate elicitors and administration scheme. There is substantial literature exemplifying the elicitors that can be used to produce specialized metabolites using in vitro plant tissue culture systems (Zhou and Wu, 2006; Ramakrishna and Ravishankar, 2011; Hussain et al., 2012; Murthy et al., 2014; O'Kennedy et al., 2016). Not all of these elicitors have been tested in HRCs, and only a few of the most representative examples have been selected here to illustrate the current trend in the use of elicitors (see Table 2). Table 2, although not exhaustive, highlights the great diversity of elicitors available. However, a thorough analysis of all elicitors used in the literature shows that some candidates have been more frequently used and/or successfully tested. These include methyl jasmonate, jasmonic acid, chitosan, and salicylic acid. The ability of an elicitor to induce a metabolic pathway depends on the plant and on the metabolite of interest. The ability of certain elicitors to affect the secretion of the metabolite also has to be considered (Bais et al., 2003; Thimmaraju et al., 2003). Elicitors seem to be recognized by specific receptors on the surface of the plant cell or at the intracellular level, which triggers a signal transduction cascade leading to the stimulation of a characteristic set of plant defense responses (Nürnberger, 1999).

Various studies have also shown that there is a synergistic effect between several elicitors. This is the case, for example, with the combined use of biotic and abiotic elicitors, which makes it possible to increase the production of tanshinones in Salvia miltiorrhiza HRCs (Yan et al., 2006). The increase in productivity can be moderate, such as the 4.5-fold increase in isoflavonoid production in Pueraria candollei HRCs after elicitation with yeast extract (Udomsuk et al., 2011), or considerable, such as the 570-fold increase in stilbene excretion after elicitation with MeJA and methyl-β-cyclodextrins in Vitis vinifera HRs (Tisserant et al., 2016). Moreover, beyond the type of elicitor, its dosage (neither so high as to be toxic nor too low to have an effect), the exposure time, and the nature and age of the plant material to which it is applied are of particular relevance.
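The fold increases quoted above are simply the ratio of the yield obtained with elicitation to the yield of the non-elicited control. A minimal Python sketch of this arithmetic, combined with a naive dose screen, is given below. The function names `fold_increase` and `best_dose`, the doses, and the yield values are hypothetical placeholders and do not correspond to data from the cited studies.

```python
from typing import Dict, Tuple


def fold_increase(elicited_yield: float, control_yield: float) -> float:
    """Ratio of elicited to control yield (same units for both)."""
    if control_yield <= 0:
        raise ValueError("control yield must be positive")
    return elicited_yield / control_yield


def best_dose(control_yield: float,
              yield_by_dose: Dict[float, float]) -> Tuple[float, float]:
    """Return the dose giving the highest yield and its fold increase."""
    dose = max(yield_by_dose, key=yield_by_dose.get)
    return dose, fold_increase(yield_by_dose[dose], control_yield)


# Placeholder screen: yield rises with dose, then drops at a toxic dose.
screen = {50.0: 5.0, 100.0: 9.0, 200.0: 3.0}  # dose -> yield (arbitrary units)
dose, fold = best_dose(control_yield=2.0, yield_by_dose=screen)
print(f"best dose: {dose}, fold increase: {fold}")
```

The placeholder screen mimics the dosage window mentioned in the text: the intermediate dose performs best, while the highest dose gives a lower yield, as would be expected if it were toxic.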

A recently used approach for optimizing the production of specialized metabolites in HRCs is in silico modeling (e.g. fuzzy logic-based simulations, neural networks). This type of computer modeling approach has been used to optimize the chemical culture conditions of HRCs (Lenk et al., 2014).

#### CONCLUSIONS AND FUTURE PERSPECTIVES

In theory, HRCs can be induced from basically all plant species. Therefore, this technology could potentially be applied to rare, valuable, threatened, or endemic medicinal species in an effort to preserve biodiversity. Moreover, HRCs have become important tools for studying the biosynthesis of plant-derived molecules, and they are appealing bioproduction systems with different advantages over cell suspensions and open-field-grown plants. HRCs are also well-suited model systems for recombinant protein/specialized metabolite production, or for unraveling the intricate interactions implicated in phytoremediation.

TABLE 2 | Examples of specialized metabolites produced by HRCs and examples of elicitors used.

Using recent molecular biology tools, it is now possible to develop high-producing clones that are able to secrete functional and efficient molecules for different industrial applications (food, cosmetics, or pharmaceuticals). Thus, the current state of HRC technology should have an impact on the broad commercialization of HRCs, given their many applications. This increasing interest in HRCs requires the development of robust, cost-effective, GMP-ready processes with well-controlled systems.

Moreover, new molecular biology tools (e.g. the CRISPR/Cas9 technology) open strong prospects for the development of HRCs. According to the PubMed Central database, the first successful transformation of HRs using the CRISPR/Cas9 strategy was reported in 2014, when transgenic tomato HRs producing eGFP were published (Ron et al., 2014). CRISPR/Cas9 has been used as a tool to study and optimize mutagenesis in HRs of soybean (Glycine max) and Medicago truncatula (Michno et al., 2015; Du et al., 2016) and, more recently, for the knock-out of a gene involved in HR architecture (BcFLA1) in Brassica carinata (Kirchner et al., 2017). The possibility of generating HR clones harboring knocked-out gene(s) will facilitate the understanding or modification of metabolic pathways. This kind of experimentation can be used to reinforce the production of specialized metabolites, to modify posttranslational modifications of recombinant proteins (e.g. glycosylation features), or to help in the elucidation of biosynthetic pathways.

Due to initially poor productivity and scale-up issues, HRCs have until now not been used at an industrial level, except in the cosmetic area, where an extract from a basil HRC is commercialized to treat hair loss. The strong optimization of the process in terms of production capacity and bioreactor size, together with the ability to modify HRCs to produce tailor-made complex molecules, paves the way for this technology to take a prominent place as a future biotechnology tool in plant molecular farming.

#### AUTHOR CONTRIBUTIONS

FC and AR conceived the idea of this review and designed the overall concept. FC and NGV were the most involved in the writing. SH, CL, MG, KMOC, and AR also participated in the writing of the review and/or reviewed it. All authors approved the final version.

#### FUNDING

This review was partly supported by the European Union's Horizon 2020 Research and Innovation Program (PharmaFactory project under grant agreement No 774078).




Conflict of Interest: FC, CL, and MG are employed by the company Samabriva SA.

The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2020 Gutierrez-Valdes, Häkkinen, Lemasson, Guillet, Oksman-Caldentey, Ritala and Cardon. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.