# NEXT GENERATION AGRICULTURE: UNDERSTANDING PLANT LIFE FOR FOOD, HEALTH AND ENERGY

EDITED BY: Domenico De Martinis, Eugenio Benvenuto, Nicola Colonna, Briardo Llorente and Edward Rybicki

PUBLISHED IN: Frontiers in Plant Science

### Frontiers eBook Copyright Statement

The copyright in the text of individual articles in this eBook is the property of their respective authors or their respective institutions or funders. The copyright in graphics and images within each article may be subject to copyright of other parties. In both cases this is subject to a license granted to Frontiers. The compilation of articles constituting this eBook is the property of Frontiers.

Each article within this eBook, and the eBook itself, are published under the most recent version of the Creative Commons CC-BY licence. The version current at the date of publication of this eBook is CC-BY 4.0. If the CC-BY licence is updated, the licence granted by Frontiers is automatically updated to the new version.

When exercising any right under the CC-BY licence, Frontiers must be attributed as the original publisher of the article or eBook, as applicable.

Authors have the responsibility of ensuring that any graphics or other materials which are the property of others may be included in the CC-BY licence, but this should be checked before relying on the CC-BY licence to reproduce those materials. Any copyright notices relating to those materials must be complied with.

Copyright and source acknowledgement notices may not be removed and must be displayed in any copy, derivative work or partial copy which includes the elements in question.

All copyright, and all rights therein, are protected by national and international copyright laws. The above represents a summary only. For further information please read Frontiers' Conditions for Website Use and Copyright Statement, and the applicable CC-BY licence.

ISSN 1664-8714 ISBN 978-2-88966-062-9 DOI 10.3389/978-2-88966-062-9


### Topic Editors:

Domenico De Martinis, Italian National Agency for New Technologies, Energy and Sustainable Economic Development (ENEA), Italy

Eugenio Benvenuto, Italian National Agency for New Technologies, Energy and Sustainable Economic Development (ENEA), Italy

Nicola Colonna, Italian National Agency for New Technologies, Energy and Sustainable Economic Development (ENEA), Italy

Briardo Llorente, Macquarie University, Australia

Edward Rybicki, University of Cape Town, South Africa

Citation: De Martinis, D., Benvenuto, E., Colonna, N., Llorente, B., Rybicki, E., eds. (2020). Next Generation Agriculture: Understanding Plant Life for Food, Health and Energy. Lausanne: Frontiers Media SA. doi: 10.3389/978-2-88966-062-9

# Table of Contents

*Editorial: Next Generation Agriculture: Understanding Plant Life for Food, Health and Energy*

Domenico De Martinis, Edward P. Rybicki, Nicola Colonna, Eugenio Benvenuto and Briardo Llorente

*Expression of a Chloroplast-Targeted Cyanobacterial Flavodoxin in Tomato Plants Increases Harvest Index by Altering Plant Size and Productivity*

Martín L. Mayta, Rocío C. Arce, Matias D. Zurbriggen, Estela M. Valle, Mohammad-Reza Hajirezaei, María I. Zanor and Néstor Carrillo

*Combating Micronutrient Deficiency and Enhancing Food Functional Quality Through Selenium Fortification of Select Lettuce Genotypes Grown in a Closed Soilless System*

Antonio Pannico, Christophe El-Nakhel, Marios C. Kyriacou, Maria Giordano, Silvia Rita Stazi, Stefania De Pascale and Youssef Rouphael


Fernanda Gabriela González, Nicolás Rigalli, Patricia Vivian Miranda, Martín Romagnoli, Karina Fabiana Ribichich, Federico Trucco, Margarita Portapila, María Elena Otegui and Raquel Lía Chan


Hrvoje Fulgosi and Lea Vojta

Cynara cardunculus *L. as a Multipurpose Crop for Plant Secondary Metabolites Production in Marginal Stressed Lands*

Helena Domenica Pappalardo, Valeria Toscano, Giuseppe Diego Puglia, Claudia Genovese and Salvatore Antonino Raccuia

*Enhancing Biomass and Lutein Production From* Scenedesmus almeriensis*: Effect of Carbon Dioxide Concentration and Culture Medium Reuse*

Antonio Molino, Sanjeet Mehariya, Angela Iovine, Patrizia Casella, Tiziana Marino, Despina Karatza, Simeone Chianese and Dino Musmarra

*Genetic Control of Reproductive Traits in Tomatoes Under High Temperature*

Maria José Gonzalo, Yi-Cheng Li, Kai-Yi Chen, David Gil, Teresa Montoro, Inmaculada Nájera, Carlos Baixauli, Antonio Granell and Antonio José Monforte


Ryan M. Lefers, Mark Tester and Kyle J. Lauersen

*Phosphorylation of ADP-Glucose Pyrophosphorylase During Wheat Seeds Development*

Danisa M. L. Ferrero, Claudia V. Piattoni, Matías D. Asencion Diez, Bruno E. Rojas, Matías D. Hartman, Miguel A. Ballicora and Alberto A. Iglesias

*Global Role of Crop Genomics in the Face of Climate Change*

Mohammad Pourkheirandish, Agnieszka A. Golicz, Prem L. Bhalla and Mohan B. Singh

# Editorial: Next Generation Agriculture: Understanding Plant Life for Food, Health and Energy

Domenico De Martinis<sup>1</sup>\*, Edward P. Rybicki<sup>2</sup>, Nicola Colonna<sup>1</sup>, Eugenio Benvenuto<sup>1</sup> and Briardo Llorente<sup>3,4</sup>

<sup>1</sup> ENEA Italian National Agency for New Technologies, Energy and Sustainable Economic Development, Rome, Italy, <sup>2</sup> Biopharming Research Unit, Department of Molecular and Cell Biology, University of Cape Town, Cape Town, South Africa, <sup>3</sup> ARC Center of Excellence in Synthetic Biology, Macquarie University, Sydney, NSW, Australia, <sup>4</sup> CSIRO Synthetic Biology Future Science Platform, Sydney, NSW, Australia

Keywords: climate change, agriculture, bioproducts, food, biofuel, bioeconomy, artificial intelligence, molecular farming

Editorial on the Research Topic

Next Generation Agriculture: Understanding Plant Life for Food, Health and Energy

### Edited and reviewed by:

Abraham J. Escobar-Gutiérrez, Institut National de Recherche pour l'agriculture, l'alimentation et l'environnement (INRAE), France

\*Correspondence:

Domenico De Martinis domenico.demartinis@enea.it

### Specialty section:

This article was submitted to Crop and Product Physiology, a section of the journal Frontiers in Plant Science

Received: 14 July 2020 Accepted: 28 July 2020 Published: 12 August 2020

### Citation:

De Martinis D, Rybicki EP, Colonna N, Benvenuto E and Llorente B (2020) Editorial: Next Generation Agriculture: Understanding Plant Life for Food, Health and Energy. Front. Plant Sci. 11:1238. doi: 10.3389/fpls.2020.01238

Current global population growth and the associated increasing demands on farming, together with the threat of climate change and the need for better environmental protection, pose formidable challenges for the agriculture of the future. Crop productivity is already reaching high capacity but will have to increase perhaps by as much as 100% to sustain a world population of nearly 10 billion people by 2050. These global challenges are also not evenly distributed. While developed countries face the needs derived from an aging population, highly urbanized areas, and stagnation of cultivated land, developing nations are blooming in terms of population growth, the building of infrastructure, and the use of land for agriculture. Less developed countries, on the other hand, also face demographic growth but generally lack modern infrastructure and efficient farming practices required for the expansion of agricultural production. Climate change also afflicts different regions of the world with varying intensity and in very diverse ways, generating complex effects on agricultural systems, which remain hitherto unpredictable. This Research Topic provides a perspective of how and where agriculture will be conducted in the future, what will be cultivated, and for what purpose.

Studies published in this topic provide an overview of how different technologies and research areas may converge in agricultural innovation. Biotechnology could "intrinsically" improve crop performance (Fulgosi and Vojta), make plant crops more climate-resilient (Soto et al.; Gonzalo et al.), productive (Mayta et al.; Grossi et al.), and nutritious (Pannico et al.; Yuan and Li; Ferrero et al.), as well as less dependent on the use of agrochemicals (Fabian et al.), and tailored for biofuel production (Gangwar and Shankar).

Technology-intensive disciplines could also be of great support to agriculture. Artificial Intelligence could play a game-changing role in the automation of agricultural practices (Jin et al.; Fabris et al.), freeing farmers from the workload of traditional agriculture and from the variability that still shapes the output of agricultural production.

Future agriculture is expected to produce food, energy, pharmaceuticals, and other high-value commodities, and may take place beyond traditional cultivated lands. In the past, humanity learned to claim agricultural land by draining and terracing. Now, climate change and scarcity of arable land might lead future agriculture to respond to production needs (Pourkheirandish et al.) by moving to more extreme environments (Pappalardo et al.), taking advantage of algae cultivation (Molino et al.), and merging technologies to create greenhouses to enable sustainable agriculture in desert areas (Lefers et al.). Biocontainment approaches to enable molecular farming in large-scale field conditions are discussed in this issue (Clark and Maselko). Concurrently, we have to consider that, in the near future, more crops will also be grown in urban environments, with little or no ground available. Urban farming in cities might help reduce the environmental impact of agriculture while contributing to sustainably achieving food security.

Agriculture without land is already happening and will eventually evolve into agricultural systems to support humans on Earth and beyond, on the International Space Station<sup>1</sup>, and for space exploration<sup>2</sup>.

This editorial work was carried out between Fall 2019 and Spring 2020, as the COVID-19 pandemic took hold. Counting the accepted papers only, this Research Topic involved 47 laboratories, 116 authors, and 39 referees from 24 countries worldwide (Argentina, Brazil, Mexico, Cuba, USA, Australia, Japan, South Korea, Taiwan, China, India, Israel, Saudi Arabia, South Africa, Turkey, Cyprus, Croatia, Italy, Spain, Portugal, Czech Republic, Germany, UK, Poland). Working on future knowledge during a global emergency, with so many people in lockdown, has not been easy, either mentally or practically. Now that the emergency phase of the health crisis seems to be under control, it has clearly highlighted a lesson about our ability to continuously provide society with basic needs such as energy, water, and food. The issue of food security, which has been largely neglected in recent years in developed countries, has returned to the fore and to governments' agendas. The primary sector must continue to ensure healthy and sufficient food for all, even during crises like the one we have experienced. When the movement of people and goods is stopped, it is of paramount importance to have local food production capacity to cope with the unavailability of external sources. The agriculture of the future will have to sustain a world population with different needs and opportunities and provide resources even during a global crisis, while at the same time reducing environmental impact amid the general upheaval of climatic conditions.

To achieve this, advances at the frontiers of plant science will become essential. Knowledge efforts in the field of physical–chemical life sciences applied to the agricultural production sector must be supported to increase the resilience of the global agri-food system. These include expanding the use of plants for the production of novel materials, complex chemicals, pharmaceuticals and biologics, and bioenergy, as well as understanding how to improve our farming practices toward a circular economy. Next-generation agriculture will certainly shape our future. It will take advantage of Smart and Molecular Farming, Data, and Artificial Intelligence; it will move into the cities and go vertical, occur in extreme environments, and support the human conquest of extraterrestrial space.

### AUTHOR CONTRIBUTIONS

The authors contributed equally to the work.

Conflict of Interest: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2020 De Martinis, Rybicki, Colonna, Benvenuto and Llorente. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

<sup>1</sup> Meals Ready to Eat: Expedition 44 Crew Members Sample Leafy Greens Grown on Space Station. NASA Newsletters. https://www.nasa.gov/mission\_pages/station/research/news/meals\_ready\_to\_eat

<sup>2</sup> Ground Demonstration of Plant Cultivation Technologies for Safe Food Production in Space https://eden-iss.net/

# Expression of a Chloroplast-Targeted Cyanobacterial Flavodoxin in Tomato Plants Increases Harvest Index by Altering Plant Size and Productivity

*Martín L. Mayta<sup>1‡</sup>, Rocío C. Arce<sup>1‡</sup>, Matias D. Zurbriggen<sup>1†</sup>, Estela M. Valle<sup>1</sup>, Mohammad-Reza Hajirezaei<sup>2</sup>, María I. Zanor<sup>1</sup>\* and Néstor Carrillo<sup>1</sup>\**

<sup>1</sup> Instituto de Biología Molecular y Celular de Rosario (IBR-UNR/CONICET), Facultad de Ciencias Bioquímicas y Farmacéuticas, Universidad Nacional de Rosario (UNR), Rosario, Argentina, <sup>2</sup> Leibniz Institute of Plant Genetics and Crop Plant Research, Stadt Seeland, Germany

Tomato is the most important horticultural crop worldwide. Domestication has led to the selection of highly fruited genotypes, and the harvest index (HI), defined as the ratio of fruit yield over total plant biomass, is usually employed as a biomarker of agronomic value. Improvement of HI might then result from increased fruit production and/or lower vegetative growth. Reduction in vegetative biomass has been accomplished in various plant species by expression of flavodoxin, an electron shuttle flavoprotein that interacts with redox-based pathways of chloroplasts including photosynthesis. However, the effect of this genetic intervention on the development of reproductive organs has not been investigated. We show herein that expression of a plastid-targeted cyanobacterial flavodoxin in tomato resulted in significant reduction of plant size affecting stems, leaves, and fruit. Decreased size correlated with smaller cells and was accompanied by higher pigment contents and photosynthetic activities per leaf cross-section. Flavodoxin accumulated in green fruit but declined with ripening. Significant increases in HI were observed in flavodoxin-expressing lines due to the production of higher fruit number per plant in smaller plants. Therefore, overall yields can be enhanced by increasing plant density in the field. Metabolic profiling of ripe red fruit showed that levels of sugars, organic acids, and amino acids were similar or higher in transgenic plants, indicating that there was no trade-off between increased HI and fruit metabolite contents in flavodoxin-expressing plants. Taken together, our results show that flavodoxin has the potential to improve major agronomic traits when introduced in tomato.

Keywords: tomato, flavodoxin, chloroplasts, transgenic plants, harvest index

### Edited by:

Briardo Llorente, Macquarie University, Australia

### Reviewed by:

Nunzia Scotti, Institute of Bioscience and Bioresources (CNR), Italy

José Tomás Matus, Instituto de Biología Integrativa de Sistemas (UV+CSIC), Spain

### \*Correspondence:

María I. Zanor zanor@ibr-conicet.gov.ar

Néstor Carrillo carrillo@ibr-conicet.gov.ar

### †Present address:

Matias D. Zurbriggen, Institute of Synthetic Biology and CEPLAS, University of Düsseldorf, Düsseldorf, Germany

‡These authors have contributed equally to this work

### Specialty section:

This article was submitted to Plant Biotechnology, a section of the journal Frontiers in Plant Science

Received: 24 May 2019 Accepted: 15 October 2019 Published: 08 November 2019

### Citation:

Mayta ML, Arce RC, Zurbriggen MD, Valle EM, Hajirezaei M-R, Zanor MI and Carrillo N (2019) Expression of a Chloroplast-Targeted Cyanobacterial Flavodoxin in Tomato Plants Increases Harvest Index by Altering Plant Size and Productivity. Front. Plant Sci. 10:1432. doi: 10.3389/fpls.2019.01432

# INTRODUCTION

Harvest index (HI) is defined as the ratio of grain, fruit, or tuber yield to total plant biomass (Gur et al., 2010) and reflects the ability of a sink tissue to capitalize on the availability of photosynthates to increase the yield of harvestable product. As such, HI has been regarded as a reference parameter to evaluate the progress of breeding programs aimed at improving yield potential. Indeed, the so-called "green revolution" that took place in the middle of the twentieth century largely stemmed from major increases in HI resulting from the development of dwarf varieties of rice and wheat with diminished leaf biomass coupled to similar or higher grain yields (Khush, 2001). These dwarfing traits were found to result from mutations of genes involved in gibberellin synthesis and signaling (Hedden, 2003). Crossing of the mutant lines with high-yielding varieties led to new cultivars displaying a greater proportion of photoassimilates partitioned into the grain (Langridge, 2014).

Intensive research has been carried out to identify genes affecting vegetative and reproductive growth in a way that favors high HI. Association mapping in rice (Li et al., 2012) and rapeseed (Luo et al., 2015) indicated that HI is a complex multigenic trait affected by both environmental and genetic determinants. HI can be improved by favoring nutrient transport from leaves to harvestable organs and/or by decreasing vegetative growth (Luo et al., 2015). An early example of the latter approach was provided by tobacco plants over-expressing a phytochrome gene that exhibited impaired shade avoidance causing proximity-dependent dwarfing and higher HI (Robson et al., 1996). Although many quantitative trait loci associated with HI have been reported in various plant species, the specific genes and molecular mechanisms determining the magnitude of this index are largely unknown.

While a higher HI has a general relevance for crop yield enhancement, its agronomic value is particularly important in the case of species that are grown for industrial purposes (e.g. processing tomatoes) as a strategy to obtain smaller and more compact plants with equivalent or higher fruit production that are expected to render increased yields per planted surface (Gur et al., 2010).

Decreases in plant and leaf size have been observed in creeping bentgrass and *Arabidopsis* lines expressing a chloroplast-located flavodoxin (Fld; Li et al., 2017; Su et al., 2018). Fld is an electron shuttle flavoprotein found in cyanobacteria and some marine algae, which mediates essentially the same electron transfer reactions as the iron-sulfur protein ferredoxin (Fd; Pierella Karlusich et al., 2014). Fd transcript and protein levels are down-regulated by most environmental stresses (Pierella Karlusich et al., 2014, and references therein), and under such conditions Fld expression is induced to take over the activities of its functional counterpart and allow growth and reproduction of the microorganism in the adverse situation (Zurbriggen et al., 2008; Pierella Karlusich et al., 2014). Fld-encoding genes are absent from plant genomes (Pierella Karlusich et al., 2015), but introduction of a plastid-targeted Fld in transgenic plants resulted in increased tolerance to multiple sources of biotic and abiotic stress (Tognetti et al., 2006; Tognetti et al., 2007; Zurbriggen et al., 2008; Zurbriggen et al., 2009; Coba de la Peña et al., 2010; Li et al., 2017; Rossi et al., 2017).

In this study we transformed tomato plants with DNA sequences encoding a cyanobacterial Fld directed to chloroplasts (*Slpfld* lines, for *Solanum lycopersicum* **p**lastidic **Fld**) or the **c**ytosol (*Slcfld* lines), and evaluated vegetative and reproductive growth to determine if tomato HI could be increased by this genetic intervention. Mature-sized Fld was detected in leaves and fruit, but its levels declined with fruit ripening, in parallel with the general decline of total soluble protein. Lines expressing plastid-targeted Fld displayed a number of distinct phenotypic features compared to wild-type (WT) and *Slcfld* siblings, including smaller plants, leaves, and fruits; more flowers per inflorescence; increased fruit number; and higher HI. Biochemical analysis and metabolic profiling revealed that *Slpfld* fruit contained higher levels of soluble solids and similar or increased contents of sugars, amino acids, and organic acids relative to their WT counterparts. The results indicate that the chloroplast Fld approach constitutes a promising strategy to generate novel tomato lines displaying increased HI without affecting fruit metabolite contents.

# MATERIALS AND METHODS

### Generation of Transgenic Tomato Lines

The *pfld*- and *cfld*-harboring pCAMBIA2200 plasmids (Tognetti et al., 2006; see **Supplementary Figure S1A**) were used to direct expression of Fld from *Anabaena* PCC7119 in the chloroplasts or the cytosol, respectively, of tomato plants (*S. lycopersicum* cv Moneymaker) by standard *Agrobacterium*-mediated procedures (Tauberger et al., 2000). A total of 22 *Slpfld* and 10 *Slcfld* transformants were obtained exhibiting detectable levels of Fld in leaf extracts. Typical examples are shown in **Supplementary Figure S2**. Homozygous lines were selected by evaluating resistance to 100 μg ml<sup>−1</sup> kanamycin and by measuring Fld levels in the progeny of self-pollinated T2 transformants, using known amounts of purified recombinant Fld as reference (**Supplementary Figure S1B**). Leaf contents of the flavoprotein were analyzed by immunoblotting up to the T5 generation to ensure that the transgene was neither lost nor silenced during seed propagation.

Plants were germinated in soil and grown at 200 µmol photons m<sup>−2</sup> s<sup>−1</sup>, 25°C, 40%/90% humidity with a 16/8-h light/dark photoperiod (growth chamber conditions) in randomly distributed 3-L pots. Watering was carried out daily to field capacity until harvest at 120 days post-germination (dpg).

### Determination of Cell Size and Number

Discs (0.5 cm in diameter) were punched from the interveinal region of the third leaflet from the fourth fully expanded leaf of several independent plants at 30 dpg (**Figure 1A**) and fixed in 96% (v/v) ethanol, followed by incubation in 85% (w/v) lactic acid for clearing. Four pictures of different regions in each disc were used to calculate cell area and at least 100 cells were counted. Cell number was estimated using leaf and cell areas. Image analysis was performed with ImageJ (Rasband, 1997–2008, http://rsb.info.nih.gov/ij/).
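
The text states only that cell number was estimated from leaf and cell areas. One plausible reading, sketched below, is a simple division of total leaf area by the mean area of the measured cells. The function name, the units, and the single-cell-layer assumption are illustrative and are not taken from the original methods.

```python
from statistics import mean

UM2_PER_CM2 = 1e8  # 1 cm = 1e4 µm, so 1 cm^2 = 1e8 µm^2


def estimate_cell_number(leaf_area_cm2, cell_areas_um2):
    """Estimate cells per leaf as leaf area divided by mean cell area.

    Assumption (not from the source): cells tile one layer of the leaf
    surface, so the ratio of areas approximates the cell count.
    """
    if not cell_areas_um2:
        raise ValueError("at least one cell area is required")
    mean_cell_area = mean(cell_areas_um2)
    if leaf_area_cm2 < 0 or mean_cell_area <= 0:
        raise ValueError("areas must be positive")
    return leaf_area_cm2 * UM2_PER_CM2 / mean_cell_area


# Hypothetical values for illustration only, not data from the study.
cells = estimate_cell_number(10.0, [400.0, 500.0, 600.0])
print(f"estimated cell number: {cells:.3g}")
```

Under this reading, a smaller leaf with an unchanged estimated cell number would point to reduced cell size rather than reduced cell division as the cause of the smaller organ.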

Fruit samples were analyzed at the breaker stage. Four thin (0.5–1 mm) transverse sections of one fruit from the first truss in six different plants per line (24 sections per line) were hand-cut from the equatorial section of the pericarp with a razor blade. The tissue was fixed in 10% (v/v) formaldehyde, 5% (v/v) acetic acid, and 52% (v/v) ethanol, vacuum-infiltrated twice for 15 min, and incubated overnight. The fixation solution was subsequently replaced by ethanol for 2 h, and pericarp tissues stored in 70% (v/v) ethanol until processing. For cell size studies, samples were stained by incubation with 0.5% (w/v) toluidine blue in 0.1% (w/v) Na<sub>2</sub>CO<sub>3</sub> for 30 s. Samples were rinsed with water to prevent staining of the internal cell layers and mounted in 30% (v/v) glycerol. Stained sections were photographed using a camera attached to a dissecting microscope (Leica Microsystems, Switzerland). Cell layers were counted four times (technical replicates) in each fruit section from the exocarp to endocarp avoiding the vascular bundles, as described by Mu et al. (2017). Cell size was calculated using ImageJ (Rasband, 1997–2008, http://rsb.info.nih.gov/ij/).

**Abbreviations:** Chl, chlorophyll; PS, photosystem; dpg, days post-germination; Fd, ferredoxin; Fld, flavodoxin; HI, harvest index; LEF, linear electron flow; MDHAR, monodehydroascorbate reductase; <sup>1</sup>H-NMR, proton nuclear magnetic resonance; NPQt, non-photochemical quenching; PETC, photosynthetic electron transport chain; ROS, reactive oxygen species; SDS-PAGE, sodium dodecyl sulfate–polyacrylamide gel electrophoresis; TCA, tricarboxylic acid; TSP, 3-(trimethylsilyl) propionic-2,2,3,3-d4 acid; WT, wild-type.

of plants at 30 days post-germination. Bar = 10 cm. Arrowheads indicate the third leaflets used to extract tissue samples. (B) Representative micrographs from palisade parenchyma cells. Contour of typical cells are shown in yellow. Bar = 20 µm. (C) Leaf area was determined by image analysis. (D) Cell area and (E) cell number were calculated from clarified leaf tissue (see Materials and Methods). Data reported are means ± SEM of n biological replicates, as shown below each line. Asterisks indicate statistically significant differences (P < 0.05) determined using one-way ANOVA and Tukey's multiple comparison test.

### Phenotyping

For seed viability and germination analysis, seeds were cultured under growth chamber conditions on 0.8% (w/v) agar plates containing half-strength Murashige-Skoog basal salts (Sigma). Germination was recorded from the day the radicle broke through the seed coat. Groups of 30 seeds were used for each germination test, and the assay was repeated four times (independent experiments). Time to leaf emergence was determined for the emergence of the first and second node in 12 to 13 plants of each line germinated at the same time and grown in soil. Inflorescence and flower counting and tagging were performed every 2 days. Fruit ripening stages were determined by epicarp color change and by pressing it gently. They were classified as immature green (~10 days post-anthesis), mature green (~50 days post-anthesis), breaker, and ripe red, when they changed to a dark red color and soft texture. For biochemical and physiological measurements, fruit of equivalent developmental stages were employed, irrespective of their days from anthesis. To determine yield and HI, ripe red fruit were collected daily up to 120 dpg. At this stage, all remaining fruit were harvested irrespective of their ripening stage, and used for calculations. Dry weight was recorded after 2 weeks of incubation at 65°C. Vegetative weight (leaves and stems) was determined after fruit harvest, and HI was calculated as the ratio between total fruit yield and total above-ground biomass (fruit plus vegetative). Data shown were averaged from three independent experiments carried out during a 2-year period.
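
The HI definition given above (total fruit yield divided by total above-ground biomass, where biomass is fruit plus vegetative weight) can be written as a short calculation. This is a minimal sketch: the function and variable names are hypothetical, and using dry weights in grams for both terms is an assumption, since the text does not state which weight basis entered the ratio.

```python
from statistics import mean


def harvest_index(fruit_weight_g, vegetative_weight_g):
    """HI = fruit yield / (fruit yield + vegetative weight of leaves and stems)."""
    if fruit_weight_g < 0 or vegetative_weight_g < 0:
        raise ValueError("weights must be non-negative")
    total_biomass = fruit_weight_g + vegetative_weight_g
    if total_biomass == 0:
        raise ValueError("total above-ground biomass must be positive")
    return fruit_weight_g / total_biomass


def mean_harvest_index(plants):
    """Average HI over (fruit, vegetative) weight pairs, one pair per plant."""
    return mean(harvest_index(fruit, veg) for fruit, veg in plants)


# Hypothetical weights for illustration only, not data from the study.
example_plants = [(30.0, 70.0), (45.0, 55.0)]
print(f"mean HI: {mean_harvest_index(example_plants):.3f}")
```

Because the denominator includes the fruit itself, HI is bounded between 0 and 1, and it rises either when fruit yield increases or when vegetative weight falls.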

### Leaf Pigment Contents and Photosynthetic Measurements

Chlorophyll (*Chl*) and carotenoid levels were determined spectrophotometrically after extraction with 96% (v/v) ethanol (Lichtenthaler, 1987). Chlorophyll fluorescence measurements were performed using a MultispeQ-Beta device controlled by the PhotosynQ platform software (Kuhlgert et al., 2016). Measurements were performed on leaves from the fifth node in two fully expanded leaflets using six independent plants per line at 45 dpg.

### Metabolite Quantifications

For determination of fruit metabolites, two ripe red fruit were sampled from the first truss of at least three individual plants (biological replicates) from each genotype. Metabolite profiling was performed by proton nuclear magnetic resonance (<sup>1</sup>H-NMR) spectroscopy according to published procedures (Sorrequieta et al., 2013; López et al., 2015). Briefly, pericarp tissue of ripe red fruit was obtained by removing the epicarp, locule tissues, and seeds, immediately frozen in liquid nitrogen, and stored at −80°C until analysis of the primary metabolite composition by <sup>1</sup>H-NMR. Frozen samples were ground in liquid nitrogen using a Retsch MM400 mixer mill to a homogeneous, fine powder, which was rapidly dissolved in 0.3 ml of 1 M cold sodium phosphate buffer (pH 7.4) prepared in D<sub>2</sub>O to obtain a mixture containing about 30% by weight of D<sub>2</sub>O. The mixture was centrifuged at 20,000*g* for 15 min at 4°C and the supernatant filtered to remove any insoluble material. Internal standard [1 mM TSP: 3-(trimethylsilyl) propionic-2,2,3,3-d4 acid] was added to the resulting transparent soluble fraction, and the solution was subjected to spectral analysis at 600.13 MHz on a Bruker Avance II spectrometer. Proton spectra were acquired at 298 K by adding 512 transients of 32 K data points with a relaxation delay of 5 s. A 1D-NOESY pulse sequence was utilized to remove the water signal. The 90° flip angle pulse was always ~10 μs. Proton spectra were referenced to the TSP signal (δ = 0 ppm), and their intensities were scaled to that of TSP. Spectral assignment and identification of specific metabolites was established by fitting the reference <sup>1</sup>H-NMR spectra of several compounds using the software Mixtures, developed *ad hoc* as an alternative to commercial programs (Abriata, 2012). Further confirmation of the assignments for some metabolites was obtained by acquisition of new spectra after addition of authentic standards.
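
The scaling of signal intensities to the TSP internal standard can be illustrated with a small calculation. The sketch below is not the procedure implemented in the Mixtures software, which fits reference spectra. It shows only the generic idea of dividing peak integrals by the TSP integral, plus the textbook per-proton conversion to concentration, under the assumptions that the TSP singlet represents 9 equivalent protons and that TSP is present at 1 mM in the measured solution. All names and values are hypothetical.

```python
TSP_PROTONS = 9  # the trimethylsilyl singlet of TSP arises from 9 equivalent protons


def scale_to_tsp(integrals, tsp_key="TSP"):
    """Divide every peak integral by the TSP integral (so TSP becomes 1.0)."""
    tsp_integral = integrals[tsp_key]
    if tsp_integral <= 0:
        raise ValueError("TSP integral must be positive")
    return {name: value / tsp_integral for name, value in integrals.items()}


def estimate_concentration_mM(scaled_integral, n_protons, tsp_mM=1.0):
    """Per-proton conversion of a TSP-scaled integral to a concentration.

    Assumption (not from the source): fully relaxed, well-resolved signals,
    so that integral per proton is proportional to molar concentration.
    """
    if n_protons <= 0:
        raise ValueError("n_protons must be positive")
    return tsp_mM * scaled_integral * TSP_PROTONS / n_protons


# Hypothetical integrals for illustration only, not data from the study.
scaled = scale_to_tsp({"TSP": 200.0, "peak_a": 100.0})
print(estimate_concentration_mM(scaled["peak_a"], n_protons=3))
```

Scaling to an internal standard makes intensities comparable across samples even when receiver gain or sample volume differs slightly between acquisitions.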

# Analytical Procedures

Fld levels in the various tissues were estimated by sodium dodecyl sulfate–polyacrylamide gel electrophoresis (SDS-PAGE) and immunoblotting (Tognetti et al., 2006). Total protein extracts were prepared by grinding 100 mg tissue powder in 200 µL protein extraction buffer [0.2 M Tris-HCl, pH 6.8, 3 M urea, 1% (v/v) glycerol, 8% (w/v) SDS, 0.5 mM dithiothreitol, 5% (v/v) β-mercaptoethanol]. The composition of this buffer allows a more efficient protein extraction (Steiner et al., 2016). Samples were vortexed, incubated for 20 min at 80°C and centrifuged at 13,000*g* for 15 min. Supernatants were subjected to SDS-PAGE on 15% polyacrylamide gels and transferred to nitrocellulose membranes. Gel loading was carried out on the basis of fresh weight (FW) to avoid major changes in protein patterns and levels among the various tissues (see below). Membranes were washed three times for 15 min each with 5% (w/v) skim milk in 0.01% (v/v) Tween phosphate-buffered saline (TPBS; 8 mM Na2HPO4, 2 mM KH2PO4 pH 7.4, 137 mM NaCl, 2.7 mM KCl) and incubated for 1 h with polyclonal antibodies raised in rabbits against *Anabaena* Fld (diluted 1:300 in TPBS). Following washing with TPBS (three times × 15 min), membranes were incubated with rabbit anti-IgG immunoglobulins conjugated to alkaline phosphatase (Bio-Rad), in a 1:3,000 ratio in TPBS. After washing with TPBS (three times × 15 min), membranes were finally incubated in phosphatase solution (100 mM Tris-HCl pH 9.5, 100 mM NaCl, 5 mM MgCl2) supplemented with 0.01% (w/v) 5-bromo-4-chloro-3-indolyl phosphate and 0.01% (w/v) nitroblue tetrazolium until color development.

Total protein concentrations were measured in cleared leaf and fruit extracts as described by Simonian (2002), using bovine serum albumin as standard. The concentration of purified recombinant Fld was determined from the absorption of bound flavin mononucleotide (ε454 = 8.8 mM−1 cm−1).
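The flavin-based determination follows the Beer-Lambert law, c = A / (ε · l). A minimal sketch is given below, using the extinction coefficient stated above; the 1 cm path length and the absorbance reading are assumptions chosen only for illustration.

```python
EPSILON_454_PER_MM_CM = 8.8  # extinction coefficient of bound FMN at 454 nm (mM^-1 cm^-1)


def fld_concentration_mm(absorbance_454, path_length_cm=1.0):
    """Return the Fld concentration (mM) from the absorbance at 454 nm."""
    return absorbance_454 / (EPSILON_454_PER_MM_CM * path_length_cm)


# Hypothetical reading of A454 = 0.44 in a 1 cm cuvette (assumed values)
conc_mm = fld_concentration_mm(0.44)
print(f"Fld concentration: {conc_mm * 1000:.0f} µM")
```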

Concentrations of total soluble solids were measured in duplicate as described by Zanor et al. (2009) using a portable MA871 Digital Brix refractometer in a random sample of six fruit per line. Results were expressed in Brix degrees.

# Statistical Analyses

Data were analyzed using one-way ANOVA and multiple range tests as specified in each experiment. Significant differences refer to statistical significance at *p* < 0.05.
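As an illustration of the one-way ANOVA step, the sketch below computes the F statistic from between-group and within-group sums of squares using only the Python standard library. The three groups are hypothetical values, not data from this study, and the p-value and the multiple comparison test would normally be obtained from a dedicated statistics package.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])

    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical measurements for three groups (illustration only)
group_a = [1.0, 2.0, 3.0]
group_b = [2.0, 3.0, 4.0]
group_c = [5.0, 6.0, 7.0]

f_stat, df_b, df_w = one_way_anova_f([group_a, group_b, group_c])
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

The F statistic is then compared with the F distribution for the stated degrees of freedom to decide whether the group means differ at *p* < 0.05.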

# RESULTS

# Expression of a Plastid-Targeted Fld Decreases Tomato Plant Size

To generate tomato plants expressing a plastid-targeted Fld (*Slpfld* lines), the coding region of the *Anabaena* PCC7119 *fld* gene was fused in-frame to the 3' end of a DNA sequence encoding the chloroplast transit peptide of pea Fd-NADP+ reductase (Tognetti et al., 2006; see *Materials and Methods*). The fused gene was cloned under the control of the constitutive cauliflower mosaic virus (CaMV) 35S promoter (**Supplementary Figure S1A**). A construct lacking the transit peptide sequence was also prepared to generate plants in which the expressed flavoprotein accumulated in the cytosol (*Slcfld* lines; **Supplementary Figure S1A**). The presence of Fld in foliar tissue was evaluated by SDS-PAGE and immunoblot analysis. Most of the flavoprotein was recovered as mature-sized peptides in *Slpfld* lines (**Supplementary Figure S1B**), indicating that it was imported by chloroplasts as already shown for Fld-expressing tobacco plants (Tognetti et al., 2006; Ceccoli et al., 2012). Homozygous lines were selected by segregation analysis and confirmed by proportional increases in leaf Fld contents. Lines *Slpfld*8-1, *Slpfld*60-4, and *Slcfld*10-5, belonging to the T5 generation and displaying high levels of Fld in chloroplasts or the cytosol, were used for phenotypic characterization.

Fld expression in *Slpfld* and *Slcfld* lines did not affect seed viability, germination rates, or time to leaf setting (**Supplementary Figure S3**), indicating that there was no retardation of vegetative development in the transformants. However, *Slpfld* plants exhibited smaller leaves and leaflets and shorter rachis compared to WT and *Slcfld* siblings (**Figures 1A, C**; **Supplementary Figure S4**; **Supplementary Table S1**), in agreement with the leaf phenotypes displayed by chloroplast Fld-expressing creeping bentgrass (Li et al., 2017) and *Arabidopsis* (Su et al., 2018). The number of leaflets per compound leaf and their overall architecture were instead unchanged (**Figure 1A**). Decreases in internodal distances of ~30% accounted for stem shortening in *Slpfld* plants relative to WT counterparts (**Supplementary Table S1**). Overall size reduction was accompanied by significant decreases in FW and dry weight of the aerial parts (**Supplementary Table S1**). Stem diameter and relative water contents were not affected by the presence of the flavoprotein (**Supplementary Table S1**).

Leaf area reduction in *Slpfld* plants resulted from a decrease in mesophyll cell size, without significant changes in cell number (**Figures 1B, D, E**). Epidermal cells were also significantly smaller (**Supplementary Figure S5**). As in tobacco (Tognetti et al., 2006; Ceccoli et al., 2012; Mayta et al., 2018), *Slpfld* plants contained higher pigment contents per leaf area (**Figure 2**). *Chl a* and *Chl b* levels were 17% and 39% higher than those determined in WT and *Slcfld* leaves (**Figures 2A–C**). Marginal increases in carotenoids were also observed in a number of experiments, albeit without statistical significance (**Figure 2D**).

Higher *Chl* contents in *Slpfld* plants were reflected at the level of photosynthetic activities. The quantum yield of photosystem (PS) II (ФPSII) per leaf cross-section, which provides an estimation of the electron flow through PSII (Baker, 2008), and the rate of linear electron flow (LEF), were significantly enhanced in *Slpfld* plants at 45 dpg compared to the WT and *Slcfld* genotypes (**Figures 3A, B**). Other relevant photosynthetic parameters such as the coefficient of photochemical quenching *qP* and the fraction of open PSII reaction centers *qL* were also higher in *Slpfld* transformants (**Figures 3C, D**), whereas the magnitude of non-photochemical quenching (NPQt), which reflects the ability of the photosynthetic electron transport chain (PETC) to dissipate light energy into various processes (Baker, 2008), did not vary significantly among genotypes (**Figure 3E**). Similarly, the *Fv′*/*Fm′* parameter, which is regarded as a measure of PSII integrity, was not affected by Fld expression in chloroplasts or cytosol (**Figure 3F**).

### Plants Expressing Chloroplast-Located Fld Produce a Higher Number of Smaller Fruit

Flowering time was retarded in plants expressing a chloroplast Fld relative to WT and *Slcfld* siblings, as indicated by a delay of 10 to 12 days to the blossoming of the first flower (**Supplementary Figures S6A, B**). As a result, *Slpfld* plants had normally developed two more leaves on average at the time of flower setting (**Supplementary Figure S6C**). The number of tomato inflorescences and their architecture depend on the cultivar and are dramatically affected by environmental factors (Gur et al., 2010). Tomato trusses typically bear five to six flowers organized in a zigzag branch (Lippman et al., 2008). While the total inflorescences produced by *Slpfld* plants did not differ from the other two lines, they developed ~50% more flowers per truss (**Figure 4A**; **Supplementary Table S2**). Delay in flowering might cause additional branching of the inflorescence, as previously reported (Lippman et al., 2008; Park et al., 2012), but the mechanism by which chloroplast-targeted Fld exerts this effect is at present unknown.

E), and PSII integrity (Fv′/Fm′, F) were carried out in the third leaflet of the fourth fully expanded leaf of plants at 45 days post-germination. They correspond to the means ± SEM of six biological samples with six technical replicates each. Statistical differences between wild-type and transgenic lines are indicated by asterisks and were determined using one-way ANOVA and Tukey's multiple comparison test (P < 0.05).

The presence of extra flowers resulted in more prolific fruit production per plant (**Figure 4B**; **Supplementary Table S2**). Interestingly, the delay in flowering was partially compensated by accelerated ripening in *Slpfld* plants, with both color break and fruit ripening occurring 6 to 7 days (counting from anthesis) earlier than WT and *Slcfld* siblings (**Supplementary Table S2**). To estimate yield and HI, fruits were collected as they ripened up to 120 dpg. At this time, all tomatoes remaining in the plant were harvested irrespective of their maturation stage.

Fld activity in chloroplasts depends on interaction with endogenous redox partners, most conspicuously the PETC (Pierella Karlusich et al., 2014). The redox chemistry of chloroplasts closely resembles that of the phototrophic microorganisms in which the flavoprotein is normally found, but Fld expression levels and possible interaction(s) in non-photosynthetic plastids such as those present in red fruit remain unknown. It should be borne in mind, however, that fruits stay green during a large part of their development, before the conversion of chloroplasts into chromoplasts (Marano and Carrillo, 1991; Marano et al., 1993; Muñoz and Munné-Bosch, 2018).

each plant. (B) Total fruit numbers per plant included fully ripe, breaker, and green fruit collected up to 120 days post-germination. Data are means ± SEM of n biological replicates, with n indicated below each line. Statistically significant differences are shown by asterisks and were determined using one-way ANOVA and Tukey's multiple comparison test (P < 0.05).

Immunoblot analyses were used to estimate Fld accumulation in fruit tissues. Since ripening proceeded at a different pace in the various genotypes (**Supplementary Table S2**), fruit at equivalent ripening stages (different days from anthesis) were used by employing the classification described in *Materials and Methods*. Levels of total soluble protein were lower in immature green fruits compared to leaves, and declined further in all lines as ripening progressed (**Supplementary Figures S7**  and **S8**). Gels for Fld detection were therefore loaded on the basis of FW, considering that unlike protein levels, the fraction of dry matter did not change significantly between tomato ripening stages (Matsuda and Kubota, 2010; Radzevičius et al., 2016). **Figure 5A** shows that mature-sized Fld was expressed in the green pericarp of immature *Slpfld* and *Slcfld* fruit. At more advanced ripening stages, chloroplast Fld levels declined together with total protein, to become barely detectable in ripe red fruit (**Figure 5A**; **Supplementary Figure S8B**). Contents of the cytosol-targeted flavoprotein were instead maintained up to the mature green stage (**Supplementary Figure S8B**), resulting in a relative Fld enrichment within total soluble proteins (**Supplementary Figure S8C**). Full ripening led to down-regulation of cytosolic Fld levels to those of *Slpfld* plants (**Supplementary Figure S8**).

Fld is therefore most likely functional in the green developmental stages, and fruit of *Slpfld* plants were smaller on average than those of the WT (**Figures 5B, D**), resembling the leaf phenotype. As in leaves, this effect was caused by a reduction in cell size (**Figures 5C, E**), while the number of cell layers in a transverse section of the pericarp was similar between WT and *Slpfld* fruit (**Figure 5F**), resulting in transformants with a thinner pericarp (**Figure 5B**).

The compromise between higher fruit number and smaller fruit size and weight resolved in moderately increased fruit yields in *Slpfld* plants relative to those of WT siblings, but the small differences failed to show statistical significance (**Figure 6A**). In turn, the combination of similar total fruit weight per plant with lower vegetative biomass (**Supplementary Table S1**) led to a ~30% increase in the HI of *Slpfld* lines compared to WT and *Slcfld* genotypes (**Figure 6B**).

# Metabolite Profiling of Ripe Red Tomato Fruit

While chloroplast Fld increased HI and fruit production, this improvement could be detrimental to fruit quality and nutrient contents. Determination of soluble solids measured in Brix degrees provides a fast and reliable indicator of fruit quality. As shown in **Figure 7A**, *Slpfld* plants displayed a moderate but statistically significant increase in pericarp soluble solids, indicating that reduction of photosynthetically active tissue in these lines was not translated into lower sugar accumulation in the sink organ. Indeed, fruit dry weight and relative water content were similar in all lines (**Supplementary Table S2**).

Quantitative metabolite profiling was performed in ripe red fruit of all lines to further assess the effects of Fld expression on nutrient contents of the marketable product (**Figure 7B**). With the conspicuous exception of sucrose, which declined in *Slpfld* fruit relative to WT siblings, soluble sugars displayed similar (galactose, xylose, mannose) or increased (glucose, fructose) levels in both lines expressing chloroplast Fld (**Supplementary Table S3**). Intermediates of central metabolism such as pyruvate, citrate, succinate, malate, and γ-aminobutyrate also exhibited higher contents in *Slpfld* fruit, whereas fumarate and α-ketoglutarate levels did not show significant differences among genotypes (**Supplementary Table S3**). Similarly, accumulation of the 11 proteinogenic amino acids measured was not affected by Fld presence (**Figure 7B**; **Supplementary Table S3**). In line with the moderate differences in Fld accumulation (**Figure 5A**; **Supplementary Figure S1B**), the two *Slpfld* lines showed a similar trend, without statistically significant differences, for most metabolites, whereas fruit from *Slcfld* plants, in which Fld accumulated in the cytosol, displayed metabolic profiles that more closely resembled those of their WT counterparts (**Figure 7B**; **Supplementary Table S3**). The only conspicuous exception was *trans*-cinnamic acid, whose contents showed a major increase only in *Slpfld*8-1 plants, but remained at WT levels in *Slpfld*60-4 siblings (**Supplementary Table S3**).

FIGURE 5 | Plastid-located flavodoxin (Fld) affected tomato fruit development. (A) Fld expression at different fruit stages: immature green (IM), mature green (MG), and ripe red (RR). Cleared extracts corresponding to 5 mg FW were loaded in each lane, resolved by 15% sodium dodecyl sulfate–polyacrylamide gel electrophoresis and analyzed by immunoblot using Fld antisera, as described in Materials and Methods. The two membranes were assayed together and overreacted to reveal unspecific staining. Purified Fld (0.8 pmol) is shown in the extreme right. MW: molecular weight marker. (B) Phenotypes of representative fruits from each line. Bar = 5 cm. Insets show pericarps delineated by arrows. (C) Pericarp sections from representative breaker fruits stained with toluidine blue. The inner epidermis is denoted as "en" and the outer epidermis as "ex." Typical cells are contoured in yellow to illustrate size differences. Bar = 1 mm. (D) Average fruit weight, n indicates the number of fruit assayed. Cell size (E) and the number of cell layers of the pericarp (F) were calculated from six biological replicates using sections as those depicted in panel (C). Data presented correspond to means ± SEM. Statistically significant differences are indicated by asterisks and were determined using one-way ANOVA and Tukey's multiple comparison test (P < 0.05).

The collected results indicate that the increased HI of plants expressing a chloroplast-located Fld was accompanied by similar or higher contents of sugars and other metabolites in ripe fruit, underscoring the potential value of the introduced trait.

# DISCUSSION

Tomato domestication has been conducted over the centuries to select varieties with increased fruit weight and number. These traits are genetically controlled through a few *loci* associated with carpel anatomy and cell proliferation (Chakrabarti et al., 2013), but phytohormones and environmental and/or metabolic conditions, most conspicuously photosynthetic activity in source tissues, also affect fruit development (Ariizumi et al., 2013). Searching for traits that can reduce plant size while increasing HI is a most relevant objective and accordingly, limited vegetative growth is generally a desirable trait in crops (Gur et al., 2010).

Expression of a cyanobacterial Fld targeted to chloroplasts of various species led to significant decreases in vegetative growth, but the effects of this intervention on the development of reproductive organs were not reported (Li et al., 2017; Su et al., 2018). We addressed herein this question by expressing a plastid-targeted Fld in a commercial variety of tomato, and found a similar reduction in plant size (**Figure 1**; **Supplementary Figures S3** and **S4**; **Supplementary Table S1**). Decreased leaf area correlated with lower cell size (**Figure 1**), and was accompanied by higher *Chl* contents and photosynthetic activities per leaf cross-section (**Figures 2** and **3**). Fld was expressed in green fruit but its levels decreased with the progress of fruit ripening (**Figure 5A**; **Supplementary Figure S8**). Since expression of the flavoprotein was driven by a constitutive promoter (**Supplementary Figure S1A**), down-regulation of Fld contents with fruit ripening presumably involved post-transcriptional and/or post-translational mechanisms, including changes in the rates of protein synthesis and/or degradation (responsible for total protein decrease) and in the case of plastid-targeted Fld, alterations of import capacity during the transition leading to chromoplast formation (Sadali et al., 2019).

Chloroplast Fld exerted opposite effects on flowering time and fruit ripening (**Supplementary Figure S6**; **Supplementary Table S2**). Flowering time and the subsequent processes of flower patterning and fruit development are regulated by a different suite of genes, although common players do exist (Zhu and Helliwell, 2011). Association of flowering time to redox poise has been reported (Shim and Imazumi, 2015), and it is tempting to speculate that chloroplast Fld could affect this balance through its electron shuttling activity, but further research will be required to address this issue.

Plants accumulating Fld in plastids produced a higher number of smaller fruit, leading to increased HI (**Figures 5** and **6**; **Supplementary Table S2**) without detrimental effects on metabolite contents (**Figure 7**; **Supplementary Table S3**). In fact, ripe red fruit of *Slpfld* lines contained increased levels of soluble sugars (glucose and fructose) and organic acids (mainly citrate and malate). The higher hexose contents could indicate an increase of invertase activity in ripe red fruit, together with an incomplete or non-cyclic operation of the tricarboxylic acid (TCA) cycle, as reflected by the higher citrate and malate contents. In the partial TCA cycle, one branch produces citrate while the other synthesizes malate (Igamberdiev and Eprintsev, 2016). The metabolite composition of fruit influences its flavor, with the sugar/acid ratio being an important determinant of taste (Zanor et al., 2009). The metabolite profile of *Slpfld* fruit therefore suggests a tastier tomato.

It is remarkable that fruit yield was maintained and sugar contents increased in plants expressing chloroplast Fld despite significant mass decreases in source tissue. Higher photosynthetic activity per leaf area (**Figure 3**) might contribute to this phenotype by providing higher source strength. While the possibility that the plastid-targeted flavoprotein affected source-sink assimilate partitioning cannot be ruled out, genes differentially expressed by chloroplast Fld in tobacco failed to reveal any obvious trait associated to this system (Bermúdez et al., 2008; Bermúdez et al., 2014; Pierella Karlusich et al., 2017).

Modification of HI as a key agronomical goal has been accomplished through various strategies. Most attempts have relied on crosses with wild relatives to select favorable alleles and on engineering hormone accumulation and signaling. Many introgression lines of *Solanum pennellii* did show high HI (Schauer et al., 2006), usually accompanied by decreased overall yields (Gur et al., 2010). On the other hand, brassinosteroid metabolism has been targeted to modify several traits in tomato including HI. Overexpression of the brassinosteroid receptor *SlBRI1* resulted in increased plant size and leaf area, with little or no effect on fruit yield (Nie et al., 2017). As a consequence, the overall HI was decreased in those plants, partly compensated by accelerated fruit ripening and improved quality. *DWARF*, the key brassinosteroid biosynthetic gene in tomato, has also been overexpressed, leading to significant increases in plant height and biomass, and lower expansion diameter (Li et al., 2016). The major decline in HI was partially compensated by early flowering and accelerated fruit ripening, and the authors predicted higher yield per planted surface due to the compact architecture of *DWARF*-expressing plants (Li et al., 2016).

In a different approach, improvement of HI was obtained by silencing the expression of a chloroplast DnaJ chaperone involved in assimilate partitioning into fruit (Bermúdez et al., 2014). The increase in HI was gained through higher ripe fruit weight per plant without modification of the aerial biomass (Bermúdez et al., 2014). Under the conditions employed in that trial, the HI of WT control plants was very low (~0.1), compared to 0.48 in our assay (**Figure 6B**). Despite the more stringent background, the HI of *Slpfld* plants did increase to ~0.63, higher than the values reported for DnaJ-silenced plants, which were close to 0.5 (Bermúdez et al., 2014). Moreover, calculations based on the horizontal expansion diameters of the *Slpfld* plants (**Supplementary Figure S3B**), as done by Li et al. (2016), suggest that major improvements in absolute fruit yield per planted surface could be gained by increasing plant density per square meter in the field.
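The HI values quoted above (0.48 for WT and ~0.63 for *Slpfld* plants) can be related to the ~30% increase reported in the Results with a short calculation. The sketch below assumes the common definition of HI as fruit mass divided by total aerial biomass (fruit plus vegetative parts); the definition used in this study is given in Materials and Methods and may differ in detail. Function names and the example masses are hypothetical.

```python
def harvest_index(fruit_mass, vegetative_mass):
    """Harvest index, assumed here to be fruit mass over total aerial biomass."""
    return fruit_mass / (fruit_mass + vegetative_mass)


def relative_change_percent(reference, value):
    """Percent change of `value` relative to `reference`."""
    return 100.0 * (value - reference) / reference


# HI values quoted in the text for WT and Slpfld plants
hi_wt, hi_slpfld = 0.48, 0.63
print(f"Relative HI increase: {relative_change_percent(hi_wt, hi_slpfld):.1f}%")

# Hypothetical masses (arbitrary units) that would give the WT index
print(f"HI for 48 units of fruit and 52 of vegetative mass: {harvest_index(48.0, 52.0):.2f}")
```

Under this definition, an unchanged fruit mass combined with a lower vegetative biomass is sufficient to raise HI, which matches the interpretation given for the *Slpfld* lines.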

Tomato is not the only reported case of leaf size decrease upon expression of a plastid-located Fld. As indicated, creeping bentgrass (Li et al., 2017) and *Arabidopsis* (Su et al., 2018) displayed a similar phenotype. Interestingly, development of vegetative tissues in Fld-expressing tobacco (Ceccoli et al., 2012) and *Medicago truncatula* (Coba de la Peña et al., 2010) lines did not differ significantly from that of their WT siblings, suggesting that the effect of the flavoprotein on growth might have some degree of species specificity. Cell size, however, was actually reduced in leaves of *pfld* tobacco plants (Mayta et al., 2018), prompting a more detailed study of the developmental features displayed by these plants in the absence of stress.

The mechanisms by which chloroplast-targeted Fld can modulate organ development are presently unknown. Su et al. (2018) proposed that size reduction might reflect the lower efficiency of Fld, compared to Fd, as electron carrier during photosynthesis (Nogués et al., 2004). Alternatively, Fld could modulate redox processes and down-regulate accumulation of reactive oxygen species (ROS), as observed in Fld-expressing plants exposed to adverse environmental situations (Tognetti et al., 2006, Tognetti et al., 2007; Zurbriggen et al., 2008; Zurbriggen et al., 2009; Li et al., 2017; Rossi et al., 2017). Oxidative bursts have been detected during both leaf and fruit transitions (Muñoz and Munné-Bosch, 2018), and proposed to provide signaling cues required for these developmental programs. The cellular origin(s) of the observed ROS build-up are still unclear but they most likely involve chloroplasts and mitochondria (Muñoz and Munné-Bosch, 2018).

Ascorbate is a canonical plant antioxidant, and manipulation of its metabolism has been used to modify yield and HI in cherry tomatoes. The levels of ascorbate oxidase, which oxidizes ascorbate to monodehydroascorbate, and of monodehydroascorbate reductase (MDHAR), which participates in ascorbate regeneration, were reduced using RNAi techniques (Garchery et al., 2013; Truffault et al., 2016). Knocked-down ascorbate oxidase plants showed improved yield (Garchery et al., 2013), while the opposite effect was observed in siblings with impaired MDHAR activity (Truffault et al., 2016). The results revealed a strong correlation between antioxidant levels (in this case, ascorbate) and yield. Fruit-specific decreases of ascorbate oxidase and MDHAR activities obtained by expressing the RNAi sequences under control of a fruit promoter had no consequences in yield, indicating that the antioxidant capacity of leaves was the key factor determining the yield phenotype (Truffault et al., 2018). Moreover, application of this same strategy to Moneymaker tomatoes failed to show any fruit change (Truffault et al., 2018), underscoring the importance of the genotype in the determination of yield and HI. Cherry tomatoes produce small fruit and exhibit low HI values compared to Moneymaker, indicating that source-sink relationships and metabolite allocation must be necessarily different in the two cultivars.

MDHAR isoforms are distributed in various cellular compartments, whereas ascorbate oxidase is an apoplastic enzyme. Thus, to the best of our knowledge, this is the first report in which genetic manipulations of a chloroplast redox shuttle modify growth of a sink tissue and HI. The connection between plastid function and organ development has been recognized only recently (Andriankaja et al., 2012; Van Dingenen et al., 2016). Chloroplasts communicate information to nuclei as a response to environmental and developmental stimuli, a process known as retrograde signaling (Hernández-Verdeja and Strand, 2018). Several potential operating signals originating from these organelles have been proposed, including ROS and the redox status of the PETC and the chloroplast stroma (Exposito-Rodriguez et al., 2017), all of which can be affected by Fld presence (Pierella Karlusich et al., 2014; Rossi et al., 2017; Mayta et al., 2018). Organ size, on the other hand, has been related to various cellular processes including endoreduplication (Kawade and Sukaya, 2017), phytochromes (Husaineid et al., 2007), and modulation *via* proteasomal activity (Sonoda et al., 2009; Nguyen et al., 2013). Notably, the ubiquitin-proteasome system has been reported to modulate plastid-nuclear bidirectional communication (Hirosawa et al., 2017), and most proteasomal components were up-regulated by chloroplast Fld presence in tobacco plants grown under normal conditions (Pierella Karlusich et al., 2017), suggesting that the effects of the flavoprotein could be mediated by selective protein degradation. Our working hypothesis is therefore that, by productively interacting with the PETC and other oxido-reductive pathways of the chloroplast, Fld affects retrograde signaling involved in organ development, presumably mediated by proteasomal function, ploidy, receptors, etc. Research is currently underway to address these possibilities.

# DATA AVAILABILITY STATEMENT

The datasets generated for this study are available on request to the corresponding author.

# AUTHOR CONTRIBUTIONS

MM, MZ, EV, M-RH, MIZ, and NC conceived the original research plans. MM, RA, MZ, EV, M-RH, MIZ, and NC designed the experiments. MM, RA, MZ, EV, M-RH, and MIZ performed the experiments. MM, RA, MZ, EV, M-RH, MIZ, and NC analyzed the data. MM, RA, MZ, EV, M-RH, MIZ, and NC wrote the manuscript.

# FUNDING

This work was supported by grants PICT-2015-3828 and PICT-2017-1301 from the National Agency for the Promotion of Science and Technology (ANPCyT, Argentina). MM and RA are postdoctoral and doctoral Fellows, respectively, from the National Research Council (CONICET, Argentina). EV, MIZ, and NC are Staff Researchers from CONICET. MM, EV, MIZ, and NC are Faculty members of the School of Biochemical and Pharmaceutical Sciences, University of Rosario (Facultad de Ciencias Bioquímicas y Farmacéuticas, Universidad Nacional de Rosario, Argentina). MZ is a Faculty member of the Düsseldorf University.

# ACKNOWLEDGMENTS

We wish to thank Diego Aguirre for excellent technical assistance and Mercedes Sáenz for her valuable help during plant harvest.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2019.01432/full#supplementary-material

### REFERENCES

Baker, N. R. (2008). Chlorophyll fluorescence: a probe of photosynthesis *in vivo*. *Annu. Rev. Plant Biol.* 59, 89–113. doi: 10.1146/annurev.arplant.59.032607.092759


and yield and MDHAR activity is correlated with sugar levels under high light. *Plant Cell Environ.* 39, 1279–1292. doi: 10.1111/pce.12663


**Conflict of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Mayta, Arce, Zurbriggen, Valle, Hajirezaei, Zanor and Carrillo. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Combating Micronutrient Deficiency and Enhancing Food Functional Quality Through Selenium Fortification of Select Lettuce Genotypes Grown in a Closed Soilless System

### Edited by:

Eugenio Benvenuto, Energy and Sustainable Economic Development (ENEA), Italy

### Reviewed by:

Michela Schiavon, University of Padova, Italy Sylwester Smolen, University of Agriculture in Krakow, Poland Andre Reis, São Paulo State University, Brazil

> \*Correspondence: Youssef Rouphael youssef.rouphael@unina.it

†These authors have contributed equally to this work

### Specialty section:

This article was submitted to Crop and Product Physiology, a section of the journal Frontiers in Plant Science

Received: 20 June 2019 Accepted: 28 October 2019 Published: 20 November 2019

### Citation:

Pannico A, El-Nakhel C, Kyriacou MC, Giordano M, Stazi SR, De Pascale S and Rouphael Y (2019) Combating Micronutrient Deficiency and Enhancing Food Functional Quality Through Selenium Fortification of Select Lettuce Genotypes Grown in a Closed Soilless System. Front. Plant Sci. 10:1495. doi: 10.3389/fpls.2019.01495

*Antonio Pannico1†, Christophe El-Nakhel1†, Marios C. Kyriacou2, Maria Giordano1, Silvia Rita Stazi3, Stefania De Pascale1 and Youssef Rouphael1\**

1 Department of Agricultural Sciences, University of Naples Federico II, Portici, Italy, 2 Department of Vegetable Crops, Agricultural Research Institute, Nicosia, Cyprus, 3 Department of Chemical and Pharmaceutical Sciences (DSCF), University of Ferrara, Ferrara, Italy

Selenium (Se) is an essential trace element for human nutrition and a key component of selenoproteins having fundamental biological and nutraceutical functions. We examined lettuce biofortification with Se in an open-gas-exchange growth chamber using closed soilless cultivation for delivering Se-rich food. Morphometric traits, minerals, phenolic acids, and carotenoids of two differently pigmented Salanova cultivars were evaluated in response to six Se concentrations (0–40 μM) delivered as sodium selenate in the nutrient solution. All treatments reduced green lettuce fresh yield slightly (9%), while a decrease in red lettuce was observed only at 32 and 40 μM Se (11 and 21%, respectively). Leaf Se content increased in both cultivars, with the red accumulating 57% more Se than the green. At 16 μM Se all detected phenolic acids increased; moreover, a substantial increase in anthocyanins (184%) was recorded in red Salanova. Selenium applications slightly reduced the carotenoid content of green Salanova, whereas in red Salanova treated with 32 μM Se, violaxanthin + neoxanthin, lutein and β-cryptoxanthin increased by 38.6, 27.4, and 23.1%, respectively. Lettuce constitutes an ideal target crop for selenium biofortification and closed soilless cultivation comprises an effective tool for producing Se-enriched foods of high nutraceutical value.

Keywords: anthocyanins, carotenoids profile, hydroponics, Lactuca sativa L., mineral composition, nutrient solution management, phenolic acids, sodium selenate

# INTRODUCTION

Selenium (Se) is considered a non-essential mineral nutrient for higher plants (Sors et al., 2005; Pilon-Smits and Quinn, 2010; Malagoli et al., 2015); nevertheless, several studies demonstrate the effectiveness of Se at low concentrations in improving photo-oxidative stress tolerance, delaying senescence and stimulating plant yield (Hartikainen, 2005; Lyons et al., 2009). The anti-oxidative function of Se is related to the increased activity of antioxidant enzymes including lipoxygenase, superoxide dismutase, catalase, ascorbate peroxidase, and glutathione peroxidase with the consequent decrease of lipid peroxidation, as well as to the enhanced synthesis of antioxidant molecules such as phenols, carotenoids, flavonoids, and anthocyanins in Se-treated plants (Djanaguiraman et al., 2005; Hawrylak-Nowak, 2008; Ramos et al., 2010; Ardebili et al., 2015).

While Se is considered merely beneficial to plants (Pilon-Smits et al., 2009; Vatansever et al., 2017; Chauhan et al., 2019), it is deemed essential for animal and human nutrition as it constitutes the key component of selenoenzymes and selenoproteins with fundamental biological functions (Rayman, 2002). Low dietary intake of Se has been associated with serious human illnesses, such as cardiovascular diseases, viral infections and certain types of cancer (Rayman, 2000; Combs, 2001; Finley, 2005). Selenium deficiency has been estimated to affect up to one billion people worldwide (Jones et al., 2017). Most serious consequences have been reported in China, the UK, Eastern Europe, Africa, and Australia (Chen et al., 2002; Lyons et al., 2004), in areas with arable soils of low Se bioavailability that inevitably limits Se entry into the food supply chain.

The Recommended Dietary Allowance (RDA) of Se for adult men and women is 55 μg day−1 (Johnson et al., 2003); however, Burk et al. (2006) found that Se supplementation of 200 μg day−1 reduces the risk of prostate, lung and colon cancer. Plants constitute a potentially significant source of this element for human diet through biofortification. Biofortification is the process that increases the bioavailable content of targeted elements in edible plant parts through agricultural intervention or genetic selection (White and Broadley, 2005). In this perspective, recent works have demonstrated that Se fertilization increases the content of this element in a wide range of crops including rice (Chen et al., 2002), wheat (Lyons et al., 2004), radish (Pedrero et al., 2006; Schiavon et al., 2016), spinach (Ferrarese et al., 2012), potato (Turakainen et al., 2004), bean (Hermosillo-Cereceres et al., 2011), soybean (Yang et al., 2003), pea (Jerše et al., 2018), tomato (Schiavon et al., 2013), rocket (Dall'Acqua et al., 2019), lamb's lettuce (Hawrylak-Nowak et al., 2018), and lettuce (Businelli et al., 2015; Esringu et al., 2015; Smolen et al., 2016a; Silva et al., 2017; Silva et al., 2018a). Se fertilization is a relatively low-cost approach to the prophylaxis of consumers against nutrient deficiency. Several countries, such as Finland, Malawi, Australia, and New Zealand, have supported this strategy through biofortification programs that were shown to boost Se content in human tissue and body fluids of the population (Arthur, 2003; Eurola et al., 2004; Chilimba et al., 2012); in Brazil, studies were performed on upland rice (Reis et al., 2018), rice (Andrade et al., 2018) and cowpea (Silva et al., 2018b; Silva et al., 2019).

Roots of higher plants take up Se mainly as selenate and selenite. Selenate is transported across the plasma membrane of root cells, using the assimilation pathways of sulfate *via* the enzyme sulfate permease (Terry et al., 2000; Hawkesford and Zhao, 2007), while selenite is transported *via* phosphate transporters (Li et al., 2008). The selectivity of these transporters is species-dependent and affected by soil sulfate concentration, salinity, pH and redox potential (Combs, 2001; White et al., 2004); moreover, the different types of sulfate transporters (SULTR1;1, SULTR1;2, SULTR2;1) may have different selectivity for selenium and sulfur (Dall'Acqua et al., 2019). Nevertheless, selenate is more soluble, less phytotoxic and more easily transported and accumulated in crops compared to selenite (Lyons et al., 2005; Smrkolj et al., 2005; Hawrylak-Nowak, 2013).

Regarding the bioactive value of Se, several studies have demonstrated its role in plant secondary metabolism by increasing tocopherol, flavonoids, phenolic compounds, ascorbic acid and vitamin A (Hartikainen et al., 2000; Xu et al., 2003; Ríos et al., 2008; Businelli et al., 2015), noting that plant secondary metabolites are health-promoting phytochemicals that prevent a range of human diseases and are also used as medicinal active ingredients (El-Nakhel et al., 2019). However, at high concentrations Se is phytotoxic, inhibiting growth and modifying the nutritional characteristics of plants (Hartikainen et al., 2000). Selenium phytotoxicity is attributable to non-specific incorporation of selenocysteine (SeCys) and selenomethionine (SeMet), which replace their sulphur analogues in plant proteins (Ellis and Salt, 2003).

Vegetables are widely used in biofortification studies, including lettuce (*Lactuca sativa* L.), which is the most produced and consumed leafy vegetable in the world (Baslam et al., 2013; Hawrylak-Nowak, 2013). It has attained a central role in human nutrition as it combines palatable organoleptic properties with a rich content of nutraceutical compounds (phenolic acids, carotenoids, flavonoids, and vitamins B9, C, and E) and a low content of dietary fats, which makes lettuce an attractive low-calorie food (Kim et al., 2016). Moreover, since lettuce is generally eaten raw, more nutrients are retained compared to cooked foods, including Se, which has been shown to diminish in concentration after food processing such as boiling, baking or grilling (Dumont et al., 2006; Sager, 2006). Being also one of the most easily cultivated vegetables both in soil and in hydroponic systems, lettuce can therefore be considered a promising candidate for Se biofortification.

Several biofortification techniques have been proposed, such as soil/substrate dosing with Se, foliar spray with Se solution and hydroponic cultivation with Se-enriched nutrient solution (Smrkolj et al., 2007; Puccinelli et al., 2017; Wiesner-Reinhold et al., 2017). The choice of technique should consider, among other aspects, the possible run-off of Se fertilizers resulting in Se accumulation in groundwater. In this respect, hydroponic cultivation, especially in closed-loop systems, has several advantages: (i) environmental spread of Se is minimized, (ii) Se uptake is higher than with other methods, as the constant exposure of the roots to the fortified nutrient solution and the absence of micronutrient-soil interactions maximize uptake efficiency and accumulation in edible plant parts, (iii) product quality is standardized through precise management of the concentration and composition of the nutrient solution, (iv) very small amounts of selenium are needed, and no modification of the conventional closed soilless cultivation technique is required, thus ensuring no additional cost (Puccinelli et al., 2017; Wiesner-Reinhold et al., 2017; Rouphael and Kyriacou, 2018).

Taking into account these considerations, the effects of sodium selenate application were evaluated in the present work at six different doses on two lettuce cultivars of different pigmentation (green and red) cultivated in a closed soilless system. The aim of this study was to identify the appropriate Se concentration in the nutrient solution in order to maximize the accumulation of selenium and enhance the nutraceutical characteristics (lipophilic and hydrophilic antioxidant molecules), thereby creating a dual enrichment of lettuce without causing a substantial loss of yield.

# MATERIALS AND METHODS

### Growth Chamber Conditions, Plant Material and Experimental Design

Two butterhead lettuce (*L. sativa* L. var. capitata) cultivars with different leaf pigmentation, green Salanova® "Descartes" and red Salanova® "Klee" (Rijk Zwaan, De Lier, The Netherlands), were cultivated in a 28 m2 open-gas-exchange growth chamber (7.0 m × 2.1 m × 4.0 m, width × height × depth) situated at the experimental station of the University of Naples Federico II, Italy.

The lighting of the growth chamber was provided by High Pressure Sodium lamps (Master SON-T PIA Plus 400W, Philips, Eindhoven, The Netherlands) with a photosynthetic photon flux density (PPFD) of 420 ± 10 µmol m−2 s−1, measured at leaf height using a spectral radiometer (MSC15, Gigahertz-Optik, Turkenfeld, Germany). Day/night temperatures of 24/18°C were established with a 12 h photoperiod and a relative air humidity of 60−80% respectively. The experiment was carried out at ambient CO2 concentration (390 ± 20 ppm), while air exchange and dehumidification were guaranteed by two HVAC systems. Plants were grown in nutrient film technique (NFT) established on rigid polyvinyl chloride (PVC) gullies (14.5 cm wide, 8 cm deep and 200 cm long), with a 1% slope. The gullies were 60 cm above ground level and each of them was fed by a separate 25 L plastic reservoir tank containing the nutrient solution (NS). Continuous recirculation (1.5 L min−1) of the NS was provided by a submerged pump (NJ3000, Newa, Loreggia, PD, Italy) in each reservoir tank. Twenty-day-old lettuce seedlings were transplanted in rockwool cubes (7 × 7 × 7 cm, Delta, Grodan, Roermond, The Netherlands) and transferred into the gullies with an intra-row and inter-row spacing of 15 and 43 cm respectively, corresponding to a density of 15.5 plants m−2. Each gully was covered with a PVC lid in order to avoid NS evaporation. The NS was a modified Hoagland formulation prepared with osmotic water containing: 8.0 mM N–NO3−, 1.5 mM S, 1.0 mM P, 3.0 mM K, 3.0 mM Ca, 1.0 mM Mg, 1.0 mM NH4+, 15 µM Fe, 9 µM Mn, 0.3 µM Cu, 1.6 µM Zn, 20 µM B, and 0.3 µM Mo, with electrical conductivity (EC) 1.4 dS m−1 and pH 6.0 ± 0.1.
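The stated planting density follows directly from the plant spacing. As a minimal check, assuming a rectangular grid in which each plant occupies intra-row × inter-row spacing (the function name is illustrative, not from the original work):

```python
def planting_density(intra_row_cm: float, inter_row_cm: float) -> float:
    """Plants per square metre for a rectangular planting grid."""
    # Area occupied by one plant, converted from cm to m.
    area_per_plant_m2 = (intra_row_cm / 100.0) * (inter_row_cm / 100.0)
    return 1.0 / area_per_plant_m2


# Spacing reported in the text: 15 cm within rows, 43 cm between rows.
density = planting_density(15, 43)
print(f"{density:.1f} plants per m2")
```

With the reported spacing this gives 1 / (0.15 m × 0.43 m), which rounds to the 15.5 plants m−2 quoted above.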

The experimental design was a randomized complete-block factorial design (6 × 2) with six selenium concentrations in the nutrient solution (0, 8, 16, 24, 32, or 40 μM as sodium selenate, from Sigma-Aldrich, St. Louis, MO, USA) and two lettuce cultivars (green or red butterhead Salanova), with three replicates. Each experimental plot consisted of six plants.
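The size of this design can be enumerated in a few lines. The sketch below is only illustrative: the text does not describe how plots were randomized, so the shuffling step, the seed and all names are assumptions.

```python
import itertools
import random

SE_LEVELS_UM = [0, 8, 16, 24, 32, 40]            # Se in the nutrient solution
CULTIVARS = ["green Salanova", "red Salanova"]
N_BLOCKS = 3                                      # three replicates
PLANTS_PER_PLOT = 6


def randomized_complete_block_layout(seed: int = 0) -> dict:
    """Return {block: [(cultivar, se_level), ...]} with one plot per treatment.

    Every block holds all 12 cultivar x Se combinations in a random order
    (hypothetical randomization; the seed is arbitrary).
    """
    rng = random.Random(seed)
    treatments = list(itertools.product(CULTIVARS, SE_LEVELS_UM))
    layout = {}
    for block in range(1, N_BLOCKS + 1):
        order = treatments[:]
        rng.shuffle(order)
        layout[block] = order
    return layout


layout = randomized_complete_block_layout()
n_plots = sum(len(plots) for plots in layout.values())
print(n_plots, "plots,", n_plots * PLANTS_PER_PLOT, "plants")
```

Under these assumptions the design comprises 6 × 2 × 3 = 36 experimental plots and 216 plants.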

### Growth Analysis and Biomass Determination

Plants were harvested at nineteen days after transplant (DAT). Number of leaves and fresh weight of the aerial plant parts were determined, then leaf area was measured by an area meter (LI-COR 3100C, Biosciences, Lincoln, Nebraska, USA).

Leaf dry weight was determined on an analytical balance (Denver Instruments, Denver, Colorado, USA) after sample desiccation in a forced-air oven at 70°C to constant weight (around 72 h). Leaf dry matter was determined according to the official method 934.01 of the Association of Official Analytical Chemists.

### Collection of Samples for Mineral and Nutritional Quality Analyses

Part of the dried leaf tissue of green and red Salanova plants was used for macro-mineral and selenium analyses. For the identification and quantification of phenolic acids and carotenoid compounds by HPLC-DAD, fresh samples of three plants per experimental unit were instantly frozen in liquid nitrogen and stored at −80°C before lyophilizing them in a Christ, Alpha 1–4 (Osterode, Germany) freeze drier.

### Mineral Analysis by Ion Chromatography and ICP-OES and Consumer Safety of Se-Enriched Butterhead Lettuce

Leaf soluble cations and anions were determined by liquid ion exchange chromatography (ICS 3000 Dionex Sunnyvale, CA, USA) with conductimetric detection, as described previously by Rouphael et al. (2017b). Briefly, 250 mg of dried sample ground at 0.5 mm in a Wiley Mill (IKA, MF 10.1, Staufen, Germany) were suspended in 50 ml of ultrapure water (Milli-Q, Merck Millipore, Darmstadt, Germany) and stirred in a shaking water bath (ShakeTemp SW22, Julabo, Seelbach, Germany) at 80°C for 10 min. The mixture was centrifuged at 6,000 rpm for 10 min (R-10M, Remi Elektrotechnik Limited, India), then filtered through a 0.45 μm syringe filter (Phenomenex, Torrance, CA, USA). Chromatographic separation of Na, K, Mg, and Ca was achieved in isocratic mode (20 mM methanesulphonic acid) on an IonPac CS12A analytical column (4 × 250 mm, Dionex Sunnyvale, CA, USA) equipped with an IonPac CG12A precolumn (4 × 250 mm, Dionex Sunnyvale, CA, USA) and a self-regenerating suppressor CERS500 (4 mm, Dionex Sunnyvale, CA, USA). Nitrate, phosphate, and sulphate were detected in gradient mode (1–50 mM KOH) on an IonPac ATC-HC anion trap (9 × 75 mm, Dionex Sunnyvale, CA, USA), and an AS11-HC analytical column (4 × 250 mm, Dionex Sunnyvale, CA, USA) equipped with an AG11-HC precolumn (4 × 50 mm, Dionex Sunnyvale, CA, USA) and a self-regenerating suppressor AERS500 (4 mm, Dionex Sunnyvale, CA, USA). Ions were expressed as g kg−1 dry weight (dw) and nitrate was expressed as mg kg−1 fresh weight (fw) on the basis of each sample's original dw.
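The conversion of nitrate from a dry-weight to a fresh-weight basis can be sketched as follows. This assumes the conversion uses the leaf dry matter percentage of each sample; the function name and the input values are illustrative and are not data from this study.

```python
def nitrate_fw_mg_per_kg(nitrate_dw_g_per_kg: float, dry_matter_pct: float) -> float:
    """Convert nitrate from g kg-1 dw to mg kg-1 fw.

    1 kg of fresh tissue contains (dry_matter_pct / 100) kg of dry matter,
    and 1 g = 1000 mg.
    """
    return nitrate_dw_g_per_kg * 1000.0 * (dry_matter_pct / 100.0)


# Hypothetical sample: 35 g kg-1 dw nitrate in leaves with 5% dry matter.
print(round(nitrate_fw_mg_per_kg(35.0, 5.0)))  # mg kg-1 fw
```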

In addition to macro-minerals analysis, Se content was also measured in green and red Salanova leaf tissue. Each sample was subjected to a first phase of acid digestion performed using a commercial high-pressure laboratory microwave oven (Mars plus CEM, Italy) operating at an energy output of 1,800 W. Approximately 300 mg of each dry sample was inserted directly into a microwave-closed vessel. Two milliliters of 30% (m/m) H2O2, 0.5 ml of 37% HCl and 7.5 ml of 69% HNO3 solution were added to each vessel. The heating program was performed in one step: temperature was ramped linearly from 25 to 180°C in 37 min, then held at 180°C for 15 min. After the digestion procedure and subsequent cooling, samples were transferred into a Teflon beaker and total volume was made up to 25 ml with Milli-Q water. The digest solution was then filtered through a DISMIC 25HP PTFE syringe filter of pore size 0.45 μm (Toyo Roshi Kaisha, Ltd., Japan) and stored in a screw cap plastic tube (Nalgene, New York). Blanks were prepared in each lot of samples. All experiments were performed in triplicate. The reagents of super pure grade, used for the microwave-assisted digestions, were: hydrochloric acid (36% HCl), nitric acid (69% HNO3), and hydrogen peroxide (30% H2O2) (Merck, Darmstadt, Germany). High-purity water (18 MΩ cm−1) from a Milli-Q water purification system (Millipore, Bedford, USA) was used for the dilution of the standards, for preparing samples throughout the chemical process, and for final rinsing of the acid-cleaned vessels, glasses, and plastic utensils. For this work, tomato leaves (SRM 1573a) were used as external certified reference material. Selenium quantification was performed using an Inductively Coupled Plasma Optical Emission Spectrometer (ICP-OES) with an axially viewed configuration (8,000 DV, PerkinElmer, Shelton, CT, USA) equipped with a Hydride Generation system for Se quantification at 196.06 nm.
Twenty-five ml of digested material was pre-reduced with concentrated HCl (5 ml, superpure grade) followed by heating at 90°C for 20 min. After pre-reduction, the solution was diluted to 50 ml in a polypropylene vial with deionized water (18 MΩ cm−1). In order to determine the Se concentration, calibration standards were prepared and treated in the same way before dilution. Selenium content in lettuce leaves was expressed as mg kg−1 dw.
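The back-calculation from an instrument reading to leaf Se content is not spelled out in the text. A minimal sketch is given below under explicit assumptions: the reading is in μg L−1 and refers to the final 50 ml solution, the whole 25 ml digest is carried into that solution, and a reagent blank is subtracted. The function name and the example reading are hypothetical.

```python
def se_content_mg_per_kg_dw(
    reading_ug_per_l: float,
    blank_ug_per_l: float = 0.0,
    sample_mass_g: float = 0.300,    # about 300 mg of dry sample (from the text)
    digest_volume_ml: float = 25.0,  # digest made up to 25 ml (from the text)
    aliquot_ml: float = 25.0,        # volume taken for pre-reduction (from the text)
    final_volume_ml: float = 50.0,   # diluted to 50 ml (from the text)
) -> float:
    """Back-calculate leaf Se (mg kg-1 dw) from a blank-corrected reading."""
    dilution_factor = final_volume_ml / aliquot_ml
    # Se concentration in the undiluted digest, ug per litre.
    digest_ug_per_l = (reading_ug_per_l - blank_ug_per_l) * dilution_factor
    # Total Se in the digest, ug.
    se_ug = digest_ug_per_l * (digest_volume_ml / 1000.0)
    # ug per g of dry sample is numerically equal to mg per kg.
    return se_ug / sample_mass_g


# Hypothetical reading of 60 ug L-1 with a negligible blank.
print(round(se_content_mg_per_kg_dw(60.0), 2))
```

If the calibration standards already incorporate the pre-reduction dilution, the dilution factor would have to be adjusted accordingly.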

The green vegetables hazard quotient (HQgv) was calculated according to the United States Environmental Protection Agency (USEPA) Protocol using the following formula:

$$\mathrm{HQ}_{gv} = \frac{\mathrm{ADD}}{\mathrm{RfD}}$$

where ADD is the average daily dose of selenium (μg Se day−1) and RfD represents the recommended dietary tolerable upper intake level of selenium (μg Se day−1) assessed equal to 400 μg day−1 (Johnson et al., 2003), referring to the risk to human health of a 70-kg adult resulting from Se intake through the consumption of a 50-g portion of fresh lettuce.
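As a worked illustration of this quotient, the sketch below derives ADD from leaf Se content on a dry-weight basis, the leaf dry matter percentage and the 50-g fresh portion. The way ADD is derived here, the function names and the input values are assumptions for illustration, not data or code from this study.

```python
RFD_UG_PER_DAY = 400.0   # tolerable upper intake level quoted in the text
PORTION_G_FW = 50.0      # fresh lettuce portion quoted in the text


def average_daily_dose_ug(se_mg_per_kg_dw: float, dry_matter_pct: float,
                          portion_g_fw: float = PORTION_G_FW) -> float:
    """Se ingested per day (ug) from one portion of fresh lettuce.

    mg kg-1 dw equals ug g-1 dw, so the dose is
    Se (ug g-1 dw) x dry matter fraction x portion (g fw).
    """
    return se_mg_per_kg_dw * (dry_matter_pct / 100.0) * portion_g_fw


def hazard_quotient(add_ug_per_day: float,
                    rfd_ug_per_day: float = RFD_UG_PER_DAY) -> float:
    """HQ_gv = ADD / RfD; values below 1 are under the upper intake level."""
    return add_ug_per_day / rfd_ug_per_day


# Hypothetical leaves with 100 mg Se kg-1 dw and 5% dry matter.
add = average_daily_dose_ug(100.0, 5.0)
print(add, hazard_quotient(add))
```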

### Phenolic Acids and Anthocyanins Identification and Quantification

Four hundred mg of lyophilized samples were solubilized in a solution of methanol/water/formic acid (50/45/5, v/v/v, 12 ml) as described by Llorach et al. (2008) to determine phenolic acids as hydroxycinnamic derivatives. The suspensions were sonicated for 30 min and then subjected to centrifugation (2,500 *g* for 30 min at 4°C). After a second centrifugation of supernatants at 21,100 *g* for 15 min at 4°C, samples were filtered through 0.22 µm cellulose filters (Phenomenex). A reversed phase C18 column (Prodigy, 250 × 4.6 mm, 5 µm, Phenomenex, Torrance, CA) equipped with a C18 security guard (4.0 × 3.0 mm, Phenomenex) was utilized for the separation of hydroxycinnamic derivatives and anthocyanins. Twenty µl of each extract were injected and the following elution gradient was built based on solvent (A) water/formic acid (95:5, v/v) and (B) methanol: (0/5), (25/40), (32/40) in min/%B. The flow rate was 1 ml min−1. The LC column was installed onto a binary system (LC-10AD, Shimadzu, Kyoto, Japan), equipped with a DAD (SPD-M10A, Shimadzu, Kyoto, Japan) and a Series 200 autosampler (Perkin Elmer, Waltham, MA). Chlorogenic and chicoric acids at 330 nm were used for the calibration curves of hydroxycinnamic derivatives. Identification of caffeoyl-meso-tartaric acid and caffeoyl-tartaric acid was performed by LC-MS/MS experiments.

The chromatographic profiles of reference curves and samples were recorded in multiple reaction monitoring mode (MRM) by using an API 3000 triple quadrupole (ABSciex, Carlsbad, CA). Negative electrospray ionization was used for detection and source parameters were selected as follows: spray voltage −4.2 kV; capillary temperature 400°C; dwell time 100 ms; nebulizer gas and cad gas set to 10 and 12, respectively (arbitrary units). Target compounds [M–H]− were analyzed using the mass transitions given in parentheses: chicoric acid (m/z 473 → 311, 293), chlorogenic acid (m/z 353 → 191), caffeoyl tartaric acid (m/z 311 → 179, 149, retention time 15.8 min), caffeoyl-meso-tartaric acid (m/z 311 → 179, 149, retention time 17.8 min). The concentration of phenolic acids was reported as mg 100 g−1 of dw.

Anthocyanins were also measured within the same LC-DAD chromatographic runs at 520 nm, and their concentration was calculated using cyanidin as reference standard. The results were reported as µg of cyanidin equivalent per g of dw.

### Carotenoids Identification and Quantification

One gram of lyophilized samples was used to determine carotenoids content following the method of Vallverdú-Queralt et al. (2013) with slight modifications. Samples were solubilized in ethanol/hexane (4:3, v/v, 2.5 ml) with 1% BHT, vortexed at 22°C for 30 s and sonicated for 5 min in the dark. Then, the solution was centrifuged (2,500 *g*, 4°C, 10 min) and filtered through 0.45 µm nylon syringe filters (Phenomenex, Torrance, CA, USA). The extracts were dried under nitrogen and the dried extracts were dissolved in 1% BHT in chloroform. Twenty µl of each sample was injected onto a C18 column (Prodigy, 250 × 4.6 mm, 5 µm, Phenomenex, Torrance, CA, USA) with a C18 security guard (4.0 × 3.0 mm, Phenomenex). Two mobile phases were used: (A) acetonitrile, hexane, methanol, and dichloromethane (4:2:2:2, v/v/v/v) and (B) acetonitrile. Carotenoids were eluted at 0.8 ml min−1 through the following gradient of solvent B (t in [min]/[%B]): (0/70), (20/60), (30/30), (40/2). Carotenoids were quantified by a binary LC-10AD system connected to a DAD (SPD-M10A, Shimadzu, Kyoto, Japan) equipped with a Series 200 auto-sampler (Perkin Elmer, Waltham, MA, USA). Violaxanthin, neoxanthin, β-cryptoxanthin, lutein and β-carotene were used as reference standards. Identification of the peaks was achieved by comparison of UV-vis spectra and retention times of eluted compounds with pure standards at 450 nm. Three separate sets of calibration curves were built; each set was injected three times in the same day (intraday assay) and three times in three different days (interday assay). The accuracy was reported as the discrepancies between the calibration curves performed intraday and interday and the results were expressed as relative standard deviation RSD (%). A recovery test was performed spiking two samples with two known amounts of carotenoids (50 and 100 µg ml−1 final concentration) and taking into account the overestimation due to the target analytes already present in the samples.
Except for violaxanthin + neoxanthin which was expressed as μg violaxanthin equivalent per g dw, the concentration of the target carotenoids was expressed as μg g−1 of dw.
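The carotenoid elution program above is given as (time, %B) breakpoints. The sketch below returns the solvent B percentage at any time point, assuming linear ramps between consecutive breakpoints (a common convention for gradient tables, but not stated in the text); the names are illustrative.

```python
from bisect import bisect_right

# (time in min, % solvent B) breakpoints as listed in the text.
CAROTENOID_GRADIENT = [(0.0, 70.0), (20.0, 60.0), (30.0, 30.0), (40.0, 2.0)]


def percent_b(t_min: float, gradient=CAROTENOID_GRADIENT) -> float:
    """Solvent B (%) at time t_min, with linear ramps between breakpoints."""
    times = [t for t, _ in gradient]
    if t_min <= times[0]:
        return gradient[0][1]
    if t_min >= times[-1]:
        return gradient[-1][1]
    # Index of the first breakpoint strictly after t_min.
    i = bisect_right(times, t_min)
    (t0, b0), (t1, b1) = gradient[i - 1], gradient[i]
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


for t in (0, 10, 25, 35, 40):
    print(t, percent_b(t))
```

The same function can be applied to the phenolic acid gradient by passing its breakpoints, (0/5), (25/40), (32/40), as the `gradient` argument.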

### Statistics

All morphometric, nutritional and functional quality data were subjected to analysis of variance (two-way ANOVA) using the IBM SPSS 20 software package (www.ibm.com/software/analytics/spss). Cultivar means were compared by t-test. Duncan's multiple range test was performed for comparisons of the selenium treatment means. In order to determine the interrelationship among the morphometric, nutritional and functional quality traits with respect to the experimental treatments, a Principal Component Analysis (PCA) was performed using the PCA function of the SPSS 20 software package.

# RESULTS AND DISCUSSION

### Advanced Integrative Simultaneous Analysis of Morpho-Physiological Traits

Genetic material is the main pre-harvest factor that strongly affects the biometric characteristics as well as the biosynthesis, the composition and accumulation of bioactive compounds (Kim et al., 2016). For most of the measured agronomic parameters no significant interaction between the two tested factors, lettuce cultivar (C) and Se concentration in the nutrient solution (Se), was recorded, except for leaf area and fresh yield (**Table 1**). In particular, green Salanova had higher leaf number, shoot dry biomass and leaf dry matter content (%). Regarding the effect of Se concentration in the nutrient solution, increasing Se concentration to 24 μM resulted in non-significant differences in shoot dry biomass with the control (0 μM) and 16 μM treatments; whereas increasing Se concentration from 0 to 40 μM yielded a significant increase in leaf dry matter content, with the highest values observed at 40 μM (5.7%) (**Table 1**). Leaf number was not affected by the addition of Se to the nutrient solution.

TABLE 1 | Growth parameters, fresh biomass, dry biomass and leaf dry matter content of green and red Salanova lettuce grown hydroponically in a Fitotron open-gas-exchange growth chamber under six Se concentrations applied in the nutrient solution.


ns,\*,\*\*, \*\*\* Non-significant or significant at P ≤ 0.05, 0.01, and 0.001, respectively. In the absence of interaction, cultivar means were compared by t-Test and Se application means by Duncan's multiple-range test (P = 0.05). Different letters within each column indicate significantly different means. All data are expressed as mean ± SE, n = 3.

Leaf area and fresh biomass incurred significant interaction of the tested factors (**Table 1**), as the dose effect of Se on these two morphometric traits was cultivar-dependent. In the red cultivar, a reduction of the leaf area was observed with increasing Se dose, amounting to about 11% reduction in the range of 8–32 μM Se and up to 19% at the highest Se dose (40 μM) compared to the control treatment, whereas no significant differences were recorded in the green cultivar. Cultivars/genotypes may develop different Se-tolerance and response mechanisms depending on the concentration and time of exposure. This was the case in the current experiment, since fresh yield decreased in both cultivars with increasing Se concentration in the nutrient solution, although the red-pigmented butterhead lettuce was less affected than the green-pigmented cultivar, especially at mild and moderate Se concentrations (i.e., 8 to 24 μM) (**Table 1**). In red Salanova, fresh yield was not affected by the addition of Se up to a concentration of 24 μM, whereas the addition of 32 μM and especially 40 μM induced a reduction in the fresh biomass of 11 and 21%, respectively, compared to the 0, 8, 16, and 24 μM treatments. Finally, a significant decrease in green Salanova fresh biomass (about 10%) was observed in response to Se application without significant differences between the five Se treatments (**Table 1**).

Several studies demonstrate the beneficial or toxic effects on morphometric traits of lettuce depending on the interaction of cultivar and application level (Ríos et al., 2008; Rios et al., 2010a; Ramos et al., 2011; Hawrylak-Nowak, 2013). Ramos et al. (2011) studied the influence of 15 µM of selenate and 15 µM of selenite concentrations in the nutrient solution on the yield of 30 lettuce accessions grown hydroponically. The authors reported that just 5 of 30 accessions treated with 15 µM of selenate showed an increase in fresh biomass compared to the control. Contrarily, Hawrylak-Nowak (2013) confirmed a decrease in both leaf area and fresh biomass of green lettuce cv. Justyna grown hydroponically and supplied with 10 µM of selenate, while in another similar work on green lettuce cv. Vera, a reduction of dry biomass was observed only at 8 μM selenate dose (Ramos et al., 2010), both of which findings are in line with our current ones on green Salanova. Additional studies conducted by Ríos et al. (2008; 2010a) also reported a decrease of dry biomass in hydroponically grown green lettuce (cv. Philipus) treated continuously with nutrient solution containing 80 μM Se compared to the control treatment.

The cultivar-dependent response to supplemental Se observed in our experiment, where the red-pigmented Salanova showed better tolerance to selenate compared to the green one, was in agreement with the study on red lettuce cv. Veneza Roxa by Silva et al. (2018a), where no significant reduction in shoot fresh weight was observed with selenate concentrations ranging from 10 to 40 μM. Considering the above, it appears that the beneficial or toxic effect of Se on plant growth and crop productivity may vary in relation to different interacting variables, including the Se concentration, time of exposure and cultivation system (Pedrero and Madrid, 2009). In the light of this finding, additional studies should focus on elucidating the cultivar × application dose × cultivation system (soilless versus soil) interaction in order to select optimal combinations to ensure balance between yield and biofortification.

### Nitrate Content, Mineral Composition, Selenium Biofortification, and Consumer Safety

Nitrate content in plants grown for human consumption is extremely important, since a high intake of this nutrient may harm human health due to its potential transformation to nitrite and nitrogenous compounds that can cause serious pathological disorders, such as methaemoglobinaemia and blue baby syndrome (Colla et al., 2018). In addition, it should be taken into account that lettuce is considered a nitrate hyperaccumulator; hence the European Commission (Commission Regulation no. 1258/2011) has set maximum limits for nitrate concentration in lettuce of 4,000 and 5,000 mg kg−1 fw for harvests occurring from April 1 to September 30 and from October 1 to March 31, respectively. Averaged over Se concentrations in the nutrient solution, the green cultivar had a higher nitrate content (1,810 mg kg−1 fw) than the red one (1,272 mg kg−1 fw); however, both values were far below EU regulation limits (**Table 2**). In fact, it is well established that nitrate accumulation in lettuce, aside from the cultivation management, depends mainly on genotypic factors (Burns et al., 2010; Burns et al., 2011; López et al., 2014). In the current study, nitrate content was influenced by both tested factors and the cultivar × Se interaction (**Table 2**). In green Salanova a significant reduction of nitrate content was observed at 8 μM (15%), 32 μM (16%), and 40 μM Se (32%) compared to the control, while no significant Se effect was found regarding this parameter in red Salanova (**Table 2**). The reduction of nitrate content prompted by selenate could be associated with the antagonistic relation of these two anions (Rios et al., 2010a). Moreover, Nowak et al. (2004) have demonstrated that Se affects the nitrate reductase enzyme, increasing its activity in plants. In addition, the reduction in foliar nitrate could be related to a greater assimilation rate of this anion due to a higher amino acid synthesis driven by enhanced nitrate reductase activity.
In fact, Se toxicity in plants may be due to the formation of non-specific selenoproteins; in particular, the replacement of cysteine (Cys) with SeCys in non-specific selenoproteins would invoke a higher demand of amino acids for the synthesis of functional proteins, which would elicit the removal of these malformed selenoproteins (Van Hoewyk, 2013). Our data reflect a nitrate reduction observed in previous works, where selenate has been applied on green-pigmented lettuce at different concentrations (Lee et al., 2008; Rios et al., 2010a; Ríos et al., 2010b).
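The seasonal limits quoted above can be turned into a simple compliance check. The sketch below uses only the two thresholds and date windows given in the text; the function names and the harvest date are hypothetical.

```python
from datetime import date


def eu_nitrate_limit_mg_per_kg_fw(harvest: date) -> int:
    """Maximum nitrate level for lettuce as quoted in the text.

    1 April to 30 September: 4,000 mg kg-1 fw;
    1 October to 31 March: 5,000 mg kg-1 fw.
    """
    return 4000 if 4 <= harvest.month <= 9 else 5000


def is_within_eu_limit(nitrate_mg_per_kg_fw: float, harvest: date) -> bool:
    """True when the measured nitrate does not exceed the seasonal limit."""
    return nitrate_mg_per_kg_fw <= eu_nitrate_limit_mg_per_kg_fw(harvest)


# Cultivar means reported in the text; the harvest date is illustrative.
harvest_day = date(2019, 6, 15)
for cultivar, nitrate in (("green Salanova", 1810), ("red Salanova", 1272)):
    print(cultivar, is_within_eu_limit(nitrate, harvest_day))
```

Both cultivar means pass even against the stricter 4,000 mg kg−1 fw limit.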

The growth and development of plants depends on the equilibrium of the mineral elements, as stress occurs in the presence of nutritional imbalances (Salt et al., 2008). Minerals are also essential for human health and lettuce is considered a good source of them (Baslam et al., 2013; Kim et al., 2016). Irrespective of Se concentration in the nutrient solution, green Salanova recorded the higher potassium and calcium content, while red Salanova showed the higher quantity of magnesium and sulphate (**Table 2**). As previously reported in literature,


TABLE 2 | Nitrate, phosphate, sulphate, potassium (K), calcium (Ca), magnesium (Mg) and sodium (Na) concentrations of green and red Salanova lettuce grown hydroponically in a Fitotron open-gas-exchange growth chamber under six Se concentrations applied in the nutrient solution.

lettuce mineral content is quite variable depending on head type, leaf color and cultivar (Kim et al., 2016). However, regardless Se concentration in the nutrient solution and lettuce cultivar, our results particularly, potassium, calcium and magnesium were proximate to those reported by Blasco et al. (2012) on lettuce grown in controlled environment conditions.

Neither cultivar nor Se treatment had significant effect on Na accumulation in leaf tissue (avg. 0.37 g kg−1 dw), whereas phosphate and calcium were highly influenced by cultivar and Se concentration with no significant interaction between the two tested factors (**Table 2**). Averaged over cultivar, phosphate content decreased significantly (about 15%) in response to Se treatments from 24 to 40 μM compared to the 0 to 16 μM treatments. In addition, the calcium content at 40 μM Se was significantly lower than the control (9%) (**Table 2**). Our findings, are in line with those of Rios et al. (2013) who reported a 9% decrease in calcium concentration at a Se dose of 40 μM compared to the control and a similar reduction in phosphate content was also observed by the same authors in response to Se concentration ranging from 20 to 120 μM.

Leaf contents in potassium, magnesium and sulphate were influenced by cultivar and Se treatments with significant C × Se interaction (**Table 2**). In green Salanova, a significant reduction of K was observed at Se 8 μM (10%) and 40 μM (17%) compared to the control (**Table 2**). Likewise, a 10% decrease in Mg content was noted with respect to the control, both at 8 and 40 μM Se. On the contrary, in the red cultivar potassium content spiked by 9% at Se 32 μM and magnesium content by about 12% increase when Se treatment ranged between 16-40 μM, compared to the control treatment (**Table 2**). The lowest K and Mg contents observed in green Salanova at 40 μM Se application coincide with the results obtained by Rios et al. (2013) at the same dose of selenate on Philipus green lettuce cultivar. Similarly, Smoleń et al. (2016b) found a decrease in potassium content by about 9% in green butterhead lettuce leaves treated with selenium combined with iodine. On the other hand, the increase of K and Mg recorded in red Salanova treated with Se was in disagreement with other scientific literature where the authors found no variation in these two macroelements content after selenate applications (Wu and Huang, 1992; Silva et al., 2018a).

Furthermore, sulphate content increased significantly and linearly with selenate concentration in both cultivars, ranging from 2.10 to 12.30 mg kg−1 dw in green Salanova and from 3.63 to 27.60 mg kg−1 dw in red Salanova (**Table 2**). These data imply a synergistic relationship between selenate and sulphate. Selenium is chemically similar to sulfur; therefore, plants absorb and metabolize Se *via* the S uptake and assimilation pathway (Sors et al., 2005; Pilon-Smits and Quinn, 2010). Selenate is assimilated by plants through a process of active transport, which is driven by sulphate transporters (SULTR) (Dall'Acqua et al., 2019). SULTR mediate the movement of sulphate in the vascular bundles; thus, both selenate and sulphate are actively accumulated in

ns,\*,\*\*, \*\*\* Non-significant or significant at P ≤ 0.05, 0.01, and 0.001, respectively. In the absence of interaction, cultivar means were compared by t-Test and Se application means by Duncan's multiple-range test (P = 0.05). Different letters within each column indicate significantly different means. All data are expressed as mean ± SE, n = 3.

the plant cells against their electrochemical gradient (Terry et al., 2000; Dall'Acqua et al., 2019). Our results are supported by White et al. (2004), who found that selenate applications promoted the accumulation of sulphate in the shoots of the model plant *Arabidopsis thaliana*. Similar findings were reported in lettuce by several authors (Ramos et al., 2011; Hawrylak-Nowak, 2013; Rios et al., 2013; Silva et al., 2018a); in particular, Ríos et al. (2008) reported an increase in S content in lettuce shoots with Se concentrations up to 40 μM. The first stage in the S-assimilation process consists of the activation of the enzyme ATP-sulfurylase, which produces adenosine phosphosulfate from sulfate and ATP (Pilon-Smits et al., 1999). Then, activated selenate is reduced *via* selenite to selenide and assimilated into SeCys and SeMet. These Se-amino acids can replace their S-analogues, the amino acids Cys and Met, in proteins (Sors et al., 2005; Van Hoewyk, 2013). In this sense, selenate applications can increase ATP-sulfurylase activity, and consequently a greater presence of selenate could imply increased production of Se and S end products (Ríos et al., 2008). Furthermore, despite the highest SULTR expression and sulphate translocation from roots to shoots, certain S amino acids tend to decrease as the Se dosage increases. In *Eruca sativa*, a lower leaf content of Cys and glutathione was found when plants were treated with Se concentrations equal to or higher than 10 μM (Dall'Acqua et al., 2019). It is conceivable that the lower accumulation of S-compounds may be due to the interference of Se with the S flow through the assimilation pathway, consequently reducing sulphate demand and eliciting a higher accumulation of this anion in the leaves.

The effectiveness of a selenium biofortification program is strongly related to the capacity of the candidate crop to assimilate and accumulate this element in the edible parts of the plant. In the current study, Se leaf content increased with selenate application rate (**Figure 1**). Comparing cultivars, red leaf lettuce accumulated on average 57% more Se than the green one. Selenium leaf content was influenced by cultivar and Se treatments, with a highly significant interaction between the two studied factors. In particular, Se concentration peaked in green Salanova at the 40 μM dose (128.43 mg kg−1 dw), while in red Salanova it peaked at 32 and 40 μM (116.67 and 128.20 mg kg−1 dw of Se, respectively). In any case, Se leaf content was significantly higher than in the control at doses ≥ 16 μM for both cultivars. Our results are in agreement with previous studies on red- and green-pigmented lettuce (Ramos et al., 2010; Hawrylak-Nowak, 2013; Silva et al., 2018a), demonstrating the feasibility of using lettuce in Se biofortification programs.

In the Mediterranean basin, dietary habits vary according to geographical area, but overall the well-known Mediterranean diet is mainly based on cereals, fruit, vegetables, dairy products and meat. The daily intakes of food groups considered part of the Mediterranean diet are: 219 g of cereals, 247 g of fresh and dried fruit, 226 g of vegetables and legumes, 327 g of dairy products and 136 g of meat and fish (Couto et al., 2011). These food intakes, multiplied by the average Se concentration of the individual groups, correspond to a total Se intake of around 80 μg day−1 per capita. Considering that the RDA of this trace element stipulated for adults is 55 μg day−1 (Johnson et al., 2003),

FIGURE 1 | Effects of genotype and selenium concentration in the nutrient solution on selenium biofortification of green and red Salanova lettuce grown hydroponically in a Fitotron open-gas-exchange growth chamber under six Se concentrations applied in the nutrient solution. Different letters indicate significant differences according to Duncan's test (P < 0.05). The values are means of three replicates. Vertical bars indicate ± SE of means.

it can be deduced that Se deficiency has a very low incidence in the Mediterranean area. In other countries, such as Brazil, Se intake was found to be only 25 μg day−1, so about 30 μg Se day−1 must be supplemented to reach the minimum recommended dose (Silva et al., 2019). The average serving of leafy vegetables, including lettuce, is about 50 g fw (Voogt et al., 2010). In our experiment, Se daily intake and the percentage of RDA-Se supplied through consumption of 50 g portions of fresh green and red Salanova lettuce were influenced by cultivar and Se treatments with significant C × Se interaction (**Table 3**). Se daily intake increased significantly and linearly with selenate concentration in both cultivars, ranging from 2 to 377 μg day−1 in green Salanova and from 4 to 355 μg day−1 in red Salanova (**Table 3**). Consequently, the RDA-Se varied with the same trend, reaching a peak at the 40 μM dose in both cultivars (685 and 646% for green and red Salanova, respectively). Our RDA-Se values observed at the lowest Se dose (8 μM) were comparable with those found by Smoleń et al. (2019) in six varieties of lettuce biofortified with selenium combined with iodine at the 6.3 μM Se dose. In particular, the iceberg varieties Krolowa and Maugli showed the lowest values (23.8 and 27.1%, respectively), while the green butterhead Cud Voorburgu and the red lettuce Lollo rossa reached the highest percentages (44.7 and 44.8%, respectively), which were comparable with the values found in green and red Salanova at the 8 μM Se dose (57 and 45%, respectively). With respect to the Se biofortification target, 50 g fw day−1 of green and red Salanova at the 16 μM Se dose provide 50 and 106 μg Se day−1, respectively (91 and 193% of the RDA); thus, in countries like Brazil, the RDA can be satisfied by consuming only 15 g fw day−1 of red Salanova or 30 g fw day−1 of green Salanova.
On the other hand, in order to assess the risks

TABLE 3 | Selenium daily intake, percentage of recommended daily allowance for Selenium (RDA-Se) and hazard quotient (HQgv) for Se intake through consumption of 50 g portions of fresh green and red Salanova lettuce by adult humans (70 kg body weight) grown hydroponically in a Fitotron open-gas-exchange growth chamber under six Se concentrations applied in the nutrient solution.


ns,\*\*, \*\*\* Nonsignificant or significant at P ≤ 0.01, and 0.001, respectively. In the absence of interaction, cultivar means were compared by t-Test and Se application means by Duncan's multiple-range test (P = 0.05). Different letters within each column indicate significantly different means. All data are expressed as mean ± SE, n = 3.

to human health, the green vegetables hazard quotient (HQgv) was calculated according to the United States Environmental Protection Agency (USEPA) Protocol, where HQgv values below 1.00 indicate that the vegetable is safe for consumption by human beings. In the current study HQgv increased with selenate application rate ranging from 0.00 to 0.94 in green Salanova and from 0.01 to 0.89 in red Salanova, therefore the 50 g daily portion of biofortified lettuce can be considered safe since the values of HQgv are less than 1 in all treatments (**Table 3**). In particular, in lettuce at 16 μM Se dose, the HQgv values are very low (0.12 and 0.27, respectively for green and red Salanova), indicating that even if the standard 50 g portion was tripled, these vegetables would not be in any case detrimental to human health.
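The intake arithmetic in this section can be reproduced with a short sketch. The RDA of 55 μg day−1, the 50 g portion and the intake values are taken from the text. The reference dose of 400 μg day−1 is an assumption (it is the tolerable upper intake level commonly cited for adults), chosen because it is consistent with the HQgv values reported above. The function names are illustrative only.

```python
RDA_SE_UG = 55.0      # adult RDA for Se, ug per day (from the text)
REF_DOSE_UG = 400.0   # ASSUMPTION: tolerable upper intake level, ug per day
PORTION_G = 50.0      # serving of fresh lettuce, g fw (from the text)


def percent_rda(intake_ug):
    """Share of the RDA covered by a daily Se intake, in percent."""
    return 100.0 * intake_ug / RDA_SE_UG


def hazard_quotient(intake_ug):
    """Ratio of daily Se intake to the assumed reference dose; < 1 is read as safe."""
    return intake_ug / REF_DOSE_UG


def portion_for_target(target_ug, intake_per_portion_ug):
    """Fresh weight (g) needed to supply target_ug, given the Se in one 50 g portion."""
    return PORTION_G * target_ug / intake_per_portion_ug


if __name__ == "__main__":
    # Se intake per 50 g portion, ug per day (values quoted in the text)
    reported = {
        "green, 16 uM": 50.0,
        "red, 16 uM": 106.0,
        "green, 40 uM": 377.0,
        "red, 40 uM": 355.0,
    }
    for label, intake in reported.items():
        print(f"{label}: {percent_rda(intake):.0f}% RDA, HQ = {hazard_quotient(intake):.2f}")
    # Portion that would close a 30 ug per day shortfall at the 16 uM dose
    print(f"red: {portion_for_target(30.0, 106.0):.1f} g fw")
    print(f"green: {portion_for_target(30.0, 50.0):.1f} g fw")
```

Small differences from the tabulated values (for example 645% against the reported 646% for red Salanova at 40 μM) are to be expected, because the sketch starts from the rounded intakes quoted in the text and not from the raw data.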

### Target Phenolic Compounds and Carotenoids Profiles

HPLC analysis revealed in both cultivars the presence of four main caffeic acid derivatives (**Table 4**). Chicoric acid was the most abundant phenolic acid detected in both cultivars (101.44 and 105.99 mg 100 g−1 dw, respectively for the green and the red cultivar); chlorogenic acid (88.02 mg 100 g−1 dw) and caffeoyl-meso-tartaric acid (41.08 mg 100 g−1 dw) were higher in red Salanova, while caffeoyl-tartaric acid (17.77 mg 100 g−1 dw) was higher in green Salanova compared to the red cultivar (**Table 4**). The sum of detected phenolic acids was higher in the red-pigmented cultivar with respect to the green one (239.52 and 139.10 mg 100 g−1 dw, respectively). The content of phenolic acids varies according to the type of lettuce (Kim et al., 2016). Our results are consistent with the literature in which red cultivars have more phenolic acids than green ones (Llorach et al., 2008; Kim et al., 2016). The presence of chlorogenic acid, chicoric acid, and caffeoyl-tartaric acid was also detected in seven different lettuce cultivars previously studied by Rouphael et al. (2017a). All phenolic acids were affected by cultivar and Se treatments with significant cultivar × Se interaction (**Table 4**). In green Salanova, caffeoyl-tartaric acid increased by 69% and 46% respectively at Se doses of 16 and 24 μM, but decreased by 75% at 32 μM, while in red Salanova the highest content was obtained at 16 μM (105%) compared to the control. Chlorogenic acid in the green cultivar decreased by 57% at Se 32 μM but increased by 143% at the most concentrated Se dose, while in the red cultivar the content increased at 8, 16, 24 and 40 μM with the highest value recorded at 16 μM (191.64 mg 100 g−1 dw).
Similarly, chicoric acid in the green cultivar increased at Se doses of 8, 16, 24 and 40 μM with the highest value recorded at 16 μM (148.53 mg 100 g−1 dw), but decreased by 67% at 32 μM; conversely, in the red cultivar chicoric acid content increased by 32% at 16 μM but decreased at Se doses 8, 24, 32 and 40 μM (**Table 4**). In red Salanova, caffeoyl-meso-tartaric acid increased by 270%, 84% and 89%, respectively, by adding in the nutrient solution 16, 24, and 40 μM of Se compared to the control treatment, while no significant differences were found for this phenolic acid in green Salanova. In the green cultivar, the sum of detected phenolic acids was significantly higher at 8, 16, 24, and 40 μM with the highest value observed at 24 μM (194.55 mg 100 g−1 dw), but decreased by 67% at 32 μM, while in red cultivar the sum of phenolic acids increased by 112% at 16 μM and decreased at Se doses of 8, 32, and 40 μM compared to the control (**Table 4**).

Our results showed irregular variation of phenolic acids content in both cultivars, as the concentrations of these hydrophilic antioxidant molecules varied with Se concentration without a clear trend. Furthermore, this pattern is consistent with what was found by Schiavon et al. (2016) in radish and by D'Amato et al. (2018) in rice sprouts, but is in disagreement with Ríos et al. (2008) who reported a rise in the total phenol content of lettuce as the Se dose applied increased. On the other hand, the presence of Se constitutes an abiotic stress similar to that caused by other heavy metals. Plants react to their presence by activating the phenylpropanoid pathway (Wang et al., 2016) to produce phenolic compounds that can chelate metals and inhibit enzymes such as xanthine oxidase in an effort to prevent the production of Reactive Oxygen Species (ROS) (Ríos et al., 2008).

Anthocyanins are one of the phenolic phytochemical subclasses (Harborne and Williams, 2001) encompassing water-soluble pigments responsible for the red pigmentation


TABLE 4 | Phenolic acids composition, total phenolic acids and anthocyanins of green and red Salanova lettuce grown hydroponically in a Fitotron open-gas-exchange growth chamber under six Se concentrations applied in the nutrient solution.

ns, \*\*, \*\*\* Nonsignificant or significant at P ≤ 0.01, and 0.001, respectively. In the absence of interaction, cultivar means were compared by t-Test and Se application means by Duncan's multiple-range test (P = 0.05). Different letters within each column indicate significantly different means. All data are expressed as mean ± SE, n = 3. n.d. not detectable.

in lettuce (Kim et al., 2016). Consequently, these pigments were not detected in green Salanova but exclusively in the red cultivar, with an average concentration of 13.28 μg cyanidin eq. g−1 dw (**Table 4**). Anthocyanins have many physiological effects on plants and humans, such as antioxidation, protection against ultraviolet damage and the prevention and treatment of various diseases (Hamilton, 2004). Anthocyanins in red Salanova were found to be significantly affected by selenate applications; in particular, they increased by 184%, 84%, and 31% respectively at Se doses of 16, 24, and 32 μM compared to the control (**Table 4**). Our results are in accordance with Liu et al. (2017), where anthocyanins in red lettuce cv. Purple Rome increased significantly at moderate doses of Se, while they were lower and comparable to the control at higher Se doses. In their study, the authors showed that the Se influence on the accumulation and molecular regulation of anthocyanin synthesis was mainly due to the expression levels of the flavanone 3-hydroxylase (F3H) and UDP-glucose flavonoid glycosyltransferase (UFGT) genes, which play a key role in anthocyanin biosynthesis. The F3H and UFGT genes were significantly up-regulated by moderate Se treatments compared to the control (Liu et al., 2017).

Carotenoids are essential lipid-soluble pigments that have antioxidant properties and are found in all photosynthetic organisms (Gross, 1991). These compounds play significant roles in the prevention of chronic ailments, such as cancer, cardiovascular disease, diabetes and osteoporosis, owing to their potent antioxidant, immunomodulatory, gap-junction communication, photoprotective, neuroprotective, and vitamin A activity (Saini et al., 2015). Carotenoids are classified into two groups: xanthophylls, which include neoxanthin, violaxanthin, lutein, zeaxanthin, and β-cryptoxanthin, and carotenes, which include β-carotene, α-carotene, and lycopene. In the human diet, neoxanthin, violaxanthin, lutein, and β-carotene are primarily obtained from dark green or red vegetables. Specifically in lettuce, higher carotenoid content has been found in red leaf cultivars compared to green ones (Nicolle et al., 2004). This finding is in agreement with our results, where red Salanova had a significantly higher content of all the target carotenoids detected compared to green Salanova. The sum of all detected carotenoids was 133% higher in the red cultivar compared to the green one (**Table 5**). As in the case of phenolic compounds, the content of target carotenoids was affected by both cultivar and Se treatments, with significant cultivar × Se interaction (**Table 5**). In green Salanova, all detected carotenoids decreased in response to selenate applications compared to the control (**Table 5**), whereas in red Salanova this trend was differentiated. Violaxanthin + neoxanthin, lutein,

TABLE 5 | Composition of carotenoids profile of green and red Salanova lettuce grown hydroponically in a Fitotron open-gas-exchange growth chamber under six Se concentrations applied in the nutrient solution.


ns, \*\*\* Non-significant or significant at P ≤ 0.001, respectively. In the absence of interaction, cultivar means were compared by t-Test and Se application means by Duncan's multiple-range test (P = 0.05). Different letters within each column indicate significantly different means. All data are expressed as mean ± SE, n = 3.

and β-cryptoxanthin increased in red Salanova with increasing selenate application levels, reaching their highest levels at the 32 μM Se dose, whereas β-carotene in the 24–40 μM Se dose range was on average 23% lower than the control. Regarding the green cultivar, our results are in agreement with what has been found in the literature on lettuce (Hawrylak-Nowak, 2013), rice (D'Amato et al., 2018), and *Arabidopsis* (Sams et al., 2011), where a reduction of the total carotenoid content was observed following the application of sodium selenate. Pertinent to these results is previous work on *Arabidopsis* demonstrating that the presence of selenate may down-regulate phytoene synthase, a major enzyme involved in the biosynthesis of carotenoids (Sams et al., 2011). On the other hand, the increase in xanthophylls (violaxanthin + neoxanthin, lutein and β-cryptoxanthin) found in red Salanova in response to Se doses up to 32 μM could be associated with a dissimilar activation of molecular and physiological mechanisms in this cultivar, which differently influence the biosynthesis and accumulation of secondary metabolites, such as xanthophylls. Moreover, in our experiment, it was noted that the presence of selenate had contrasting effects on various classes of secondary metabolites.

### Principal Component Analysis

A comprehensive overview of the nutritional and functional quality profiles determined by ion chromatography and HPLC-DAD on red and green butterhead Salanova lettuce in response to Se concentration in the nutrient solution was obtained through Principal Component Analysis (PCA; **Figure 2**). The first principal component (PC1) accounted for 51.1% of the total variance, while PC2 and PC3 explained 23.4 and 8.2%, respectively (**Table 6**). PC1 correlated positively with the four target carotenoids, caffeoyl-meso-tartaric and chlorogenic acid, magnesium, and sulphate content. PC1 correlated negatively with agronomical traits (shoot biomass and leaf number), as well as with nitrate, calcium, and potassium content. PC2 correlated positively with fresh yield, chicoric acid, total phenolic acids, and phosphate content, and negatively with leaf dry matter and Se content (**Table 6**). Furthermore, the loading matrix indicated the correlations among the examined quanti-qualitative traits, wherein two variables at an angle < 90° were positively correlated, whereas an angle > 90° designated negatively correlated variables. In our experiment, variation in chlorogenic acid and anthocyanin contents was most closely aligned with β-carotene content, whereas variation in total phenolics did not correlate with nitrate content (**Figure 2**).
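The two readings used above, cumulative explained variance and the angle rule for loading vectors, can be illustrated with a minimal sketch. The three variance percentages come from the text. The loading vectors and their names are hypothetical placeholders, not values from Table 6. The angle rule is only an approximation, and it is most reliable when the two plotted components capture most of the variance.

```python
import math
from itertools import accumulate

# Percent of total variance explained by PC1, PC2 and PC3 (from the text)
explained = [51.1, 23.4, 8.2]
cumulative = [round(v, 1) for v in accumulate(explained)]


def loading_angle_deg(u, v):
    """Angle in degrees between two loading vectors in the PC1-PC2 plane."""
    dot = u[0] * v[0] + u[1] * v[1]
    norm = math.hypot(u[0], u[1]) * math.hypot(v[0], v[1])
    cos_theta = max(-1.0, min(1.0, dot / norm))  # guard against rounding drift
    return math.degrees(math.acos(cos_theta))


# HYPOTHETICAL loadings (PC1, PC2), invented for illustration only
trait_a = (0.8, 0.3)
trait_b = (0.7, 0.5)
trait_c = (-0.6, 0.4)

if __name__ == "__main__":
    print("cumulative variance (%):", cumulative)
    # An angle below 90 degrees is read as a positive correlation
    print(f"a vs b: {loading_angle_deg(trait_a, trait_b):.1f} degrees")
    # An angle above 90 degrees is read as a negative correlation
    print(f"a vs c: {loading_angle_deg(trait_a, trait_c):.1f} degrees")
```

With the reported percentages, the first three components together account for 82.7% of the total variance.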

The effectiveness of PCA in interpreting cultivar differences across multiple nutritional and functional quality characters in response to several pre-harvest factors (e.g., nutrient solution management, biofortification, plant biostimulants) has been previously demonstrated (Colonna et al., 2016; Cardarelli et al., 2017; El-Nakhel et al., 2019). This was also the case in our study,

TABLE 6 | Eigen values, relative and cumulative proportion of total variance, and correlation coefficients for growth parameters, mineral profile, nutritional and functional

concentrations of selenium (Se) added as sodium selenate (0, 8, 16, 24, 32, and 40 μM).



<sup>a</sup>Boldface factor loadings are considered highly weighted. <sup>b</sup>LN, leaf number; DM, dry matter; LA, leaf area.

since the score plot of the PCA highlighted crucial information on the nutritional and functional quality of the tested butterhead cultivars exposed to different Se concentrations in the nutrient solution. The PCA clearly divided the two tested cultivars along PC1, with red-pigmented lettuce on the positive side and the green one on the negative side. Accordingly, green-pigmented lettuce was distinguished by fresh and dry biomass, nitrate and mineral profile (Ca, phosphate and K contents), whereas the red-pigmented cultivar was superior in target lipophilic and hydrophilic antioxidant molecules as well as in total phenolic acids (**Figure 2**). In particular, the red-pigmented lettuce treated with 8, 16, and 24 µM Se, positioned in the upper right quadrant of the PCA score plot, delivered premium quality and a high concentration of hydrophilic and lipophilic antioxidants (**Figure 2**). Red Salanova at the two highest doses of Se was characterized by a high content of Se and sulphate. Green butterhead lettuce grown under 0, 16, and 24 µM Se was positioned in the upper left quadrant, characterized overall by higher plant growth parameters (leaf area, fresh yield and shoot dry biomass) and mineral composition (phosphate, K, and Ca). Finally, the lower left quadrant depicted the high Se concentration treatments of green lettuce, which yielded the lowest nutritional and functional quality traits of all 12 treatments except for a high percentage of leaf dry matter content (**Figure 2**). The PCA performed in the present study provided an integrated view of yield and quality traits quantitated by ion chromatography and HPLC. It thus enabled the interpretation of variation patterns in these traits with respect to the genetic material and Se biofortification applications studied.

# CONCLUSIONS

As demand for functional foods with beneficial effects on human health is rising, selenium biofortification of lettuce in closed soilless cultivation is demonstrated here to be an effective, low-cost method to produce Se-enriched food of high nutritional value. Our findings indicate that shoot dry biomass, mineral composition, as well as phenolic acids and carotenoids were strongly affected by genotype, with the red cultivar proving to have higher nutritional and functional quality than the green one. Our results demonstrated that the application of 16 μM Se in the nutrient solution improved the phenolic acids content in both cultivars, especially in red Salanova, which was also distinguished by a substantial increase in anthocyanins content (184%). In green Salanova, Se applications slightly reduced the overall carotenoids content, while in the red cultivar the 16 and 32 μM Se doses triggered an increase in violaxanthin + neoxanthin, lutein and β-cryptoxanthin. Therefore, we can deduce that the optimal Se dose is 16 μM, as it improves the nutraceutical characteristics in both cultivars with a slight and acceptable reduction in fresh marketable yield (8%) recorded only in green Salanova. Selenium leaf content increased significantly with the sodium selenate application rate in both cultivars. Moreover, the 16 μM treatment yielded sufficient Se leaf content to satisfy 91% and 193% of the RDA of this trace element through consumption of 50 g fw of green and red Salanova, respectively, without any toxic effect to humans, since the amount does not exceed the maximum allowable intake.

# DATA AVAILABILITY STATEMENT

The datasets generated for this study are available on request to the corresponding author.

### AUTHOR CONTRIBUTIONS

AP wrote the first draft of the manuscript, followed the statistical analysis, and contributed to results data interpretation. CE-N carried out the Fitotron experiment and wrote the first draft of the manuscript. MG and SS performed the mineral analysis and data interpretation. MK and SP were involved in data analysis, data interpretation, and editing the manuscript. YR coordinated the whole project, provided the intellectual input, set up the experiment, and corrected the manuscript.

### ACKNOWLEDGMENTS

The authors are grateful to Anna-maria Palladino, Mirella Sorrentino, and Antonio De Francesco for their technical assistance in the Fitotron Plant Growth Chamber experiment, as well as to Dr. Sabrina De Pascale, Prof. Paola Vitaglione and Dr. Antonio Dario Troise for providing the access to HPLC facilities and analysis.


**Conflict of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Pannico, El-Nakhel, Kyriacou, Giordano, Stazi, De Pascale and Rouphael. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Metabolic Engineering a Model Oilseed Camelina sativa for the Sustainable Production of High-Value Designed Oils

Lixia Yuan<sup>1</sup> and Runzhi Li<sup>2</sup>\*

<sup>1</sup> College of Biological Science and Technology, Jinzhong University, Jinzhong, China, <sup>2</sup> Institute of Molecular Agriculture and Bioenergy, Shanxi Agricultural University, Taigu, China

Camelina sativa (L.) Crantz is an important Brassicaceae oil crop with a number of excellent agronomic traits, including low water and fertilizer input, strong adaptation and resistance. Furthermore, its short life cycle and easy genetic transformation, combined with available genome and other "-omics" data, have established camelina as a model oil plant for studying lipid metabolism regulation and genetic improvement. In particular, camelina is amenable to rapid metabolic engineering to synthesize and accumulate high levels of unusual fatty acids and modified oils in seeds, which are more stable and environmentally friendly. Such engineered camelina oils have been increasingly used as a superior resource for edible oil, health-promoting food and medicine, biofuel oil and high-value chemical production. In this review, we mainly highlight the latest advances in metabolic engineering towards the predictive manipulation of metabolism for commercial production of desirable bio-based products using camelina as an ideal platform. Moreover, we analyze in depth the camelina seed metabolic engineering strategy and its promising achievements by describing the metabolic assembly of biosynthesis pathways for acetyl glycerides, hydroxylated fatty acids, medium-chain fatty acids, ω-3 long-chain polyunsaturated fatty acids, palmitoleic acid (ω-7) and other high-value oils. Future prospects are discussed, with a focus on cutting-edge techniques in camelina such as genome editing application, fine directed manipulation of metabolism and the future outlook for camelina industry development.

Keywords: Camelina sativa (L.) Crantz, model oilseed, metabolic engineering, fatty acids, designed oil

# INTRODUCTION

Plant seed oils enriched in triacylglycerols (TAGs) consisting of three fatty acids esterified to a glycerol backbone are energy-dense molecules that are utilized for energy production in the life cycle of plants (Athenstaedt and Daum, 2006). More importantly, they not only provide the nutritional requirements of humans and animals but also serve as a renewable chemical feedstock for biofuels and various industrial applications (van Erp et al., 2011). Although global production of vegetable oils has increased in recent decades, a wider gap between the production and consumption still

### Edited by:

Eugenio Benvenuto, Italian National Agency for New Technologies, Energy and Sustainable Economic Development (ENEA), Italy

### Reviewed by:

Enrique Martinez Force, Instituto de la Grasa (IG), Spain Joaquín J. Salas, Instituto de la Grasa (IG), Spain

> \*Correspondence: Runzhi Li rli2001@126.com

### Specialty section:

This article was submitted to Crop and Product Physiology, a section of the journal Frontiers in Plant Science

Received: 23 October 2019 Accepted: 08 January 2020 Published: 12 February 2020

### Citation:

Yuan L and Li R (2020) Metabolic Engineering a Model Oilseed Camelina sativa for the Sustainable Production of High-Value Designed Oils. Front. Plant Sci. 11:11. doi: 10.3389/fpls.2020.00011


exists. To meet the ever-growing market demand for vegetable oils, there is a pressing need to genetically improve the seed oil yield and quality of oil crops.

The fatty acid profile and the distribution of fatty acids in the TAGs of plant oils determine oil quality, physicochemical properties, and uses. TAGs from commercially grown oilseed crops typically contain mainly five fatty acids: palmitic (16:0), stearic (18:0), oleic (18:1Δ9), linoleic (18:2Δ9,12), and α-linolenic (18:3Δ9,12,15) acids. In contrast, a wide variety of fatty acids with different chain lengths and functional groups accumulate to high levels in seeds of many uncultivated plant species (Badami and Patil, 1980). TAGs containing modified fatty acids with functionality beyond those found in commercially grown oilseed crops represent a valued resource for bio-based materials and other diverse uses (Dyer et al., 2008).
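The shorthand used for these fatty acids reads as carbon number, then number of double bonds, then the positions of the double bonds counted from the carboxyl end (Δ). The ω class mentioned elsewhere in this review counts instead from the methyl end, so it equals the carbon number minus the highest Δ position. A minimal sketch of that reading follows. The function names are illustrative, and accepting a plain "D" in place of "Δ" is an assumption made to tolerate text where the Greek letter was lost.

```python
import re

# "carbons:double_bonds", optionally followed by a delta marker and
# comma-separated double-bond positions, e.g. "18:2Δ9,12" or "16:0".
_SHORTHAND = re.compile(r"^(\d+):(\d+)(?:[ΔD](\d+(?:,\d+)*))?$")


def parse_fatty_acid(shorthand):
    """Return (carbons, double_bonds, positions) for a shorthand string."""
    match = _SHORTHAND.match(shorthand.strip())
    if match is None:
        raise ValueError(f"unrecognized shorthand: {shorthand!r}")
    carbons = int(match.group(1))
    double_bonds = int(match.group(2))
    positions = [int(p) for p in match.group(3).split(",")] if match.group(3) else []
    # The number of listed positions must match the declared double bonds
    if positions and len(positions) != double_bonds:
        raise ValueError(f"{shorthand!r}: {double_bonds} double bond(s) declared, "
                         f"{len(positions)} position(s) listed")
    return carbons, double_bonds, positions


def omega_class(shorthand):
    """Omega number (counted from the methyl end), or None if positions are absent."""
    carbons, _, positions = parse_fatty_acid(shorthand)
    if not positions:
        return None
    return carbons - max(positions)


if __name__ == "__main__":
    for name, code in [("palmitic", "16:0"), ("oleic", "18:1Δ9"),
                       ("linoleic", "18:2Δ9,12"), ("α-linolenic", "18:3Δ9,12,15")]:
        print(name, parse_fatty_acid(code), "omega:", omega_class(code))
```

The consistency check between the declared number of double bonds and the listed positions is a deliberate choice, since a mismatch between the two is a common transcription error in this notation.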

Over the years, understanding of plant lipid metabolism and its regulation has increased, coupled with thorough characterization of fatty acid biosynthesis, modification and assembly into TAGs (Cahoon et al., 2002; Kagale et al., 2014). With this information and the wealth of genetic diversity for synthesis of novel fatty acids and storage oils, plant seeds have been developed as a platform for the design and tailoring of biochemical pathways to synthesize diverse nutritional and industrial oils not currently found in oilseed crops (Haslam et al., 2016). Various genetic modification tools have been developed, including gene editing and synthetic biology techniques, which allow the rapid assembly of novel pathways in oilseed crops for commercially producing high levels of designed lipids/oils and high-value compounds.

Until recently, much of this work involved metabolically engineering the model plant Arabidopsis thaliana for the production of various modified fatty acids (Arondel et al., 1992; Farmer et al., 1998; Mu et al., 2008; Bates and Browse, 2011). However, this model plant has poor agronomic traits, such as small seed yield and unsuitability for large-scale field cultivation, which has limited the functional testing of the modified oil. In contrast, Camelina sativa (L.) Crantz, an important oilseed crop in the family Brassicaceae, possesses a number of valuable agronomic traits that recommend it as both a new model system and an ideal crop platform for lipid metabolic engineering (Zubr, 1997; Kagale et al., 2014; Ruiz-Lopez et al., 2015; Bansal and Durrett, 2016; Malik et al., 2018). Camelina has a relatively short life cycle and low water and fertilizer requirements. Camelina seed yield is comparable to that of other oilseed crops, particularly under stress conditions. Its simple, effective transformation system, combined with the availability of abundant transcriptomic and genomic data, has allowed the generation of engineered camelina lines capable of synthesizing high levels of novel oils or UFAs, further enabling subsequent field testing of such traits at a large scale (Lu et al., 2011; Nguyen et al., 2013; Ruiz-Lopez et al., 2014; Malik et al., 2018).

This review was conducted to investigate why camelina is particularly attractive as an ideal model oilseed for metabolic engineering and a platform for commercial production of high-value bioproducts. We briefly overview advances in the metabolic engineering of unusual lipids or novel oils in this oilseed crop, combined with the authors' group's work on camelina functional genomics and genetic improvement (Li et al., 2010; Wu et al., 2012; Yuan et al., 2017a; Yuan et al., 2017b). The selected examples focus on pathway reconstruction for high synthesis and accumulation of acetyl triacylglycerols, hydroxylated fatty acids, medium-chain fatty acids, ω-3 long-chain polyunsaturated fatty acids, ω-7 monounsaturated fatty acids, and other novel lipids having beneficial functional groups or properties. Moreover, we discuss cutting-edge research directions in camelina such as genome editing application, its use as a flexible and useful substrate for applied synthetic biology, and the future outlook for camelina industry development.

# AN IDEAL MODEL OILSEED FOR LIPID METABOLIC ENGINEERING

Camelina has been identified as a promising new oil crop because of its low input requirements, short crop cycle (80–100 days), high disease and pest resistance, and stress tolerance (Zubr, 1997; Zanetti et al., 2013). In terms of performance, camelina yields well in favorable environments, and its seeds accumulate high levels of oil (40%) and protein (30%) compared to other Brassicaceae crops (Vollmann and Eynck, 2015). In camelina oil, UFAs make up 90%, including 40% α-linolenic acid (ω-3), 25% linoleic acid, 15% oleic acid, and 15% eicosenoic acid. This desirable fatty acid composition makes camelina suitable for development as a source of nutritionally enhanced oils. With its several agronomic advantages, camelina could readily be developed for commercial production of vegetable oil, both as a healthy food and as a renewable resource for the green manufacture of high-quality biofuels (Lu and Kang, 2008).

Camelina has also been considered as a platform for the production of specialty oils (Collins-Silva et al., 2011; Bansal and Durrett, 2016; Haslam et al., 2016). Camelina shares most (>90%) of the genes involved in lipid metabolism with the genetic model plant Arabidopsis (Nguyen et al., 2013; Kagale et al., 2014). The rational design of lipid pathways in camelina has become more reliable since the reference genome was published in 2014 (Kagale et al., 2014). Moreover, the expanded lipid gene families in C. sativa provide greater diversity in enzyme expression and substrate specificity. Many of the shortcomings associated with model species, discussed later, can be overcome with camelina, because it is both an experimental model system and a recognized oilseed crop. Interest in camelina is rising, and its cultivation is expanding across the world (Gugel and Falk, 2006; Mcvay and Khan, 2011; Guy et al., 2014; Malik et al., 2018). Recent examples of lipid metabolic engineering in C. sativa seeds are summarized in Table 1.





# SELECTED EXAMPLES OF METABOLIC ENGINEERING FOR PRODUCTION OF THE DESIGNED OILS IN C. SATIVA

## Redesigning Acetyl Triacylglycerol (acetyl-TAG, acTAG) Synthesis for the Production of Superior Biodiesel and Lubricant Oils

AcTAGs (3-acetyl-1,2-diacyl-sn-glycerols) are unusual and valuable triacylglycerols. They carry two long-chain fatty acyl groups at the sn-1 and sn-2 positions of the glycerol backbone and one acetyl group at the sn-3 position (Durrett et al., 2010). Unlike ordinary TAGs (lcTAGs), in which three long-chain fatty acyl groups are bound to the three carbon atoms of the glycerol molecule, acTAGs have distinctive physical and chemical properties that make them useful in a variety of applications. For example, their kinematic viscosity is 40% lower than that of ordinary TAGs. AcTAGs are therefore excellent oils for the production of low-viscosity biofuels. The resulting biodiesel can be used as a premium fuel for ships, trains, and generators. Another advantage of acTAGs is their low-temperature tolerance: biodiesel made from them is less prone to agglomeration and combustion defects in cold climates. AcTAGs can also be used to produce high-quality biodegradable lubricants (Liu et al., 2015a).

These high-value industrial oils are not synthesized in ordinary field oilseed crops, but some wild plants produce them at high levels. For example, 98% of the seed oil of Euonymus alatus (burning bush) is acTAGs (Milcamps et al., 2005). Studies have shown that EaDAcT (E. alatus diacylglycerol acetyltransferase) catalyses the formation of acTAGs by transferring an acetyl group to the sn-3 position of diacylglycerol (DAG). In the developing seeds of common oil crops, DGAT (diacylglycerol acyltransferase) or PDAT (phosphatidylcholine:diacylglycerol acyltransferase) transfers a long-chain fatty acyl group to DAG to generate "regular" long-chain TAG (lcTAG) but does not catalyse the formation of acTAGs. Overexpression of the gene encoding EaDAcT in camelina seeds resulted in seed oil containing up to 55% acTAGs. EaDAcT, DGAT, and PDAT share the same substrate, DAG. If the activity of endogenous DGAT or PDAT is silenced, more DAG becomes available to EaDAcT for acTAG synthesis. Silencing three DGAT1 genes by RNAi in C. sativa while overexpressing EaDAcT increased acTAGs in the seed oil of transgenic C. sativa to 85%, and the trait was stably inherited (Liu et al., 2015a). Repeated field trials showed that the seed weight, oil content, and protein content of this transgenic C. sativa did not differ significantly from those of the wild type, and seed germination was normal. During seed germination, acTAGs are degraded like regular lcTAGs to support seedling growth (Liu et al., 2015b).

To further improve the physicochemical properties of acTAGs and expand their industrial applications, some studies have attempted to replace the long-chain polyunsaturated fatty acyl groups at the sn-1 and sn-2 positions of the acTAG molecule with other long-chain fatty acyl groups. Placing monounsaturated oleic acid (18:1) acyl groups at the sn-1 and sn-2 positions improves the oxidation resistance of acTAGs. Oleic acid-rich camelina lines obtained by RNAi silencing of CsFAD2 were used as recipients and transformed with vectors that overexpress EaDAcT and silence CsDGAT and CsPDAT by RNAi. The acTAG content in the seed oil of the resulting transgenic C. sativa was as high as 70%, and 3-acetyl-1,2-dioleoyl-sn-glycerol alone accounted for 47% (Liu et al., 2015a). This oil showed a significant increase in oxidation resistance. In addition, incorporating medium-chain fatty acyl groups at the sn-1 and sn-2 positions of the acTAG molecule further reduces the viscosity of acTAG lipids (Durrett et al., 2010). Changing the fatty acyl species at the sn-1 and sn-2 positions therefore gives acTAGs more desirable properties and broader industrial applications. Future efforts are needed to design a pathway that incorporates acetate into the sn-1 and sn-2 positions of the glycerol backbone (Figure 1).

Additionally, the introduction of acTAG into edible oilseed crops may provide an opportunity to develop reduced-calorie fats and oils with a molecular structure similar to that of existing commercial products such as SALATRIM (short and long acyl triglyceride molecule).

### Redesigning Hydroxylated Fatty Acid Biosynthesis for the Production of Oxidation-Resistant Oils

The susceptibility of a vegetable oil to oxidation depends on its fatty acid composition. If the content of polyunsaturated fatty acids is high, the oil oxidizes readily. High contents of saturated fatty acids, although more resistant to oxidation, reduce the fluidity of the oil, so that it solidifies easily. Vegetable oils containing high levels of monounsaturated fatty acids (e.g. oleic acid, 18:1Δ9) combine high oxidation resistance with other good properties.

Camelina seed oil consists of about 45% polyunsaturated fatty acids, namely linoleic acid (18:2Δ9,12) and linolenic acid (18:3Δ9,12,15), while only 17% of the seed oil is monounsaturated oleic acid (18:1Δ9). Oil with such high levels of polyunsaturated fatty acids is easily oxidized. RNAi was used to specifically silence the FAD2 (fatty acid desaturase 2), FAD3 (fatty acid desaturase 3), and FAE1 (fatty acid elongase 1) genes in camelina, yielding a transgenic seed oil with a significantly reduced content of polyunsaturated fatty acids and a high level of oleic acid (from 18% in the wild type to 65% in the transgenic seeds). This camelina oil exhibited significantly improved oxidation resistance (Cahoon et al., 2007). Another strategy is to express, in camelina seeds, an enzyme that catalyses the formation of monounsaturated hydroxylated fatty acids, promoting their biosynthesis and accumulation to high levels. Because these fatty acids are extremely resistant to oxidation, the oxidative stability of the seed oil can be increased. Hydroxylated fatty acids are widely used in the industrial production of resins, waxes, nylons, plastics, lubricants, and cosmetics.

Castor bean (Ricinus communis) seed oil consists of up to 90% ricinoleic acid (18:1Δ9,12-OH), a monounsaturated hydroxylated fatty acid (HFA). The fatty acid hydroxylase FAH12 introduces a hydroxyl group at the Δ12 carbon atom of oleic acid (18:1Δ9) bound to phosphatidylcholine (PC). Seed-specific expression of the RcFAH12 gene from R. communis resulted in ricinoleic acid reaching up to 6% in camelina seeds (Lu and Kang, 2008). Correspondingly, the oxidation resistance of this oil is significantly higher than that of non-transgenic camelina oil. Lesquerolic acid is another hydroxylated fatty acid, similar to ricinoleic acid, that accumulates to high levels in seeds of the genus Physaria in the family Cruciferae. A fatty acid condensing enzyme from Physaria fendleri, LfKCS, specifically catalyses the elongation of ricinoleic acid to hydroxyeicosenoic acid. The co-expression of RcFAH12 and LfKCS in camelina seeds not only increased ricinoleic acid from 14% to 19% but also resulted in hydroxyeicosenoic acid reaching 8%. The total amount of hydroxylated fatty acids reached 27%. The oxidation resistance of this camelina oil was greatly improved, and seed vigor was not affected (Snapp et al., 2014). These results suggest that LfKCS expression accelerates the movement of hydroxylated fatty acids from the PC pool into the acyl-CoA pool, from which they are finally incorporated into TAGs. A phospholipase C-like protein (RcPLCL1) from castor bean was found to hydrolyze both PC and phosphatidylinositol (PI) substrates (Aryal and Lu, 2018). Co-expression of RcPLCL1 and RcFAH12 resulted in accumulation of HFAs up to 24% of total FAs in C. sativa seeds (Aryal and Lu, 2018) with a less detrimental effect on seed germination, showing that RcPLCL1 can promote the transfer of RcFAH12-formed HFAs from PC into DAG to generate HFA-containing TAGs.

With Arabidopsis as the host, co-expression of R. communis RcFAH12 with RcDGAT2 and RcPDAT, which control the final acylation reaction of TAG synthesis, further increased the hydroxylated fatty acid content of seed oil to as much as 29% (Burgal et al., 2008; van Erp et al., 2011). Co-expression of RcFAH12 and RcPDCT (phosphatidylcholine:diacylglycerol cholinephosphotransferase) increased HFA accumulation in Arabidopsis seeds from approximately 10% to 20% (Hu et al., 2012). Unlike conventional DGAT, PDAT, and PDCT, the R. communis homologs are specific for hydroxylated fatty acid substrates. These castor enzymes can accelerate the transfer of HFAs from the PC pool and the CoA pool into DAG to form hydroxylated TAGs. It is hypothesized that co-expressing these three enzyme genes with RcFAH12 in camelina seed will allow hydroxylated TAGs to accumulate to levels appropriate for commercial use. Fatty acid thioesterase A (FatA) and fatty acid thioesterase B (FatB) in the camelina plastid were identified as specific for oleoyl-ACP and palmitoleic acid-ACP, respectively (Rodriguez-Rodriguez et al., 2014). Targeted manipulation of these two enzymes may therefore allow cells to selectively accumulate oleic acid (18:1Δ9) or palmitoleic acid (16:1Δ9) in TAGs and so increase the oxidation resistance of camelina seed oil.

### Assembling Medium-Chain Fatty Acid Biosynthesis for the Production of High-Quality Jet Oils

Jet fuels (Jet A and Jet A-1) consist mainly of C8-C16 alkanes and aromatic hydrocarbons (Kallio et al., 2014). The main fatty acids of most common oilseeds, including camelina, are C18 fatty acids, whose direct conversion into aviation fuel requires lengthy processing and gives a poor-quality product. Vegetable oils rich in caprylic acid (8:0), capric acid (10:0), lauric acid (12:0), and myristic acid (14:0) are excellent resources for the production of aviation biofuels (Dyer et al., 2008). Medium-chain fatty acids (MCFAs) are also widely used in the production of detergents, soaps, cosmetics, surfactants, and lubricants.

The kernel of the tropical oil palm (Elaeis guineensis Jacq.) and the meat of the coconut (Cocos nucifera L.) are rich in lauric acid (46% to 52%) and decanoic acid (16% to 19%) and serve as the main sources of commercial MCFAs. Seeds of some Cuphea species (Lythraceae) contain >90% MCFAs. For example, the seed of Cuphea viscosissima contains about 25% caprylic acid and about 64% capric acid, and the seed of Cuphea pulcherrima contains about 95% caprylic acid (Knothe, 2014). These plants have undesirable agronomic traits, such as indeterminate flowering, seed shattering, and seed dormancy, which make them difficult to use for commercial seed oil production. However, they are an excellent source of genes for MCFA biosynthesis (Filichkin et al., 2006).

The de novo synthesis of plant fatty acids occurs in the plastids. Acetyl-CoA and malonyl-ACP are condensed by β-ketoacyl-ACP synthase III (KASIII) to form a 4C β-ketoacyl-ACP. Then, under the action of the fatty acid synthase (FAS) complex, each cycle adds two carbon atoms until the 16-carbon palmitoyl-ACP (16:0-ACP) is formed. 16:0-ACP can be further extended to stearoyl-ACP (18:0-ACP) by KASII. Alternatively, the acyl-ACP thioesterase FatB can release palmitate from ACP and so terminate elongation of the carbon chain. The C18 fatty acids synthesized in the plastids, namely stearic acid (18:0) and oleic acid (18:1), are released from ACP by the FatA thioesterase (Figure 2). Acyl-ACP thioesterases are therefore the major determinants of fatty acid chain length in the plastids (Li-Beisson et al., 2013).
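The chain-length arithmetic in this paragraph can be made explicit with a minimal Python sketch. It only counts carbons: it assumes the 4C KASIII product as the starting point and two carbons added per FAS cycle, as stated above. The function name and its parameters are illustrative and do not come from any cited tool.

```python
def fas_chain_lengths(start=4, stop=16, step=2):
    """Return the acyl-ACP chain lengths visited between `start` and `stop`.

    Illustrative only: `start` is the 4C product of KASIII, `stop` is the
    16C palmitoyl-ACP end point, and each FAS cycle adds `step` carbons.
    """
    if start > stop or (stop - start) % step != 0:
        raise ValueError("stop must be reachable from start in whole cycles")
    return list(range(start, stop + 1, step))


lengths = fas_chain_lengths()
cycles_to_palmitate = len(lengths) - 1      # cycles after the KASIII step
# One further KASII-dependent extension takes 16:0-ACP to 18:0-ACP.
cycles_to_stearate = len(fas_chain_lengths(stop=18)) - 1

print(lengths)               # intermediates from 4C to 16C
print(cycles_to_palmitate)   # FAS cycles needed to reach 16:0-ACP
print(cycles_to_stearate)    # including the extra extension to 18:0-ACP
```

Under these assumptions, a thioesterase acting on an 8C to 14C intermediate cuts the series short, which is why thioesterase specificity sets the chain length of the released fatty acid.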

After export from the plastids, MCFAs and palmitic acid (16:0) are bound to CoA and enter the endoplasmic reticulum. After a series of reactions, they are esterified to the sn-1, sn-2, and sn-3 positions of the glycerol backbone to form TAG. Glycerol-3-phosphate acyltransferase (GPAT), lysophosphatidic acid acyltransferase (LPAT), and DGAT in turn catalyze the esterification of MCFA-CoAs and 16:0-CoA into TAG molecules, and DGAT is thus the key enzyme for accumulating high levels of MCFAs and palmitic acid (Lu et al., 2011). LPAT in most oil crops has been found to prefer unsaturated fatty acyl-CoAs (such as oleoyl-CoA) and to show no selectivity for MCFA-CoAs (Nlandu Mputu et al., 2009). By contrast, CnLPAT from Cocos nucifera has a strong substrate specificity for lauroyl (12:0)-CoA (Kim et al., 2015).

To date, genes encoding FatB enzymes with higher substrate specificity for MCFAs than for palmitoyl-ACP have been isolated from Cuphea and Umbellularia californica seeds, which contain high levels of MCFAs. Heterologous overexpression of these FatBs resulted in the accumulation of MCFAs in Brassica napus and Arabidopsis seeds (Tjellstrom et al., 2013). RNAi was used to silence KASII and so block the production of 18:0-ACP from 16:0-ACP, yielding transgenic seeds that accumulated high levels of 16:0-ACP (Pidkowich et al., 2007). Transgenic camelina seeds specifically expressing the CpFatB2 gene from Cuphea palustris accumulated 25% myristic acid (14:0). C. sativa seeds accumulated lauric acid (12:0) to 18% following overexpression of UcFatB1, and capric acid (10:0) to 10% following overexpression of ChFatB2 (Cuphea hookeriana FatB2). Co-expression of the UcFatB1 gene from U. californica and the CnLPAT gene from coconut in camelina seeds resulted in up to 30% accumulation of lauric acid (12:0) (Collins-Silva et al., 2011).

Kim et al. (2015) functionally characterized FatBs from Cuphea viscosissima and Cuphea pulcherrima and subsequently used them to assemble an MCFA synthesis pathway in camelina seeds. Transcriptome analysis revealed that three FatB cDNAs, namely CpuFatB3, CvFatB1, and CpuFatB4, were abundantly expressed in developing seeds, and their expression was positively associated with the accumulation of MCFAs.

CpuFatB4 is selective for 12:0-ACP, 14:0-ACP, and 16:0-ACP. The content of palmitic acid (16:0) in camelina seeds overexpressing CpuFatB4 rose to 43.5% (5 times higher than in the wild type), and the content of myristic acid (14:0) reached 8%. Similar to CpuFatB4, CpuFatB3 has broad substrate specificity for various MCFA-ACPs. The accumulation of capric acid (10:0) in the seeds of C. sativa overexpressing CpuFatB3 reached 1.2%, while only small amounts of other MCFAs (8:0, 12:0, 14:0) were synthesized. The accumulation of capric acid and palmitic acid in camelina seeds overexpressing CvFatB1 reached 9% and 16%, respectively, and the contents of other MCFAs (8:0, 12:0, 14:0) were also low. Co-expression of two or three FatB genes resulted in the accumulation of a range of C8-C16 fatty acids in transgenic C. sativa seeds, but the content of each fatty acid was lower than in seeds expressing only the corresponding single FatB gene. Additional overexpression of CpFatB2 or UcFatB1 together with coconut CnLPAT in camelina seeds resulted in the synthesis of more MCFAs. More importantly, co-expressing an MCFA-specific FatB with CnLPAT increased the MCFA content without reducing the total oil content. These metabolically engineered, MCFA-enriched camelina lines can be used directly for the production of high-quality jet fuel that performs well at low temperatures.

### Assembling ω-3 Long-Chain Polyunsaturated Fatty Acid Biosynthesis for the Production of Oils With Healthcare Applications

Omega-3 fatty acids (ω-3 FAs) are fatty acids whose first double bond, counted from the methyl terminus of the carbon chain, lies at the third carbon atom. They include α-linolenic acid (ALA, 18:3Δ9,12,15), eicosapentaenoic acid (EPA, 20:5Δ5,8,11,14,17), and docosahexaenoic acid (DHA, 22:6Δ4,7,10,13,16,19). In particular, the long-chain polyunsaturated fatty acids (ω-3-LC-PUFAs) EPA and DHA, which are derived from fish oil, are extremely important for human health, dietary nutrition, and brain development. To establish a renewable resource that can replace fish oil, many studies have been devoted to assembling the EPA and DHA biosynthetic pathways in the developing seeds of common oil crops, with the aim of "factory" production of EPA and DHA to meet growing market demand (Ruiz-Lopez et al., 2014; Betancor et al., 2015; Malik et al., 2015; Ruiz-Lopez et al., 2015). The EPA and DHA pathways have been assembled in the seeds of oil crops such as soybean, but the accumulation of EPA and DHA is low, which makes commercialization difficult. Camelina seed contains >30% ALA, the starting substrate required for the synthesis of EPA and DHA, making it a good platform for assembling the ω-3-LC-PUFA synthesis pathway (Figure 3).
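The relationship between the Δ notation (positions counted from the carboxyl end) and the ω class (position of the first double bond counted from the methyl end) is simple arithmetic: the ω number equals the chain length minus the highest Δ position. The following Python sketch illustrates this for the fatty acids named above; the function name and the string format it parses are illustrative choices, not an established API.

```python
def omega_class(notation: str) -> int:
    """Return the omega class of an unsaturated fatty acid.

    `notation` uses the shorthand of this review, e.g. "18:3Δ9,12,15":
    chain length, number of double bonds, then the Δ positions.
    """
    head, _, tail = notation.partition("Δ")
    carbons_text, bonds_text = head.split(":")
    carbons, n_bonds = int(carbons_text), int(bonds_text)
    positions = [int(p) for p in tail.split(",")] if tail else []
    if not positions:
        raise ValueError("a saturated fatty acid has no omega class")
    if len(positions) != n_bonds:
        raise ValueError("double-bond count does not match the Δ positions")
    # The double bond nearest the methyl end has the highest Δ position.
    return carbons - max(positions)


for name, notation in [
    ("ALA", "18:3Δ9,12,15"),
    ("EPA", "20:5Δ5,8,11,14,17"),
    ("DHA", "22:6Δ4,7,10,13,16,19"),
    ("linoleic acid", "18:2Δ9,12"),
    ("palmitoleic acid", "16:1Δ9"),
]:
    print(f"{name}: omega-{omega_class(notation)}")
```

The same calculation places linoleic acid in the ω-6 class and palmitoleic acid in the ω-7 class.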

There are two routes for the biosynthesis of ω-3-LC-PUFAs: the conventional Δ6 pathway and the unconventional Δ8 pathway (Ruiz-Lopez et al., 2015). The Δ6 pathway begins with ALA. The synthesis of EPA requires a Δ6 desaturase and a Δ5 desaturase (Δ6Des and Δ5Des) and one ELO-type Δ6 carbon chain elongase (Δ6Elo); the further synthesis of DHA requires catalysis by one ELO-type Δ5 carbon chain elongase (Δ5Elo) and one Δ4 desaturase (Δ4Des).

Seed-specific co-expression vectors carrying five genes were constructed and introduced into camelina to assemble the conventional Δ6 pathway of EPA synthesis in developing seeds. The genes used were the OtΔ6 desaturase from the eukaryotic microalga Ostreococcus tauri, the TcΔ5 desaturase from the marine fungus Thraustochytrium sp., the Piω-3 desaturase from Phytophthora infestans, the PsΔ12 desaturase from Phytophthora sojae, and the Δ6 fatty acid elongase PSE1 from Physcomitrella patens. Co-expression of these five genes led to up to 31% EPA accumulation in camelina oil (average 24%) (Ruiz-Lopez et al., 2014). The EhΔ4 desaturase from Emiliania huxleyi and the Δ5 elongase OtElo5 from a eukaryotic microalga were then added to the five-gene vector to construct a seven-gene co-expression vector. Co-expression of these seven genes in C. sativa seed resulted in the accumulation of up to 14% DHA (average 8%) (Ruiz-Lopez et al., 2014). The contents of EPA and DHA in this transgenic camelina matched the levels found in fish oil and represent the highest EPA and DHA accumulation obtained to date by assembling biosynthetic pathways in the seeds of a commercial oil crop. More importantly, no harmful intermediate metabolites accumulate in the seed oil of camelina carrying these complete EPA and DHA synthetic pathways. The accumulation of EPA and DHA does not negatively affect other agronomic traits, and the transgenic plants grow and develop normally. This new camelina germplasm can serve as a high-quality renewable source of EPA and DHA, and it is expected to be developed further for the commercial production of medicines and nutraceuticals rich in these long-chain ω-3 fatty acids.

The unconventional Δ8 pathway of EPA synthesis is essentially a bypass pathway found in some microalgae (e.g. Pavlova salina and Isochrysis galbana) (Ruiz-Lopez et al., 2015). The starting substrate for the Δ8 pathway is also α-linolenic acid (ALA). ALA is elongated by two carbon atoms by the ELO-type Δ9 carbon chain elongase (Δ9Elo) to produce eicosatrienoic acid (20:3Δ11,14,17; ω-3, ERA). Next, the Δ8 desaturase converts ERA to eicosatetraenoic acid (20:4Δ8,11,14,17; ω-3, ETA). Finally, ETA is converted to EPA (20:5Δ5,8,11,14,17; ω-3) by the Δ5 desaturase. The Δ8 pathway has now been successfully assembled and expressed in the developing seeds of camelina, and the accumulation of long-chain ω-3 fatty acids (EPA and ETA) has reached levels of up to 26.4% in transgenic C. sativa seed oil (Ruiz-Lopez et al., 2015). Linoleic acid (18:2Δ9,12; LA) is also a substrate for the elongase Δ9Elo, which adds two carbon atoms to LA to give eicosadienoic acid (20:2Δ11,14; ω-6, EDA). Under the action of the Δ8 desaturase, EDA is converted to eicosatrienoic acid (20:3Δ8,11,14; ω-6, DGLA). Finally, the Δ5 desaturase converts DGLA into arachidonic acid (20:4Δ5,8,11,14; ω-6, ARA), which in turn is acted on by Δ5Elo, Δ4Des, and ω-3Des to form DHA. Camelina expressing the Δ8 pathway therefore also requires precise metabolic modification to reduce the synthesis of the two ω-6 fatty acids DGLA and ARA and to direct more flux toward EPA and DHA.
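The two routes can be checked against each other with a small Python sketch that treats each enzyme as an operation on the Δ notation. It rests on two simplifying assumptions: a desaturase adds one double bond at a named Δ position, and an elongase adds two carbons at the carboxyl end, which shifts every existing Δ position by two. The order of steps in the Δ6 route (Δ6 desaturation, elongation, Δ5 desaturation) is the commonly described one and is assumed here, since the text lists the enzymes without ordering them. All names in the code are illustrative.

```python
from typing import NamedTuple, Tuple


class FattyAcid(NamedTuple):
    carbons: int
    bonds: Tuple[int, ...]  # Δ positions, counted from the carboxyl end

    def __str__(self) -> str:
        positions = ",".join(str(p) for p in self.bonds)
        return f"{self.carbons}:{len(self.bonds)}Δ{positions}"


def desaturate(fa: FattyAcid, position: int) -> FattyAcid:
    """Add one double bond at the given Δ position."""
    if position in fa.bonds:
        raise ValueError(f"Δ{position} is already unsaturated")
    return FattyAcid(fa.carbons, tuple(sorted(fa.bonds + (position,))))


def elongate(fa: FattyAcid) -> FattyAcid:
    """Add two carbons at the carboxyl end; every Δ position shifts by 2."""
    return FattyAcid(fa.carbons + 2, tuple(p + 2 for p in fa.bonds))


ALA = FattyAcid(18, (9, 12, 15))


def delta6_route(fa: FattyAcid) -> FattyAcid:
    fa = desaturate(fa, 6)    # Δ6Des
    fa = elongate(fa)         # Δ6Elo
    return desaturate(fa, 5)  # Δ5Des


def delta8_route(fa: FattyAcid) -> FattyAcid:
    fa = elongate(fa)         # Δ9Elo
    fa = desaturate(fa, 8)    # Δ8Des
    return desaturate(fa, 5)  # Δ5Des


def epa_to_dha(fa: FattyAcid) -> FattyAcid:
    fa = elongate(fa)         # Δ5Elo
    return desaturate(fa, 4)  # Δ4Des


epa_via_d6 = delta6_route(ALA)
epa_via_d8 = delta8_route(ALA)
dha = epa_to_dha(epa_via_d6)
print(epa_via_d6, epa_via_d8, dha)
```

Both routes converge on the same EPA structure, and elongation never changes the ω class because the methyl end of the chain is untouched.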

For the plant-based production of ω-3 LC-PUFAs, two reviews (Napier et al., 2015; Haslam et al., 2016) described several "proof-of-concept" examples and progress in developing transgenic plants enriched in ω-3 fish oils. Although promising results have been obtained in recent years, metabolic bottlenecks remain to be resolved, such as the low levels of these non-native fatty acids accumulated in the transgenic seeds and the limited knowledge of the interaction between the introduced pathway and the endogenous metabolic network. With the successful field testing of ω-3 LC-PUFA-enriched camelina, large-scale production of this novel oil in plants will provide an alternative, sustainable source of ω-3 fish oils for a growing number of end users.

### Redesigning ω-7 Monounsaturated Fatty Acid Biosynthesis for the Production of High-Value-Added Oils

Omega-7 fatty acids (ω-7 FAs), including palmitoleic acid (16:1Δ9), 11-octadecenoic acid (18:1Δ11), and 13-eicosenoic acid (20:1Δ13), have important industrial, nutritional, and pharmaceutical value. ω-7 FAs are also the best fatty acids for producing high-quality biodiesel (Durrett et al., 2010). These rare fatty acids are mostly synthesized in the seeds of some wild plants such as cat's claw (Dolichandra unguis-cati, 64% 16:1Δ9 and 15% 18:1Δ11) and sea buckthorn (Hippophae rhamnoides, 32% 16:1Δ9). Because of the poor agronomic traits of these wild plants, they have not been commercialized to date. Seeds of common oil crops accumulate very small amounts of ω-7 FAs. Metabolic engineering could be used to assemble ω-7 FA synthesis pathways in common oilseed crops in order to achieve the commercial production of ω-7 FA-rich vegetable oils (Wu et al., 2012; Nguyen et al., 2015). To assemble this pathway in soybean, our laboratory constructed a seed-specific expression vector for co-expression of a Δ9 desaturase gene, whose product converts palmitic acid (16:0) to palmitoleic acid (16:1Δ9), and a gene encoding a DGAT enzyme. The content of ω-7 FAs in transgenic seeds co-expressing the two genes was greater than 29%. This soybean oil can be used to produce a series of functional foods rich in ω-7 FAs. In addition, to develop a new type of tobacco specifically for biofuel production, our laboratory has also assembled ω-7 FA biosynthetic pathways in tobacco vegetative organs. The transgenic tobacco leaves accumulated high levels of ω-7 FAs (Xue et al., 2013).

The starting substrate for the synthesis of ω-7 FAs such as palmitoleic acid is palmitoyl (16:0)-ACP, which is synthesized in the plastid. In common oil crops, 16:0-ACP generated in the plastid has two downstream enzymatic fates. In the first, KASII extends 16:0-ACP by two carbon atoms to produce stearoyl (18:0)-ACP. 18:0 can be transferred directly from the plastid to the endoplasmic reticulum as 18:0-CoA. Alternatively, 18:0-ACP can be desaturated by the acyl-ACP-Δ9 desaturase (acyl-ACP-Δ9Des) in the plastid to generate monounsaturated oleoyl (18:1Δ9)-ACP, which is then exported from the plastid and bound to CoA (18:1Δ9-CoA) to enter the endoplasmic reticulum. In the second, FatB releases palmitic acid (16:0) from ACP, and the free acid is exported from the plastid and bound to CoA to enter the endoplasmic reticulum as 16:0-CoA. In some wild plant seeds rich in ω-7 fatty acids, 16:0-ACP is first converted to palmitoleoyl (16:1Δ9)-ACP (ω-7) in the plastid by an acyl-ACP-Δ9Des with substrate specificity for 16:0-ACP, and a fatty acid elongase (FAE) then produces 11-octadecenoyl (18:1Δ11)-ACP (ω-7). Finally, FatA releases 16:1Δ9 and 18:1Δ11 from ACP, and they are exported from the plastid and bound to CoA to enter the endoplasmic reticulum as 16:1Δ9-CoA and 18:1Δ11-CoA, respectively (Figure 4).

Currently, a successful strategy for assembling ω-7 FA biosynthesis in C. sativa seeds involves expressing, in the plastids, an acyl-ACP-Δ9Des that specifically converts 16:0-ACP to 16:1Δ9-ACP, thereby increasing plastid 16:1Δ9-ACP synthesis. The simultaneous expression of one or two acyl-CoA-Δ9 desaturases (acyl-CoA-Δ9Des) that specifically convert 16:0-CoA in the endoplasmic reticulum to 16:1Δ9-CoA further increases the accumulation of 16:1Δ9 and its elongation products (18:1Δ11 and 20:1Δ13) in the transgenic camelina seeds.

Nguyen et al. (2015) applied this strategy to assemble the ω-7 FA biosynthetic pathway in camelina and obtained an engineered line that accumulated high levels of ω-7 FAs. Seed-specific co-expression of one acyl-ACP-Δ9Des with strong specificity for 16:0-ACP and two acyl-CoA-Δ9Des with strong specificity for 16:0-CoA increased the accumulation of ω-7 FAs in the seeds of transgenic camelina from 0.6% in the wild type to 17%. In this transgenic camelina, simultaneous RNAi silencing of endogenous CsKASII and CsFatB further increased the ω-7 FA content to an average of 60% of total lipids, and the saturated fatty acid content was correspondingly reduced from 12% in the wild type to 5% in transgenic seeds. This engineered camelina line with high accumulation of ω-7 FAs did not exhibit poor agronomic traits. Seed weight, protein content, and oil content did not differ from those of the control, and seed germination and seedling development were normal. Camelina oil rich in ω-7 fatty acids can be used to produce various medicines, functional health products, and foods more efficiently. Physicochemical tests have shown that the thermodynamic properties of these ω-7 FA-rich camelina oils are significantly improved compared to those of wild-type camelina oils (Nguyen et al., 2015). The ω-7 FA-enriched oils can be used to produce high-value-added industrial products such as high-value biodiesel and polyurethanes.

### CRISPR/Cas9 Gene Editing Technology in Designer Oil Production

CRISPR/Cas9 gene editing, although still developing rapidly, has become the most powerful tool for efficient and specific genome engineering, both for creating genetic models for fundamental research and for crop genetic improvement (Shan et al., 2013; Van Campenhout et al., 2019; Wolter et al., 2019). This gene editing system has been used successfully to generate a broad range of stable mutations at specific loci in several important crops such as rice (Zhang et al., 2014), wheat (Wang et al., 2019), maize (Svitashev et al., 2015), soybean (Li et al., 2015), tomato (Ito and Nakano, 2015), potato (Butler et al., 2015), barley and Brassica oleracea (Lawrenson et al., 2015). The CRISPR-Cas9 system can also edit multiple loci simultaneously by expressing different sgRNAs that target the genes of interest (Lowder et al., 2015; Ma et al., 2015). Many agronomic traits of crops have been modified by CRISPR-Cas9, including yield and quality-related traits, plant nutrition and development, resistance to various stresses, and domestication (Chen et al., 2019). Notably, simultaneously targeting genes related to meiosis and fertilization with this technology produced a system that enables clonal reproduction of F1 hybrid rice, thus stably preserving its favorable high degree of heterozygosity (Wang et al., 2019).

As highlighted by recent studies, the CRISPR-Cas9 system has been used successfully in C. sativa to improve seed oil quality (Jiang et al., 2017; Morineau et al., 2017; Ozseyhan et al., 2018). The allohexaploid camelina contains three closely related subgenomes, so each gene is present as three pairs of alleles. Camelina seed oil is dominated by polyunsaturated fatty acids (linoleic and linolenic acids), and the development of new varieties rich in monounsaturated fatty acids (particularly oleic acid) is desirable. By targeting all three homeologs of the CsFAD2 gene, Jiang et al. (2017) obtained a diverse set of genetic combinations with single, double, and triple knockouts. In these mutant lines, oleic acid content increased from 16% to over 50% of total fatty acids, whereas polyunsaturated fatty acids were significantly reduced. Similarly, Morineau et al. (2017) achieved selectively targeted mutagenesis of CsFAD2 genes by CRISPR-Cas9 and obtained mutants with decreased polyunsaturated fatty acids and a concomitant increase of oleic acid in the oil. Another lipid-related gene, FAE1 (fatty acid elongase 1), was effectively inactivated in camelina by CRISPR-Cas9 targeting of the three FAE1 alleles simultaneously (Ozseyhan et al., 2018). Homozygous knockout mutants without growth defects were generated in a single generation by egg cell-specific Cas9 expression, and their content of C20-C24 very long-chain fatty acids was reduced to less than 2% of total fatty acids, compared to over 22% in the wild type. In addition, knocking out FAE1 increased the levels of oleic acid or α-linolenic acid in camelina oils, which is desirable for industrial or food/feed uses. The different allelic combinations generated by gene editing allowed an unbiased characterization of gene dosage and function in this allohexaploid species and provide a valuable resource of genetic variability for precise plant breeding. CRISPR-Cas9 technology is likely to be used to mutate other oil- and fatty acid-related genes in camelina for genetic fine-tuning of oil traits.

### FUTURE PROSPECTS

Arabidopsis has long been used as a model plant although it is in fact a "wild grass", not a cultivated agricultural plant. For a comprehensive understanding of the complexity of many commercial crops, it is necessary to develop a model crop. In this regard, rice (Oryza sativa) is recognized as the model plant for cereal crops and species in the Gramineae. Similarly, Camelina sativa has emerged as a new model plant for agricultural plants, especially for oil crops. Owing to its short life cycle and its efficient genetic transformation, comparable to that of Arabidopsis, camelina has also been developed as a "platform crop" with its seeds as "bioreactors" for commercial production of many high-valued products. This has been followed by a new round of commercial camelina planting and a research boom in camelina-based industry worldwide.

As described above, several strategies are available to design high-valued fatty acids/oils by metabolic engineering in camelina seeds. These tools include optimizing the enzymatic activity of lipid-related enzymes, modifying a key gene, and/or coordinately expressing multiple genes associated with fatty acid synthesis and oil accumulation. Future achievements in this field will require more knowledge of the key genes in lipid synthesis and accumulation, the networks responsible for regulating metabolic carbon partitioning and target lipid synthesis, the rational redesign of novel enzymatic activities to synthesize the target FAs, and the effective assembly of lipid pathways consisting of multiple genes in the host plant.

Genome-wide association studies (GWAS) and multiple "omics" tools, including genome resequencing, transcriptomics, and metabolomics, are increasingly used to identify new genes involved in FA synthesis and modification and in the incorporation of FAs into TAGs. For example, a GWAS using 391 wild Arabidopsis accessions revealed four to nineteen genomic regions responsible for FA composition in seed oil, and most of the regions identified contained candidate genes that had never before been implicated in lipid metabolism (Branham et al., 2015). This tool was also used to examine a number of loci containing candidate genes involved in wax biosynthesis in camelina (Luo et al., 2019). By mining sequence data, candidate enzymes for target FAs were identified and subsequently used to manipulate C. sativa oil composition towards a superior biofuel and bio-based lubricant oil (Kim et al., 2015; Nguyen et al., 2015).

To enrich the designed FAs/oils in the host, systems biology approaches are needed to obtain a greater understanding of the networks responsible for metabolic carbon partitioning and of the transcription factors regulating the related metabolism. Even more promising, synthetic biology tools like GoldenBraid 2.0, which enable interchangeable, modular assembly of transgene cassettes (Sarrion-Perdigones et al., 2013), can also be employed to effectively introduce complex pathways involving multiple genes to generate high-value oil products in the easily transformable camelina host. Synthetic scaffolds could provide a modular and highly flexible tool for rationally organizing multiple functionally related enzymes in a controllable manner (Pröschel et al., 2015). It is envisioned that C. sativa can be developed as an adaptable chassis plant for commercial production of high-valued compounds using synthetic biology tools.

As CRISPR/Cas9 has advanced as a tool for targeted genome editing in recent years (Wolter et al., 2019), it has become possible to introduce the assembled transgene modules into specific loci in the host crop, leading to an enhancement in the predictability of transgenesis and metabolic outcomes. As in the cases discussed above, CRISPR/Cas9 has been used to "knock out" or disable the target gene in competing pathways so as to direct metabolic flux toward the desired route. Our laboratory used this technology to construct a library of gene-related mutants associated with oil metabolism in camelina seeds and obtained various mutants with a single copy of the gene. This shows that CRISPR/Cas9 technology has multifaceted application value in C. sativa for overcoming the triplicated orthologous genes. Undoubtedly, advances in synthetic biology could make modular transgene assembly and its targeted insertion into the host genome more effective. Improvements in genome editing could allow accurate optimization of the host metabolism for the desired pathway. The combination of these two technologies could significantly increase the efficiency of plant lipid redesign and pathway assembly.

In order to obtain various high-valued lipid products, it is also necessary to redesign enzymes with improved activities or specific properties. Several approaches can be used to modify or generate the novel enzymes necessary for target lipid synthesis and accumulation, including directed protein mutation, codon optimization, and post-translational modification. Finally, the redesign of metabolic pathways for enriching novel and high-valued lipids should be combined with other breeding programs in camelina, such as increasing yield and seed oil content and improving tolerance to various stress conditions. In this way, new types of engineered camelina that stack the target FAs with other improved agronomic traits will be created, greatly promoting sustainable production of the designed oil and its commercialization.

### CONCLUSIONS

Camelina, an important oil crop in the world, is being developed as an ideal "platform crop" for the production of high-valued lipids/oils by metabolic engineering. This review described genetic improvements of seed oil quality in camelina, focusing particularly on the metabolic pathway assembly of novel fatty acids/oils with multiple nutritional and industrial applications. Several successful examples were provided here to show how lipid biosynthesis in camelina seeds can be redesigned to enable high accumulation of target oils that have beneficial functional groups or properties. These target products include acTAGs, hydroxylated FAs, medium-chain FAs, ω-3 LC-PUFAs, and ω-7 FAs. Technological strategies used in this regard include introducing a new pathway by synergistic overexpression of multiple genes, blocking competing pathways by RNAi-mediated suppression or genome editing, and optimizing the related metabolic pathways using mutant enzymes. New knowledge and technology, particularly synthetic biology and targeted genome editing, will be collectively employed in this field to achieve a more complete orchestration of fatty acid synthesis and metabolism, as well as sustainable yields of target lipids in camelina and other oil crop seeds, although some significant challenges remain to be addressed.

### AUTHOR CONTRIBUTIONS

LY was involved in the review writing. RL was involved in manuscript refinement. All authors read and approved the final manuscript.

### FUNDING

This work was financially supported by grants from the National Natural Science Foundation of China (Grant No. 31801400), Sci-Tech Key Research Project of Jinzhong City (Grant No. Y182009), "1331 project" for key innovation team of Jinzhong University (Grant No. jzxycxtd2019009), University Sci-Tech Innovation Project of Shanxi Province (Grant No. 2016171), Basic Research for Application Project of Shanxi Province (Grant No. 201601D202060), Coal-based Key Sci-Tech Project of Shanxi Province (Grant No. FT-2014-01), and the Key Project of The Key Research and Development Program of Shanxi Province, China (Grant No. 201603D312005).




Conflict of Interest: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2020 Yuan and Li. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Methylobacterium sp. 2A Is a Plant Growth-Promoting Rhizobacteria That Has the Potential to Improve Potato Crop Yield Under Adverse Conditions

### Edited by:

Briardo Llorente, Macquarie University, Australia

### Reviewed by:

Vasvi Chaudhry, University of Tübingen, Germany Nurettin Sahin, Mugla Sitki Kocman University, Turkey

### \*Correspondence:

Rita María Ulloa rulloa@dna.uba.ar; ulloa.rita@gmail.com

### † Present address:

Elisa Fantino, Laboratoire de Recherche Sur le Métabolisme Spécialisé Végétal, Département de Chimie, Biochimie et Physique, Université du Québec à Trois-Rivières, Trois-Rivières, QC, Canada

### Specialty section:

This article was submitted to Plant Biotechnology, a section of the journal Frontiers in Plant Science

Received: 15 October 2019 Accepted: 17 January 2020 Published: 14 February 2020

### Citation:

Grossi CEM, Fantino E, Serral F, Zawoznik MS, Fernandez Do Porto DA and Ulloa RM (2020) Methylobacterium sp. 2A Is a Plant Growth-Promoting Rhizobacteria That Has the Potential to Improve Potato Crop Yield Under Adverse Conditions. Front. Plant Sci. 11:71. doi: 10.3389/fpls.2020.00071

Cecilia Eugenia María Grossi<sup>1</sup>, Elisa Fantino<sup>1†</sup>, Federico Serral<sup>2</sup>, Myriam Sara Zawoznik<sup>3</sup>, Darío Augusto Fernandez Do Porto<sup>2</sup> and Rita María Ulloa<sup>1,4\*</sup>

<sup>1</sup> Laboratorio de Transducción de Señales en Plantas, Instituto de Investigaciones en Ingeniería Genética y Biología Molecular (INGEBI), Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Ciudad Autónoma de Buenos Aires, Argentina, <sup>2</sup> Plataforma de Bioinformática Argentina, Instituto de Cálculo, Ciudad Universitaria, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires (UBA), Ciudad Autónoma de Buenos Aires, Argentina, <sup>3</sup> Cátedra de Química Biológica Vegetal, Departamento de Química Biológica, Facultad de Farmacia y Bioquímica, Universidad de Buenos Aires (UBA), Ciudad Autónoma de Buenos Aires, Argentina, <sup>4</sup> Departamento de Química Biológica, Universidad de Buenos Aires (UBA), Ciudad Autónoma de Buenos Aires, Argentina

A Gram-negative pink-pigmented bacillus (named 2A) was isolated from Solanum tuberosum L. cv. Desirée plants that were strikingly more developed, presented increased root hair density, and had higher biomass than other potato lines of the same age. The 16S ribosomal DNA sequence, used for comparative gene sequence analysis, indicated that strain 2A belongs to the genus Methylobacterium. Nucleotide identity between the sequenced genome of Methylobacterium sp. 2A and those of the other species in the genus suggested that this species has not been described so far. In vitro, potato plants inoculated with Methylobacterium sp. 2A performed better when grown under 50 mM NaCl or when infected with Phytophthora infestans. We inoculated Methylobacterium sp. 2A in Arabidopsis thaliana roots and exposed these plants to salt stress (75 mM NaCl). Methylobacterium sp. 2A-inoculated plants, grown in control or salt stress conditions, displayed a higher density of lateral roots (p < 0.05) compared to noninoculated plants. Moreover, under salt stress, they presented a higher number of leaves and a larger rosette diameter. In dual confrontation assays, Methylobacterium sp. 2A displayed biocontrol activity against P. infestans, Botrytis cinerea, and Fusarium graminearum, but not against Rhizoctonia solani or Pythium dissotocum. In addition, we observed that Methylobacterium sp. 2A diminished the size of necrotic lesions and reduced chlorosis when greenhouse potato plants were infected with P. infestans. Methylobacterium sp. 2A produces indole acetic acid, solubilizes mineral phosphate, and is able to grow in an N2-free medium. Whole-genome sequencing revealed metabolic pathways associated with its plant growth-promoting capacity. Our results suggest that Methylobacterium sp. 2A is a plant growth-promoting rhizobacteria (PGPR) that can alleviate salt stress and restrict P. infestans infection in potato plants, emerging as a potential strategy to improve crop management.

Keywords: Methylobacterium, plant growth-promoting rhizobacteria (PGPR), salt stress, Phytophthora infestans, potato

### INTRODUCTION

In 30 years, the world population will be close to 10 billion people, which means there will be a further 2–3 billion people to feed. According to the FAO Statistical Database<sup>1</sup>, 50% of the habitable land is currently used for agriculture; most of it is used for rearing livestock and only 23% (11 million km<sup>2</sup>) is used for food crop production. These 11 million km<sup>2</sup> supply more calories and protein for the global population than the almost four times larger area devoted to livestock (Our World in Data<sup>2</sup>). Today, our challenge is to increase crop productivity at a faster rate than population growth. Most countries have managed to achieve this goal in recent decades; a combination of agricultural technologies, irrigation, improved crop varieties, fertilizers, and pesticides was used to obtain higher yields (Roser and Ritchie, 2017). However, it is possible that yield gains in the decades to follow will be offset by a growing population. In the near future, especially in the developing world, there will be an increasing prevalence of farming on marginal, arid, and semiarid lands (Coleman-Derr and Tringe, 2014). Therefore, crop yield has to be further improved and crop varieties should be adapted to hostile environments.

Potato is produced in over 100 countries and is the third most important food crop in the world after rice and wheat in terms of human consumption. One hectare (ha) of potato can yield two to four times the food quantity of grain crops; potatoes produce more food per unit of water than any other major crop and are up to seven times more efficient in using water than cereals (CIP International Potato Center<sup>3</sup>). From 1994 to 2014, productivity gains in potato increased 15 points above the population growth rate (Our World in Data<sup>2</sup>). In Argentina, 70 to 80 thousand ha are allocated to potato production; yield values are around 30 to 35 TN/ha, and total production is 2.1 to 2.5 million TN per year (Garzón and Young, 2016). However, it was estimated that actual potato yield is only 40% to 76% or 47% to 81% of the potential yield, depending on the method used to estimate potential yield, the year, and the region (Cantos de Ruiz et al., 1989; de la Casa et al., 2014).

Potential yields can be calculated assuming that there are no abiotic limitations to the growth and there are no biotic factors that reduce growth. The potato plant is well adapted to a number of environmental conditions but certain biotic and abiotic stresses cause significant reductions in growth and yield. Among abiotic stresses, salinity is one of the main constraints that limit plant productivity and cause loss of arable land (Isayenkov and Maathuis, 2019). In particular, potato plants are glycophytes moderately sensitive to salt stress. Although there are differences between cultivars, all cultivars showed a reduction in shoot length, reduced root system development, and reduced tuber yield due to salinity (Dahal et al., 2019). Among biotic stresses, late blight of potato caused by the hemibiotrophic oomycete Phytophthora infestans is still a devastating disease worldwide. Chemical management is a popular strategy to control late blight but it may have a negative impact on the environment (Lal et al., 2018).

Numerous techniques have been used to understand the mechanisms involved and to provide tools that enhance plant tolerance under environmental stresses. To ensure long-term food production, we must develop sustainable agricultural practices with minimal adverse impact on the environment. In this context, the use of microbial inoculants plays a key role. When introduced to seeds, roots, or the soil, plant growth-promoting rhizobacteria (PGPR) can solubilize insoluble phosphates, produce plant growth hormones, convert atmospheric nitrogen to ammonia, or suppress the growth of plant pathogens (Pérez-Montaño et al., 2014). Naturally occurring PGPR were shown to be effective in enhancing plant growth and development, and in promoting crop productivity and disease management under stress conditions (García et al., 2017; Singh and Jha, 2017; Cordero et al., 2018). This environmentally friendly approach could be among the most efficient methods for minimizing the use of chemicals.

In this work, we have identified and characterized Methylobacterium sp. 2A, a PGPR that promotes plant growth and is able to mitigate the harsh effect of salinity on in vitro potato and Arabidopsis plants. Moreover, it can reduce P. infestans infection on potato greenhouse plants. Genome sequencing allowed us to identify genes that could be involved in its plant growth-promoting (PGP) capacity. We confirmed that this rhizobacterium can produce indole acetic acid (IAA), is able to grow in N2-depleted media, and can solubilize inorganic phosphate.

### MATERIALS AND METHODS

### Plant and Pathogen Material

Solanum tuberosum L. cv. Desirée wild-type (WT) and Arabidopsis thaliana Columbia (Col-0) plants were used. Fresh internodes (the first three from the top) from 1-month-old virus-free potato plants were micropropagated in MS medium (Murashige and Skoog, 1962) with the addition of 2% (w/v) sucrose and 0.7% (w/v) agar. Surface-disinfected Arabidopsis seeds were vernalized for two days at 4°C and were then allowed to grow in MS medium (0.5X, 0.8% (w/v) agar). Plants were grown for the indicated times in a growth chamber under a 16 h light photoperiod at 21°C–23°C.

<sup>1</sup> http://fao.org

<sup>2</sup> http://ourworldindata.org

<sup>3</sup> http://cipotato.org

Certified pathogen-tested Solanum tuberosum L. var. Spunta (Diagnósticos Vegetales S.A., Mar del Plata, Argentina) tubers were planted in 1-L pots filled with MultiPro™ substrate (GrowMix®, Terrafertil S.A., Bs. As., Argentina). Soil-grown plants were cultivated in a greenhouse under a 16-h light photoperiod; natural light was supplemented by sodium lamps providing 100–300 µmol s<sup>−1</sup> m<sup>−2</sup>; the temperature was set at 25°C during the day and 20°C at night.

P. infestans isolate Pi-60 was kindly provided by Dr. Natalia Norero (Laboratorio de Agrobiotecnología, INTA EEA-Balcarce) and maintained on rye sucrose agar (RSA) medium at 19°C ± 1°C in the dark. Suspensions of Phytophthora zoospores were prepared as described in Fantino et al. (2017). The concentration was adjusted to 25 sporangia µl<sup>−1</sup> to be used as an inoculum. The fungi Fusarium graminearum, Botrytis cinerea, Pythium dissotocum, and Rhizoctonia solani were maintained in potato dextrose agar (PDA) medium at room temperature in the dark, or at 19°C ± 1°C in the case of P. dissotocum.

### Isolation and Identification of Methylobacterium sp. 2A

Bacterial isolation was conducted from roots of potato plants established in vitro that were previously grown in the greenhouse at Instituto de Investigaciones en Ingeniería Genética y Biología Molecular (INGEBI-CONICET, Argentina; 34° 33' 28.0" S 58° 27' 32.0" W). Fresh root samples were collected, surface sterilized, and placed in LB agar (Luria-Bertani) at 28°C for 4 days. Pink-pigmented colonies were selected and the isolate was maintained on LB medium without the addition of NaCl (LBNS). To determine the optimal growth conditions, different temperatures (18°C, 24°C, 28°C, 30°C, and 37°C) and culture media (LBNS, PDA, pea sucrose agar (PSA), and RSA) were tested. pH tolerance was determined in LB adjusted to different pH values (4.5 to 7.5). A catalase slide test was performed and oxidase activity was determined by the oxidation of discs impregnated with N,N-dimethyl-p-phenylenediamine oxalate (Britania S.A., Argentina). Antibiotic resistance to streptomycin, hygromycin, rifampicin, ampicillin, kanamycin, chloramphenicol, and cefotaxime was tested in liquid LBNS.

### 16S Ribosomal DNA Gene-Based Analyses

Genomic DNA (gDNA) was extracted manually according to Chen and Kuo (1993) from a 48-h culture. The 16S ribosomal DNA (16S rDNA) gene was amplified by Polymerase Chain Reaction (PCR) with universal primers fD1 and rP2 (Table S1) (Genbiotech, Buenos Aires, Argentina) (Weisburg et al., 1991) and sequenced at Macrogen Sequencing facility (Seoul, South Korea). Phylogenetic analysis was performed comparing Methylobacterium sp. 2A 16S rDNA against representative type species of the Methylobacterium genus, and against eleven species that have been recently reclassified into the new Methylorubrum genus (Green and Ardley, 2018). Multiple sequence alignment and phylogenetic tree reconstruction were performed with MEGA7 (Molecular Evolutionary Genetics Analysis 7, Kumar et al., 2016) software using the Maximum Likelihood method based on the Kimura two-parameter model with 1,000 bootstrap values. Members of the Methylocystaceae family were included as outgroups. Sequence similarity was also compared to those on the EzBioCloud3 database (Yoon et al., 2017).
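To make the substitution model named above more concrete, the Kimura two-parameter distance between two aligned sequences can be written as d = −½ ln[(1 − 2P − Q)·√(1 − 2Q)], where P and Q are the proportions of sites that differ by a transition and by a transversion, respectively. The Python sketch below only illustrates this pairwise distance; it is not the MEGA7 workflow and does not perform Maximum Likelihood tree reconstruction. The function name and the toy sequences are hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura two-parameter distance between two aligned DNA sequences.

    Sites containing gaps or ambiguous bases are skipped (an assumption of
    this sketch, similar to pairwise deletion).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # gap or ambiguity code: ignore this site
        compared += 1
        if a == b:
            continue
        # A<->G or C<->T is a transition; any other change is a transversion
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared    # proportion of transitional differences
    q = transversions / compared  # proportion of transversional differences
    # math.log raises ValueError if the sequences are too divergent
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Toy example: one transition (A -> G) in ten aligned sites
print(round(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"), 4))
```

For closely related 16S rDNA sequences, such as those with more than 97% similarity, the distance stays close to the raw proportion of differing sites; the correction matters more as divergence increases.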

### Plant Inoculation With Methylobacterium sp. 2A and Salt Treatments

In vitro potato plants were inoculated with Methylobacterium sp. 2A by root contact; internodes of potato plants showing pink root pigmentation were equidistantly placed in flasks containing solid MS media with noninoculated potato internodes. Bacterial colonization was evident to the naked eye two weeks later.

Internodes from potato plants were grown in solid MS media with the addition or not (controls) of increasing NaCl concentrations for 21 days; root and shoot length (cm), and total chlorophyll content (Hiscox and Israelstam, 1979) were assessed. Significant stress was evidenced at 50 mM, but it was still possible to determine chlorophyll content and root growth (Figure S1). Therefore, this concentration was chosen to perform further studies. Internodes from potato plants, inoculated or not with Methylobacterium sp. 2A, were grown for 21 days in solid MS media with the addition or not of 50 mM NaCl. The experiment was performed three times, using two flasks with four plants each, for each condition (control, C; inoculated, I; salt stress, S; inoculated plants with salt stress, I+S). Root and shoot length, the number of leaves and chlorophyll content were determined at the time of harvest in both control and salt treatments. When measuring root and shoot length, the original internode was not considered. Chlorophyll content was measured on fully expanded leaves as SPAD units using a portable chlorophyll spectrophotometer (Clorofilio®, Cavadevices, Argentina).

One-week-old Arabidopsis plants with similar growth were selected and established in Petri dishes containing MS 0.5X, or MS 0.5X with 75 mM NaCl (salt stress conditions; Ruiz Carrasco et al., 2007). After 48 h, plants were inoculated with 2 µl of a sterile saline solution (0.85% NaCl) or with a suspension of Methylobacterium sp. 2A cells (0.05 OD600nm units in 0.85% NaCl). Ten days later, the number of leaves and rosette diameter were determined, and lateral root density was calculated for each plant as the number of lateral roots divided by the primary root length. For this, Petri dishes were photographed; the primary root length was measured and lateral roots were counted using Fiji software (Schindelin et al., 2012). Samples were stored in liquid nitrogen for further studies. Protein extracts were obtained and catalase activity was determined as described in Aebi (1984). Hydrogen peroxide (H2O2) quantification was performed according to Junglee et al. (2014). Protein content was estimated by the Bradford method using bovine serum albumin as standard. The experiment was conducted three times using two plates with five plants each for each condition (control, C; Methylobacterium sp. 2A-inoculated, I; salt stress, S; Methylobacterium sp. 2A-inoculated plants with salt stress, I+S).
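The lateral root density described above is a simple ratio, and a minimal Python sketch of it is given here. The function name, the length unit (cm), and the example values are assumptions made for illustration only; they are not measurements from this study.

```python
def lateral_root_density(n_lateral_roots: int, primary_root_length_cm: float) -> float:
    """Number of lateral roots per unit of primary root length."""
    if primary_root_length_cm <= 0:
        raise ValueError("primary root length must be positive")
    return n_lateral_roots / primary_root_length_cm


# Hypothetical plant: 12 lateral roots on a 4.0 cm primary root
print(lateral_root_density(12, 4.0))  # lateral roots per cm
```

Normalizing by primary root length keeps the comparison between conditions from being driven only by differences in overall root elongation.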

### Plant Inoculation With Methylobacterium sp. 2A and Infection With P. infestans

In vitro four-week-old potato plants, inoculated or not with Methylobacterium sp. 2A, were infected with P. infestans isolate Pi-60; 10-µl droplets of zoospore suspension were pipetted on three leaves per plant. Two flasks containing five plants each were used for each condition (Pi-60 and I+Pi-60). Five days later, P. infestans aggressiveness was observed (Figure S2). The result obtained encouraged us to conduct experiments in greenhouse plants. To this end, three-week-old potato plants were sprayed with a Methylobacterium sp. 2A bacterial suspension (0.05 OD600nm units in 0.85% NaCl), or with saline solution (control). Infection with Pi-60 was performed two days later; 10-µl droplets of sporangia suspension or water were pipetted on the abaxial side of apical leaflets of three leaves per plant (two equidistant spots per leaf). Five days post-infection (dpi), leaves were observed and photographed with a phone camera. Necrotic lesions were counted and Fiji software was used to measure the lesion area. The experiment was conducted twice using three plants for each condition (control, C; Methylobacterium sp. 2A-inoculated, I; infected, Pi-60; Methylobacterium sp. 2A-inoculated and infected, I+Pi-60).

Leaf samples were collected and total RNA was extracted using TRIzol Reagent (Invitrogen) following the manufacturer's instructions. Total RNA (1 µg) was pretreated with DNase (RQ1 RNase-free DNase, Promega) and reverse transcribed with M-MLV Reverse Transcriptase (Promega) using an oligo-dT primer and random hexamers. Expression levels of the StPR-1b and StPAL genes were analyzed by RT-qPCR on an Applied Biosystems 7500 Real-Time PCR System, with the indicated primers (Table S1) (Genbiotech, Buenos Aires, Argentina) and FastStart Universal SYBR Green Master Rox (Roche). Elongation factor 1 alpha (EF-1α) was used as a reference gene. PCR reactions were incubated at 95°C for 10 min, followed by 40 cycles of 95°C for 10 s and 60°C for 1 min. PCR specificity was checked by melting curve analysis. Expression data were analyzed using the 2<sup>−ΔΔCt</sup> method (Livak and Schmittgen, 2001).
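The relative quantification method of Livak and Schmittgen (2001) cited above normalizes the Ct of the target gene to the reference gene in each sample (ΔCt), subtracts the ΔCt of the control sample from that of the treated sample (ΔΔCt), and reports the fold change as 2 raised to the power −ΔΔCt. The Python sketch below illustrates that arithmetic; the function name and the Ct values are hypothetical, and the calculation assumes amplification efficiencies close to 100% for both genes.

```python
def relative_expression(ct_target_treated: float, ct_ref_treated: float,
                        ct_target_control: float, ct_ref_control: float) -> float:
    """Fold change of a target gene (treated vs. control) by the 2^-ddCt method."""
    # Normalize the target gene to the reference gene within each sample
    delta_ct_treated = ct_target_treated - ct_ref_treated
    delta_ct_control = ct_target_control - ct_ref_control
    # Compare the treated sample with the control (calibrator) sample
    delta_delta_ct = delta_ct_treated - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: the target amplifies two cycles earlier in the
# treated sample while the reference gene is unchanged
fold_change = relative_expression(22.0, 18.0, 24.0, 18.0)
print(fold_change)  # 4.0
```

A value above 1 indicates higher expression in the treated sample than in the control, and a value below 1 indicates lower expression.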

### In Plate Confrontation Assay

Dual culture assays were performed in PDA or in RSA. Methylobacterium sp. 2A was streaked on one half of the plate and, after 3 days of incubation at 25°C, a 1 cm<sup>2</sup> plug from an actively growing culture of the oomycete P. infestans or of the fungi F. graminearum, B. cinerea, P. dissotocum, or R. solani was placed on the other half of the plate. Petri dishes with PDA or RSA containing only the corresponding plugs served as controls. The radial mycelial growth of the pathogens toward Methylobacterium sp. 2A (Ri) and that on a control plate (Rc) were measured, and mycelial growth inhibition was calculated according to the formula: (Rc − Ri)/Rc × 100 (Lahlali and Hijri, 2010).
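The inhibition percentage can be computed directly from the two radial growth measurements, Rc (control plate) and Ri (toward the bacterium), as defined above. The following Python sketch shows the calculation; the function name and the example radii are hypothetical and are not data from this study.

```python
def mycelial_growth_inhibition(rc: float, ri: float) -> float:
    """Percent inhibition of radial mycelial growth in a dual culture assay.

    rc: radial growth on the control plate
    ri: radial growth toward the bacterium (same unit as rc)
    """
    if rc <= 0:
        raise ValueError("control growth must be positive")
    return (rc - ri) / rc * 100.0


# Hypothetical radii in mm: 40 on the control plate, 10 toward the bacterium
print(mycelial_growth_inhibition(40.0, 10.0))  # 75.0
```

A value of 0 means the pathogen grew as far as on the control plate, whereas values approaching 100 indicate almost complete inhibition.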

### De Novo Genome Assembly and Annotation

Whole Genome Sequencing (WGS) of Methylobacterium sp. 2A was carried out on an Illumina TruSeq Nano platform at Macrogen Laboratories. De novo assembly was done using the standard procedures of our own prokaryotic assembly pipeline (Sosa et al., 2018), based on SPAdes version 3.9.0 (Bankevich et al., 2012) and SSPACE version 3.0 (Boetzer et al., 2011). Genome annotation was done using the Rapid Annotations using Subsystems Technology (RAST) server (Aziz et al., 2008).

### Whole-Genome Computational Analysis

The Average Nucleotide Identity (ANI) values between Methylobacterium sp. 2A and the other reference strains were calculated with the JSpeciesWS web service (Richter et al., 2015). In silico DNA-DNA hybridization (DDH) was conducted between Methylobacterium sp. 2A and the reference strains with the Genome-to-Genome Distance Calculator web service (GGDC 2.1; Meier-Kolthoff et al., 2013).

### Search for Metabolic Pathways Associated With PGP Traits

On the basis of the annotated genome, computational prediction of metabolic pathways was made through automatic reconstruction by Pathway Tools v23.0. The PathoLogic software of Pathway Tools (Karp et al., 2002) and the MetaCyc database (Caspi, 2005) were used to automatically generate a pathway-genome database (PGDB) from the GenBank file of the Methylobacterium sp. 2A annotated genome. The PGDB links the coding sequences and potential genes to enzymatic reactions and biochemical pathways. In addition, gene clusters were identified with SnapGene<sup>4</sup> software (GSL Biotech) and with antiSMASH 5.0 (Blin et al., 2019) and were manually curated.

### Assays for Detection of PGP Abilities

Dinitrogen fixation, phosphate solubilization, and IAA production were analyzed. The amount of IAA produced by Methylobacterium sp. 2A was estimated using Salkowski's method (Ehmann, 1977). The IAA-producing strains Azospirillum brasilense Az39 and Pantoea sp. were used as positive controls (Díaz Herrera et al., 2016). The different strains were grown in LB broth plus Trp (0.1 mg ml<sup>−1</sup>) and incubated at 28°C for 3, 4, or 5 days. After incubation, 2-ml aliquots were centrifuged and 1-ml supernatant samples were mixed with 1 ml of Salkowski's reagent (2% 0.5 FeCl3 in 35% H2SO4 solution) and kept in the dark. OD was recorded at 530 nm after 30 min. Nitrogen fixation was qualitatively determined by culturing single colonies in semisolid Nfb media, as described by Hartmann and Baldani (2006). An assay to evaluate phosphate solubilization was performed using tricalcium phosphate (TCP) in NBRIP solid and liquid media. Phosphate determination was made by the vanadomolybdate colorimetric method according to Pearson (1976). The strain Pseudomonas fluorescens BNM 233, currently used as a biofertilizer (Okon et al., 2015), was used as a positive control for phosphate solubilization.

### Statistical Analysis

Statistical analysis was performed by one-way or two-way ANOVA followed by Tukey's HSD test (p < 0.05) or by t-test, as indicated in the figures, using GraphPad Prism<sup>5</sup> version 5.03 (GraphPad Software, La Jolla, California, USA).
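For readers less familiar with the first step of this analysis, a one-way ANOVA compares the variability between group means with the variability within groups through the F statistic. The Python sketch below computes only that statistic and its degrees of freedom, using the standard library; it is an illustration and not the GraphPad Prism procedure used here. The function name and the data are hypothetical, and obtaining p values or Tukey's HSD comparisons would additionally require the F and studentized range distributions from a statistics package.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    groups: list of lists, one list of measurements per condition.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and replicated measurements")

    grand_mean = mean(x for g in groups for x in g)
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual values around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical measurements for three conditions
f_stat, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(round(f_stat, 2), df_b, df_w)  # 13.0 2 6
```

A large F value relative to the F distribution with the reported degrees of freedom indicates that at least one group mean differs, which is the situation in which a post hoc test such as Tukey's HSD is then applied.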

<sup>4</sup> http://snapgene.com

<sup>5</sup> http://graphpad.com

### RESULTS

### General Features of Methylobacterium sp. 2A

One-week-old potato plants micropropagated in vitro attracted our attention due to the pink pigmentation of their roots (Figure 1A). MS media was not contaminated but a strong association with a microorganism was evident. One month later, these plants were more developed than other potato lines of the same age (Figure 1B) and the density of root hairs was increased (Figure 1C). A Gram-negative bacillus, named Methylobacterium sp. 2A, was isolated from the roots and characterized. Colonies were pink-pigmented and circular, reaching a diameter of 0.2 mm after 3 days of incubation, denoting slow growth. The isolate grew better in solid media containing plant extracts (PDA, PSA, and RSA media), while growth on LB agar was lower. Optimal growth was observed at 28°C, and the optimal pH range for growth was between 5 and 7, with 6 being the optimal pH value (Table S2). Oxidase and catalase reactions were positive. Resistance to hygromycin (10 µg/µl), ampicillin, and chloramphenicol (20 µg/µl) was evidenced; however, Methylobacterium sp. 2A was sensitive to the other antibiotics tested.

### Phylogenetic Analysis of Methylobacterium sp. 2A

The nearly full-length 16S rDNA sequence (1,218 nt; GenBank accession number: MG818293.1) used for comparative gene sequence analysis indicated that strain 2A belongs to the genus Methylobacterium, family Methylobacteriaceae, order Rhizobiales. EzBioCloud comparison revealed that 16S rDNA from Methylobacterium sp. 2A shared high sequence similarity (>97%) with fourteen validated type species of the genus (Table S3). The phylogenetic tree (Figure 2) indicated that this isolate is most closely related to M. fujisawaense, M. phyllosphaerae, M. oryzae, M. radiotolerans, M. tardum, M. longum, and M. phyllostachyos (sequence similarity between 98.52% and 99.10%). Characteristics of Methylobacterium sp. 2A, such as colony pigmentation, cell size, growth conditions, and catalase and oxidase reactions, were compared (Table S2) with those of the most closely related species (Ito and Iizuka, 1971; Kato et al., 2008; Knief et al., 2012; Madhaiyan and Poonguzhali, 2014; Madhaiyan et al., 2007; Madhaiyan et al., 2009). Although isolated from different sources, all strains were positive for catalase and oxidase reactions, had a similar pigmentation, and shared several growth conditions.

### Inoculation of Methylobacterium sp. 2A Conferred Stress Tolerance Under Salt Conditions

Salinity impairs plant growth and development via water stress, cytotoxicity due to excessive uptake of ions (Na<sup>+</sup> and Cl<sup>−</sup>), and nutritional imbalance; it is accompanied by oxidative stress due to the generation of reactive oxygen species (ROS) (reviewed in Isayenkov and Maathuis, 2019). In vitro potato plants were grown in MS media with or without 50 mM NaCl for 21 days. Strong inhibition of shoot and root growth, and total biomass, together with a significant decrease in chlorophyll content (SPAD units) and leaf number (Figures 3A, B) was observed compared to control conditions (C vs. S). On the other hand, when the Methylobacterium sp. 2A-inoculated potato plants were grown in control (I) or salt stress conditions (I+S), the inhibition observed in shoot and root length and the decrease in the number of leaves and total biomass was less severe (I vs. I+S). Moreover, chlorophyll content was not reduced. When comparing noninoculated versus Methylobacterium sp. 2A-inoculated plants under salt conditions (S vs. I+S), a significant difference was observed in most parameters, indicating that this isolate is able to mitigate the negative effect of salinity.

2A, were grown in solid MS media (C or I) or in MS media with the addition of 50 mM NaCl (S or I+S). (A) Shoot and root length (cm), SPAD units, the number of leaves, and total biomass (g) were determined. Mean ± SEM of three biological replicates, with three technical replicates each, were plotted. Two-way ANOVA analysis was performed and Tukey's HSD test was applied. Different letters above the bars indicate significant differences (p < 0.05). (B) Representative pictures of each condition are shown.

Arabidopsis plantlets inoculated or not with Methylobacterium sp. 2A were grown under control or salt stress conditions. After one week, a significant increase in lateral root density (p < 0.05) and in the number of leaves (p < 0.01) was observed in Methylobacterium sp. 2A-inoculated plants, both under control and stress conditions, compared to noninoculated ones (C vs. I, and S vs. I+S; Figures 4A, B). The reduction in rosette diameter observed in control plants under salt stress (C vs. S; Figure 4A) was not observed when Methylobacterium sp. 2A was present (I vs. I+S). As depicted in Figure 4A, a sixfold increase in catalase activity was observed upon salt stress in noninoculated plants (C vs. S), but not in Methylobacterium sp. 2A-inoculated ones (C vs. I+S). However, upon inoculation, catalase activity increased threefold under control conditions (C vs. I). The increase observed in catalase activity in stressed plants (S) was not sufficient to reduce H2O2 content; in fact, these plants presented fivefold more peroxide than control ones (p = 0.01). However, no significant difference in peroxide content was observed when comparing control plants with Methylobacterium sp. 2A-inoculated plants under control (p = 0.656) or salt stress conditions (p = 0.651). Furthermore, a similar peroxide content was observed in Methylobacterium sp. 2A-inoculated plants grown under control or saline conditions (p = 0.796). Our results indicate that this isolate can exert a salt-protective effect on different plant species.

grown in solid MS media (C or I) or in MS media with the addition of 75 mM NaCl (S or I+S). (A) Lateral root density, the number of leaves, rosette diameter (mm), and catalase activity were determined. Catalase activity is expressed as pmoles H2O2 destroyed min<sup>−1</sup> mg<sup>−1</sup>. Mean ± SEM of three biological replicates, with three technical replicates each, were plotted. Two-way ANOVA analysis was performed and Tukey's HSD test was applied. Different letters above the bars indicate significant differences (p < 0.05). (B) Representative pictures of each condition are shown.

### Methylobacterium sp. 2A Displays Biocontrol Activity Against P. infestans

PGPR can promote plant growth indirectly by preventing the deleterious effects of plant pathogens. Our results with in vitro plants suggested that Methylobacterium sp. 2A protected potato plants against P. infestans (Figure S2). To evaluate the biocontrol effect of this strain, an in-plate confrontation assay against P. infestans, B. cinerea, Fusarium sp., R. solani, and P. dissotocum was performed. As shown in Figures 5A, B and S3, Methylobacterium sp. 2A inhibited mycelial growth of P. infestans (24.8%), B. cinerea (42.1%), and Fusarium sp. (34.7%). In contrast, no notable effect was observed against R. solani (7.2%) and P. dissotocum (1.2%).
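The inhibition percentages above are derived by comparing the radial size of the pathogen colony in dual culture with that in control plates. A minimal sketch of that calculation follows, assuming the commonly used formula (control radius minus dual-culture radius, divided by control radius); the function name and the example radii are hypothetical and are not measurements from this study.

```python
def growth_inhibition_percent(control_radius_mm, dual_radius_mm):
    """Percent mycelial growth inhibition relative to the control plate.

    Assumes inhibition = (R_control - R_dual) / R_control * 100.
    """
    if control_radius_mm <= 0:
        raise ValueError("control radius must be positive")
    return (control_radius_mm - dual_radius_mm) / control_radius_mm * 100.0


# Hypothetical radii in millimetres (illustrative only).
inhibition = growth_inhibition_percent(control_radius_mm=40.0, dual_radius_mm=30.0)
print(f"Mycelial growth inhibition: {inhibition:.1f}%")
```

A value near zero, as reported for R. solani and P. dissotocum, means the colony in dual culture grew almost as far as in the control plate.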

To further analyze the antagonistic effect of Methylobacterium sp. 2A, P. infestans infection assays were conducted on greenhouse plants that were previously sprayed with Methylobacterium sp. 2A (I) or with water (controls, C). At 5 dpi we evaluated the presence and area of necrotic lesions (Figures 5C, D). No lesions were observed in leaves sprayed with Methylobacterium sp. 2A (I), while lesions were evident in P. infestans-treated ones (Pi-60 and I+Pi-60). However, lesion size was significantly larger in the absence of Methylobacterium sp. 2A, and 87.5% of infected leaves presented chlorosis. In contrast, in Methylobacterium sp. 2A-sprayed plants, only 28.5% of the leaves were chlorotic. This result suggests that Methylobacterium sp. 2A is able to restrict P. infestans growth.

Induced systemic resistance (ISR) is a physiological "state of enhanced defensive capacity" elicited by PGPR where the plant's innate defenses are potentiated against subsequent biotic challenges (reviewed in Pieterse et al., 2014). We previously reported that the pathogenesis-related protein 1b (StPR-1b) and phenylalanine ammonia-lyase (StPAL) genes were induced in infected and in distal potato leaves upon P. infestans infection (Fantino et al., 2017). Therefore, we decided to analyze their expression in leaves of plants sprayed with Methylobacterium sp. 2A that were then infected or not with Pi-60 and compare them with leaves infected with Pi-60 that were not previously inoculated (Figures 5E, F). At the time point analyzed, no induction of StPR-1b or StPAL expression was observed in inoculated leaves (I). StPR-1b expression was strongly upregulated (ca. 35-fold, p < 0.001) in the presence of the oomycete (Pi-60); however, no upregulation was observed when P. infestans zoospores were spotted on leaves previously sprayed with Methylobacterium sp. 2A (I+Pi-60). On the other hand, StPAL was induced tenfold in infected leaves (Pi-60), and fourfold in I+Pi-60 leaves. The reduction observed in the expression of StPR-1b upon P. infestans infection in Methylobacterium sp. 2A-inoculated plants suggests that the defense mechanisms triggered by this isolate do not involve this gene.
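Fold changes such as those reported above are typically obtained from RT-qPCR threshold cycle (Ct) values normalized to a reference gene (EF-1a in this work). The quantification formula is not restated in this section, so the sketch below assumes the widely used 2<sup>−ΔΔCt</sup> method; the function name and the Ct values are hypothetical and are not data from this study.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the 2^(-ddCt) method (assumed here)."""
    # Normalize the target gene to the reference gene within each sample.
    delta_ct_treated = ct_target_treated - ct_ref_treated
    delta_ct_control = ct_target_control - ct_ref_control

    # Compare the treated sample against the control sample.
    delta_delta_ct = delta_ct_treated - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values (illustrative only).
fold = relative_expression(
    ct_target_treated=20.0, ct_ref_treated=18.0,
    ct_target_control=25.0, ct_ref_control=18.0,
)
print(f"Relative expression: {fold:.1f}-fold")
```

Under this assumption, a lower normalized Ct in the treated sample translates into a fold change greater than 1, and equal normalized Ct values give a fold change of exactly 1.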

### General Genome Features of Methylobacterium sp. 2A

FIGURE 5 | Methylobacterium sp. 2A is effective in controlling P. infestans. (A) Biocontrol activity of Methylobacterium sp. 2A against different phytopathogens in a dual confrontation assay. The mycelial growth inhibition (%) was calculated comparing the radial size of the pathogen colony in the dual culture and in control plates. (B) Illustrative image showing Methylobacterium sp. 2A antagonistic effect against P. infestans; left plate: pathogen; right plate: dual culture (pathogen + Methylobacterium sp. 2A). Images illustrating the antagonistic effect of Methylobacterium sp. 2A against the other pathogens are shown in Figure S3. (C) Greenhouse potato plants were inoculated or not with Methylobacterium sp. 2A 48 h prior to infection with P. infestans. Illustrative images show control, inoculated (I), infected (Pi-60) and inoculated and infected (I+Pi-60) apical leaflets. (D) Histogram depicting the percentage of necrotic area (%) in leaves infected with P. infestans that were previously inoculated (I+Pi-60) or not (Pi-60) with Methylobacterium sp. 2A. RT-qPCR analysis of StPR-1b (E) and StPAL (F) in inoculated leaves (I), in leaves infected with P. infestans (Pi-60) and in leaves inoculated with Methylobacterium sp. 2A and infected with P. infestans (I+Pi-60). EF-1a was used as a reference gene. Mean ± SEM of three biological replicates, with three technical replicates each, were plotted. Two-way ANOVA analysis was performed and Tukey's HSD test was applied. Different letters above the bars indicate significant differences in transcript levels between treatments (p < 0.001).

Our data indicated that this isolate has the potential to be a PGPR, so we decided to sequence its genome in order to identify the genes that could be responsible for this effect. The Methylobacterium sp. 2A genomic sequence is 6,395,352 bp in length, and its G+C content is 69.34 mol%. The ANI values between Methylobacterium sp. 2A and M. phyllosphaerae CBMB27 (GenBank accession no. NZ\_CP015367.1), M. oryzae CBMB20 (GenBank accession no. NZ\_CP003811.1), M. radiotolerans JCM 2831 (GenBank accession no. NC\_010505.1), and M. phyllostachyos BL47 (GenBank accession no. NZ\_FNHS00000000.1) were 84.53%, 84.37%, 85.21%, and 87.79%, respectively. These values are below the established threshold of 95%–96% ANI for the prokaryotic species boundary (Chun et al., 2018). Furthermore, in silico DDH values between Methylobacterium sp. 2A and M. phyllosphaerae, M. oryzae, M. radiotolerans, and M. phyllostachyos were 30.50, 30.50, 31.00, and 37.00, respectively. These genomic results suggest that this species has not been described so far.
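The species-boundary reasoning above amounts to checking each pairwise ANI value against the 95%–96% threshold. The short sketch below makes that check explicit using the ANI values reported in the text; the variable and function names are hypothetical, and the lower bound of the threshold range (95%) is used as the cutoff, which is the most permissive choice.

```python
# Lower bound of the 95%-96% ANI species boundary cited in the text.
ANI_SPECIES_THRESHOLD = 95.0

# ANI (%) between Methylobacterium sp. 2A and each reference genome,
# as reported in the text.
ani_to_2a = {
    "M. phyllosphaerae CBMB27": 84.53,
    "M. oryzae CBMB20": 84.37,
    "M. radiotolerans JCM 2831": 85.21,
    "M. phyllostachyos BL47": 87.79,
}


def same_species_candidates(ani_values, threshold=ANI_SPECIES_THRESHOLD):
    """Return the reference genomes whose ANI meets the species threshold."""
    return [name for name, ani in ani_values.items() if ani >= threshold]


closest = max(ani_to_2a, key=ani_to_2a.get)
candidates = same_species_candidates(ani_to_2a)

print(f"Closest genome by ANI: {closest} ({ani_to_2a[closest]:.2f}%)")
print(f"Genomes at or above the threshold: {candidates}")
```

Because no reference genome reaches the cutoff, the list of same-species candidates is empty, which is the basis for the suggestion that the isolate represents an undescribed species.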

A total of 6,142 coding DNA sequences (CDSs) and 55 structural RNAs (50 tRNAs) were predicted; 2,022 CDSs (33%) were classified as hypothetical and 2,419 CDSs (40%) were assigned to RAST subsystems (Figure S4). Using the GenBank file of the annotated genome as input, the PathoLogic software automatically assigned 306 pathways and 2,116 enzymatic reactions.

### Plant Growth Promotion Traits in Methylobacterium sp. 2A Genome

Genes involved in chemotaxis and motility were found in the Methylobacterium sp. 2A genome; there are 52 genes encoding methyl-accepting chemotaxis proteins (MCP), a cheV, the cheAWRB gene cluster, and cheY. The chemotaxis response regulator cheA gene is essential for motility toward root exudates (De Weert et al., 2002). Moreover, the genome has 31 flagellar-related genes grouped in at least five clusters (Table S4). Like other methylotrophic bacteria, Methylobacterium sp. 2A carries the mxa cluster (12,475 bp) responsible for methanol oxidation.

In addition, several genes were found in the Methylobacterium sp. 2A genome attributable to the PGP traits observed in potato and Arabidopsis plants (Figure 6 and Table S4). We identified the five enzymes involved in L-tryptophan biosynthesis, which are encoded by seven genes (trpA-G) in this isolate, as in all microbial genomes. Though the chemical reaction steps are highly conserved, the genes of the pathway enzymes show considerable variations in arrangements, operon structure, and regulation in diverse microbial genomes (Priya et al., 2014). In Methylobacterium sp. 2A, the trpEGDC genes are clustered in an operon while trpA, trpB, and trpF are unconnected. L-tryptophan is the auxin (IAA) precursor.

Two proposed IAA biosynthesis pathways were identified: the indole-3-acetamide (IAM) and the indole-3-acetonitrile (IAN) pathways. The IAM pathway consists of two steps, in which L-tryptophan is first converted to IAM by the enzyme tryptophan-2-monooxygenase (IaaM), and then IAM is converted to IAA by an aliphatic amidase (AmiE). In the IAN pathway, IAN can first be converted to IAM by nitrile hydratase, and then IAM is converted to IAA by an aliphatic amidase. Also, a nitrilase enzyme has been suggested to convert IAN to IAA directly without the IAM intermediate step. The production of IAA was confirmed using the Salkowski reagent. In fact, Methylobacterium sp. 2A produced higher levels of IAA than the A. brasilense Az39 and Pantoea sp. strains (Figure 7A). This tightly correlates with the increase observed in root hair development in potato and in lateral root density in Arabidopsis. Like many PGPR associated with plant roots, Methylobacterium sp. 2A contains the gene encoding 1-aminocyclopropane-1-carboxylate (ACC) deaminase, which can metabolize ACC, the immediate precursor of ethylene, thereby decreasing plant ethylene production and increasing plant growth.

Cobalamin (Vitamin B12) has been suggested to stimulate plant development and could be synthesized either via de novo or salvage pathways. For de novo biosynthesis more than thirty enzymatic steps are required, while the salvage pathway is a cost-effective way for bacteria (Palacios et al., 2014). In the Methylobacterium sp. 2A genome, all the genes involved in the aerobic de novo biosynthesis were identified (Table S4); some of them are grouped in clusters of genes: cobP/UA/OQ, cobST, cobSC, and cobWNGHIJKLMB (Figure 6). Methylobacterium sp. 2A harbors two copies of cobS. In gram-negative bacteria, exogenous corrinoids are transported into the cell through an ATP-binding cassette (ABC) transport system consisting of Btu genes. BtuB, BtuC, and BtuN genes coding for receptors were identified, suggesting that this strain can also synthesize cobalamin by absorbing corrinoids via the salvage pathway (Table S4).

FIGURE 6 | The genome of Methylobacterium sp. 2A contains genes that may contribute to plant growth stimulation and fertilization. Metabolic pathways related to L-tryptophan and indole acetic acid (IAA) biosynthesis, and N2 metabolism and ammonia assimilation pathways are shown. The cob gene clusters involved in vitamin B12 biosynthesis and the pyrroloquinoline quinone (PQQ) operons are indicated. PQQ is a cofactor of GDH, involved in phosphate solubilization. 1-Aminocyclopropane-1-carboxylate (ACC) deamination and organic phosphate acquisition reactions are included. Gene annotation, EC number, and KEGG gene names of the enzymes mentioned in the diagram are listed in Table S4.

On the other hand, Methylobacterium sp. 2A presents the two-component regulatory genes involved in nitrogen fixation, NtrBC, and NtrXY (Table S4). This system allows the survival of the bacteria in a nitrogen-depleted medium. In fact, we confirmed that Methylobacterium sp. 2A is able to grow in NFb medium (Figure 7B). A two-component nitrogen fixation transcriptional regulator, FixJ and a FixL gene were identified in its genome, together with nifJ, nifP, nifS, and nifU genes (Table S4); however, other nif genes were not assigned by the RAST annotation. In addition, it harbors the enzymes involved in nitrogen metabolism and ammonia assimilation (Figure 6, Table S4).

Moreover, there are four glucose dehydrogenase (gdh) genes present in Methylobacterium sp. 2A genome (Table S4). GDH requires pyrroloquinoline-quinone (PQQ) as a redox cofactor for direct oxidation of glucose to gluconic acid, which diffuses and helps in acidic solubilization of mineral phosphates in soil. Synthesis of PQQ requires the expression of six genes, pqqA-G (Puehringer et al., 2008). Methylobacterium sp. 2A harbors the complete pqqABC/DE operon where pqqC and pqqD are fused into a single polypeptide designated pqqC/D (Figure 6, Table S4) as in Methylorubrum extorquens AM1 (Toyama et al., 1997). In M. extorquens the cluster containing pqqF/G is a group of six genes transcribed in the same direction, the first is predicted to encode an isoleucyl tRNA synthetase and the sixth a dioxygenase (Zhang and Lidstrom, 2003). In Methylobacterium sp. 2A we identified this same region (Figure 6). The DNA sequence encoding the last five genes shared 84.65% identity with the M. extorquens cluster. In addition, there are three kinds of microbial enzymes that can solubilize organic phosphate: a nonspecific acid phosphatase, a phytase, and C-P lyase or phosphonatase. Methylobacterium sp. 2A genome has three genes encoding acid phosphatases that can release phosphate from phosphoric ester or phosphoric anhydride (Figure 6).

Phosphate solubilization is an important trait to increase the availability and uptake of mineral nutrients for plants. From the genomic data, we can infer that Methylobacterium sp. 2A is able to solubilize phosphate, so we grew it in NBRIP solid and liquid media containing TCP. In the solid media, a yellow halo indicative of phosphate solubilization was observed around the colony of the positive control, P. fluorescens BNM 233, but not around Methylobacterium sp. 2A. On the other hand, when grown in liquid media, Methylobacterium sp. 2A was able to solubilize mineral phosphate but to a lower extent than the positive control (Figure 7C).

In addition, antiSMASH analysis identified 10 genes associated with terpene biosynthesis (four phytoene synthases, four phytoene desaturases, one phytoene cyclase, and one phytoene dehydrogenase) in three genomic regions. Recent reports highlight the ecological importance of microbial terpenes emitted by beneficial microorganisms that function as volatile organic compounds (mVOCs) altering plant development (reviewed in Schulz-Bohm et al., 2017).

The dual confrontation assays and the in planta infection assays suggest that this isolate may produce compounds to reduce the growth of phytopathogens. In fact, its genome carries a chitinase, a phenazine biosynthesis PhzF gene, two genes involved in 4-hydroxybenzoate synthesis (the transferase, ubiA, and a transporter), a penicillin acylase (PAH), and a CvpA gene encoding a Colicin V production protein. Several copies of the genes responsible for translocating colicins across the membrane (TolA, TolB, TolC, TonB) were also present (Table S4). In addition, genes responsible for aerobactin and enterobactin biosynthesis are present in its genome (Table S4). The presence of a siderophore biosynthetic gene cluster (IucABCD) and a siderophore transporter protein was confirmed with antiSMASH analysis. Concerning antibiotic resistance, Methylobacterium sp. 2A has two chloramphenicol acetyltransferases consistent with the observed chloramphenicol resistance, and a β-lactamase gene that could be responsible for the ampicillin resistance.

Furthermore, its genome encodes ROS scavenging enzymes that could alleviate oxidative damage: eleven catalases, two glutathione reductases, two superoxide dismutases, and five peroxidases, including two glutathione peroxidases, a cytochrome c peroxidase, and a thiol peroxidase (Table S4).

### DISCUSSION

PGPR stimulate plant growth by either providing plant hormones (auxin or cytokinin), lowering plant ethylene levels through the action of the enzyme 1-aminocyclopropane-1-carboxylate (ACC) deaminase, helping in the acquisition of nutritional resources (nitrogen, iron, and phosphorus), or antagonizing phytopathogens (Glick, 1995; Glick et al., 1999). In vitro potato and Arabidopsis plants inoculated with Methylobacterium sp. 2A were more developed than noninoculated plants of the same age, suggesting that this isolate could stimulate plant growth. Genome sequencing indicates that Methylobacterium sp. 2A presents several traits related to phyto-stimulation and phyto-fertilization, such as L-tryptophan and IAA biosynthesis, phosphate acquisition and solubilization, siderophore and vitamin B12 biosynthesis, N2 fixation and ammonia assimilation. In fact, the increase observed in root hair development and in lateral root density are auxin-associated phenotypes (Overvoorde et al., 2010) that are in agreement with the IAA produced by this isolate.

The experiments presented in this work were conducted using in vitro plants; however, preliminary experiments were performed with tubers inoculated with Methylobacterium sp. 2A planted in the greenhouse. These plants were more developed than noninoculated ones (data not shown). Apart from modifying IAA levels, this isolate could stimulate plant growth either by fixing N2 or by improving phosphate acquisition. We recognized several genes involved in fixing N2 in Methylobacterium sp. 2A; however, we could not identify the NifH gene of nitrogenase. Multiple independent horizontal gene transfer events took place in the evolution of nitrogen-fixing bacteria. Leaf isolates from the Methylobacterium genus present sequences highly divergent from the NifH consensus sequence, more related to the Pfam NifH/frxC-family protein, such as BchX (Madhaiyan et al., 2015). Methylobacterium sp. 2A harbors a chlorophyllide reductase iron protein subunit X (BchX); this fact, together with its ability to grow in NFb medium, suggests that it is a diazotrophic bacterium. Nevertheless, the nitrogenase activity of Methylobacterium sp. 2A should be assessed by the acetylene reduction assay to confirm this.

Potato is a high fertilizer-demanding crop, which requires 250 kg ha<sup>−1</sup> of nitrogen and 150 kg ha<sup>−1</sup> of phosphorus to get an optimum yield (Khan et al., 2012). Phosphorus availability is often limited due to the formation of insoluble inorganic and organic phosphate complexes (Adesemoye and Kloepper, 2009). The genomic data indicate that Methylobacterium sp. 2A is able to mobilize organic sources of phosphorus via acid phosphatases, and solubilize mineral phosphate, thus contributing to plant phosphate acquisition. The solubilization assay confirmed that this isolate is able to solubilize inorganic phosphate. Therefore, Methylobacterium sp. 2A emerges as a potential biofertilizer for potato plants to improve phosphorus and nitrogen uptake as reported for other PGPR (Hanif et al., 2015; Naqqash et al., 2016).

Furthermore, our results indicate that Methylobacterium sp. 2A mitigates the deleterious effects of salt stress in potato and Arabidopsis plants. Under stress conditions, ethylene regulates plant homeostasis and reduces root and shoot growth. However, under saline conditions, Methylobacterium sp. 2A-inoculated plants presented higher biomass, increased root and shoot growth, and more chlorophyll content than noninoculated plants. The modification in root architecture promoted by auxin possibly increases the uptake of water and nutrients, thus explaining the improved fitness of the plants. In addition, like many other PGPR, this isolate encodes an ACC deaminase; it was reported that degradation of ACC by bacterial ACC deaminase relieves plant stress and rescues normal plant growth (Glick et al., 2007). In particular, the ACC deaminase-producing bacterium Achromobacter piechaudii ARV8 lowered the level of ethylene and prevented inhibition of plant growth when inoculated in tomato plants grown in the presence of high salt (Mayak et al., 2004a) and drought stress (Mayak et al., 2004b). Therefore, Methylobacterium sp. 2A could alleviate salt stress by increasing auxins and decreasing ethylene content in the rhizosphere.

Another important feature observed was that when grown under salt stress, Methylobacterium sp. 2A-inoculated Arabidopsis plants presented lower levels of peroxide than noninoculated ones. The isolate's genome encodes several ROS scavenging enzymes, and the isolate was positive for the catalase reaction. It was reported that ROS degradation could be the result of enhancing plant antioxidant activities by PGPR, thus protecting plants from salt toxicity (Chu et al., 2019). Catalase activity increased upon inoculation with Methylobacterium sp. 2A in control conditions, but its activity was lower than in noninoculated plants when exposed to salt stress. A meta-analysis performed by Pan et al. (2019) suggests that PGPR help host plants to alleviate oxidative stress mainly through reducing the generation of ROS formed at the onset of ionic stress, not via scavenging ROS by accumulating antioxidant enzymes in host plants. This might be the case for Methylobacterium sp. 2A.

To successfully thrive within bacterial populations, Methylobacterium sp. 2A has to be highly competitive. Its genome harbors bacterial chemotaxis and motility genes, and several genes involved in the production of antimicrobial compounds (listed in Table S4). The former could be responsible for root association, and the latter could function as antibiotics conferring an advantage against other microorganisms while protecting the plant against phytopathogens. In this regard, chitinase has been associated with protection against plant fungal pathogens (Kumar et al., 2018); colicins are the most representative bacteriocins produced by gram-negative bacteria (Beneduzi et al., 2012), while chemicals such as phenazine and 4-hydroxybenzoate act as antibiotics and suppress plant pathogenic microbes (Gupta et al., 2015). As an example, phenazine is known to suppress the plant pathogen F. oxysporum (Chin-A-Woeng et al., 2003). It was reported that PAH participates in acyl-homoserine lactone degradation, thus interfering with quorum sensing of competing bacteria (Mukherji et al., 2014). Genes related to aerobactin and enterobactin biosynthesis were also identified in the Methylobacterium sp. 2A genome. Siderophore-producing bacteria are of significant importance in the field of agriculture; in addition to supplementing iron to the plant, siderophores prevent the growth of soil-borne phytopathogens (Fernández Scavino and Pedraza, 2013).

The biocontrol capacity of Methylobacterium sp. 2A was evidenced in the dual confrontation assays against B. cinerea, F. graminearum and P. infestans, and in greenhouse plants infected with P. infestans. Lesion size was smaller and chlorosis was seldom observed in Methylobacterium sp. 2A-inoculated plants infected with Pi-60 compared to noninoculated plants. In addition to producing antimicrobial compounds, PGPR are capable of potentiating plant defense mechanisms against pathogens. ISR has been reported in potato after inoculation with a Rhizobium strain (Reitz et al., 2001). This response differs from systemic acquired resistance (SAR) and is not salicylic acid (SA) dependent (Pieterse et al., 2014). In fact, at the time points analyzed we observed that StPR-1b was not induced in Methylobacterium sp. 2A-inoculated leaves after infection with Pi-60. In contrast, a very strong induction of StPR-1b was observed when noninoculated plants were infected with P. infestans. This induction, however, did not restrain the oomycete growth. On the other hand, induction of StPAL was observed in Methylobacterium sp. 2A-inoculated plants upon infection with P. infestans. While its antagonizing effect is clear, at the moment, we cannot infer which plant defense mechanisms are primed by Methylobacterium sp. 2A.

In this work, we present a new PGPR isolate capable of promoting plant growth under control and salt stress conditions in two dicots. This beneficial effect is probably not limited to potato and Arabidopsis plants. In addition, in dual confrontation assays, Methylobacterium sp. 2A restricted the growth of two necrotrophic fungi and of the hemibiotrophic oomycete P. infestans, denoting broad-spectrum antagonism. In planta assays confirmed that it was able to reduce the deleterious effect of P. infestans in potato plants. These promising results allow us to envisage that Methylobacterium sp. 2A has the potential to be used as a substitute for chemical fertilizers and fungicides, preventing pollution in farmlands. This research is a first step toward understanding the PGP capacities of Methylobacterium sp. 2A; the isolate still has to be tested in the field in order to assess the productivity, efficacy, and viability of this inoculum. Certainly, this eco-friendly microbial technology will contribute to sustainable agricultural practices and is an alternative strategy to improve crop production for an increasing world population.

### DATA AVAILABILITY STATEMENT

This whole-genome shotgun project has been deposited at DDBJ/EMBL/GenBank under the BioProject PRJNA560067. The version described in this paper is version VUOK00000000.1.

### AUTHOR CONTRIBUTIONS

RU conceived and designed the research. CG, EF, and MZ performed and designed the experiments. CG, FS, and DF performed computational analyses. RU and CG analyzed the genomic data and wrote the manuscript. All authors have read and approved the final manuscript.

### FUNDING

RU and DF are members of Carrera de Investigador Científico from Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET, Argentina); RU is Associate Professor at Universidad de Buenos Aires (UBA). MZ is Professor and Researcher at UBA. CG and FS are recipients of a doctoral scholarship from CONICET and EF is a postdoctoral fellow from CONICET. This work was funded by CONICET (PIP 0455), Universidad de Buenos Aires (UBACYT), and Agencia Nacional de Promoción Científica y Tecnológica (PICT-2014 3018).

### ACKNOWLEDGMENTS

The strains used in this work were kindly provided by Instituto de Microbiología y Zoología Agrícola (IMyZA, CICVyA, INTA Castelar; A. brasilense Az39 and P. fluorescens BNM 233), Dr. Fernando Pieckenstain (IIB-INTECH, UNSAM-CONICET; B. cinerea T4), M. Sc. Pablo Grijalba (Departamento de Producción Vegetal, Facultad de Agronomía, UBA; P. dissotocum), and Dr. Natalia Almasia (Instituto de Biotecnología, INTA Castelar; R. solani). Also, we want to acknowledge Dr. Diego Wengier (INGEBI-CONICET) for the Arabidopsis Col-0 seedlings and kind advice.

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2020.00071/full#supplementary-material

## REFERENCES


identification of indole derivatives. J. Chromatogr. 132, 267–276. doi: 10.1016/S0021-9673(00)89300-0


Methylobacterium longum sp. nov. Antonie van Leeuwenhoek. Int. J. Gen. Mol. Microbiol. 101, 169–183. doi: 10.1007/s10482-011-9650-6


Conflict of Interest: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2020 Grossi, Fantino, Serral, Zawoznik, Fernandez Do Porto and Ulloa. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Transgene Biocontainment Strategies for Molecular Farming

### Michael Clark<sup>1</sup> and Maciej Maselko<sup>1,2,3</sup>\*

<sup>1</sup> Applied Biosciences, Macquarie University, North Ryde, NSW, Australia, <sup>2</sup> CSIRO Health and Biosecurity, Canberra, ACT, Australia, <sup>3</sup> CSIRO Synthetic Biology Future Science Platform, Brisbane, QLD, Australia

Advances in plant synthetic biology promise to introduce novel agricultural products in the near future. 'Molecular farms' will include crops engineered to produce medications, vaccines, biofuels, industrial enzymes, and other high value compounds. These crops have the potential to reduce costs while dramatically increasing scales of synthesis and provide new economic opportunities to farmers. Current transgenic crops may be considered safe given their long-standing use; however, some applications of molecular farming may pose risks to human health and the environment. Unwanted gene flow from engineered crops could potentially contaminate the food supply and affect wildlife. There is also potential for unwanted gene flow into engineered crops, which may alter their ability to produce compounds of interest. Here, we briefly discuss the applications of molecular farming and explore the various genetic and physical methods that can be used for transgene biocontainment. As yet, no technology can be applied to all crop species, such that a combination of approaches may be necessary. Effective biocontainment is needed to enable large scale molecular farming.

### Edited by:

Domenico De Martinis, Italian National Agency for New Technologies, Energy and Sustainable Economic Development (ENEA), Italy

### Reviewed by:

Raquel Lia Chan, CONICET Santa Fe, Argentina Linda Avesani, University of Verona, Italy

### \*Correspondence:

Maciej Maselko maciej.maselko@mq.edu.au

### Specialty section:

This article was submitted to Plant Biotechnology, a section of the journal Frontiers in Plant Science

Received: 04 December 2019 Accepted: 11 February 2020 Published: 03 March 2020

### Citation:

Clark M and Maselko M (2020) Transgene Biocontainment Strategies for Molecular Farming. Front. Plant Sci. 11:210. doi: 10.3389/fpls.2020.00210

Keywords: biocontainment, molecular farming, pharmaceuticals, plant synthetic biology, metabolic engineering, transgene, industrial enzymes, biofuel

### MOLECULAR FARMING

The potential of engineered plants as low-input production platforms for large-scale production of pharmaceuticals is an area of active research. Examples of plant-made pharmaceuticals (PMPs) with global markets include human insulin, human serum albumin (HSA) and HIV-neutralizing antibodies. There is a large need for human insulin due to the high incidence of diabetes worldwide, which includes a substantial undersupplied market in Asia. Plant production of insulin could meet this shortfall at a price diabetics in this region could afford (Stoger et al., 2014). Over 500 tons per year of HSA are necessary to treat fetal erythroblastosis, fluid loss due to burn injuries, hypoproteinemia, and ascites caused by cirrhosis of the liver (Chen et al., 2013). InVitria, a division of Ventria Bioscience, has developed Optibumin, a rice-derived HSA that has already been commercialized (He et al., 2011)<sup>1</sup>. An estimated 5 tons of HIV-neutralizing antibody is needed to supply 10 million women with the minimal amount necessary to prevent HIV (Shattock and Moore, 2003). Two publicly funded projects, Pharma-Planta and Future-Pharma, have produced HIV-neutralizing antibodies in tobacco and corn seeds for clinical trials. It is hoped their production platform could be used in affected areas to produce microbicides "in the region for the region" (Stoger et al., 2014). For all three examples, demand for these products exceeds supply, and this is particularly the case in under-developed countries. The immense scalability of molecular farming could meet the demand at a price that matches the economic situation of the target areas.

<sup>1</sup>https://invitria.com/cell-culture-products/optibumin-lipid-reduced-recombinant-human-albumin-rhsa/

Molecular farming also has the potential to enhance the production of pharmaceuticals naturally produced in plants, such as the anti-cancer drug Taxol (paclitaxel) and artemisinin, a crucial anti-malarial compound. The plants that synthesize these compounds do so in low concentrations and grow slowly, resulting in only minute quantities of the desired compound (Buyel, 2018). Taxol was originally extracted from the Pacific yew, Taxus brevifolia, where the bark from a single 100-year-old tree yields about 300 mg of Taxol, enough for only one dose (Horwitz, 1994). Today Taxol is produced by Bristol-Myers Squibb using a semi-synthetic process starting with Taxol intermediates extracted from yew tree needles. Using the needles rather than the bark is non-destructive, but the yew tree is still slow growing and the intermediates require expensive purification (Howat et al., 2014). Similarly low amounts of artemisinin (0.01–1.4% dry weight, DW) accumulate in sweet wormwood, Artemisia annua (Ikram and Simonsen, 2017). Engineering the biosynthetic pathways for these compounds into heterologous plants optimized for molecular farming could boost supplies and reduce costs (Wurtzel et al., 2019).

Although the complete biosynthetic pathway for Taxol has not been elucidated, the pathway to the first committed product, taxadiene, has been engineered into Nicotiana benthamiana. The full biosynthetic pathway for artemisinin has also been engineered into N. tabacum. Both tobacco species are production platforms for molecular farming due to their fast growth and high biomass production. The most successful attempt to biosynthetically produce artemisinin took two large sections of the metabolic pathway for artemisinin and genetically engineered them separately into three different cellular compartments (chloroplast, nucleus, and mitochondria). The resulting heterologous expression of artemisinin at ∼0.8 mg/g DW was less than in the native plant, which can reach 31.4 mg/g DW (Zhang et al., 2009; Malhotra et al., 2016). This can in part be explained by the complexity of the gene expression and regulation of the biosynthetic pathway (Ikram and Simonsen, 2017). A mg/g comparison also does not reflect N. tabacum's faster growth and higher biomass production when compared to A. annua. The genetic engineering of N. benthamiana to express a taxadiene synthase gene, which produces taxadiene from geranylgeranyl diphosphate (GGPP), produced 11–27 µg/g DW taxadiene. Further suppression of the phytoene synthase gene and addition of methyl jasmonate increased taxadiene accumulation to 35 µg/g DW (Hasan et al., 2014). The successful de novo production of taxadiene could lead to the development of a heterologous plant system that biosynthesizes Taxol. Future improvements in metabolic engineering could see a breakthrough in how these high-value compounds are produced.

Using plants for the production of enzymes or other proteins affects both the safety and the potential activity of the isolated products. Plant production is free from human pathogens, a major concern in mammalian cell culture production systems, and free from endotoxins, which are a risk in bacterial systems (Commandeur et al., 2002). Protein glycosylation patterns can be manipulated in plants, including to produce 'humanized' glycosylation patterns (Hanania et al., 2017; Mercx et al., 2017). This is important for complex glycoproteins such as monoclonal antibodies or membrane proteins, as glycosylation can affect protein stability, subcellular targeting, biological activity, and immunogenicity (He et al., 2014). The glycosylation of asparagine or arginine side-chains is similar for plants and mammals until the glycan reaches the Golgi apparatus. In plants the side-chain can be modified by the attachment of an α(1,3)-linked fucose or β(1,2)-linked xylose, whereas in mammals there can be the attachment of an α(1,6)-linked fucose, β(1,4)-linked galactose or sialic acid (Gomord et al., 2010). In some cases, plant glycosylation produces proteins with higher pharmacological activity than proteins produced by bacterial or mammalian cells. For example, plant production systems produce taliglucerase alfa, a mannose-terminated glycoprotein for the treatment of Gaucher's disease, where terminal mannose residues are needed to bind to macrophage mannose receptors. In contrast, production in mammalian cell systems requires post-production glycosylation modifications to expose terminal α-mannose residues (Grabowski et al., 2014). There is, however, the possibility that alternative glycosylation will increase the chance of immunogenicity. Several plant production systems have therefore been engineered to give the recombinant protein human glycosylation patterns (Kallolimath et al., 2016).

Plants can also produce large volumes of industrial compounds. Examples of plant-made industrial compounds (PMIs) include cellulases and amylases for bioethanol production, xylanases to enhance animal feed, and oxidation/reduction enzymes such as laccases and peroxidases for paper manufacturing (Van Der Maarel et al., 2002; Bailey et al., 2004; Clough et al., 2006; Shen et al., 2012; Hood and Requesens, 2014). Currently, bioethanol is produced using starch derived from corn. To enhance this process, Syngenta's genetically modified (GM) corn, Enogen, expresses an α-amylase enzyme, which catalyzes the breakdown of starch into glucose (Que et al., 2014). Corn is also used as animal feed or human food, meaning that there is competition for agricultural space. Plant biotechnology could enable more of the cellulose and hemicellulose to be used to produce biofuels. The US company Agrivida increased ethanol production by 55% by engineering corn to express cell wall degrading enzymes in planta (Zhang et al., 2011).

Transgenic plants have been developed to be a source of fibrous animal proteins such as collagen, keratin, silk, and elastin (Börnke and Broer, 2010). The Israeli biotechnology company CollPlant developed a tobacco line to produce recombinant human collagen (Stein et al., 2009). Typically, medical collagen comes from animal or human cadavers, which pose an infection risk from prions (Pammer et al., 1998). Additionally, the extraction process forms unwanted inter- and intra-molecular bonds, which reduce the solubility of the collagen and its ability to form the more desirable, highly structured scaffolds (Zeugolis et al., 2008). In contrast, the plant-derived collagen is free of cross-links and pathogens, so it can be modified for the desired application.

For maximal scalability and low production cost, molecular farming is conducted with field-grown crops. A good case study is the plant biotechnology company Infinite Enzymes, which uses field-grown corn to heterologously produce 1.5 million kg of cellulase annually (the amount needed for a 190 million liter per year cellulosic biofuel facility). Producing and processing field-grown corn required only \$2 million in capital investment, for dry milling and defatting equipment, with \$11.7 million per year in operating costs (\$7.8/kg enzyme). In contrast, a microbial fermentation system, which requires tanks and the associated infrastructure, would require \$100 million in upfront capital investment and a further \$15 million per year in operating costs (\$10/kg enzyme)<sup>2</sup>. However, the economic advantages of using a field-grown crop must be balanced against the possibility of the transgene contaminating other crop production.
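The per-kilogram figures quoted above can be checked with a few lines of arithmetic. The Python sketch below recomputes the operating cost per kilogram of enzyme for each platform and adds a simple cumulative-cost comparison. The function names and the 10-year horizon are illustrative assumptions, not part of the cited analysis, and capital is treated as a one-off cost with no discounting.

```python
# Figures quoted in the text for the cellulase case study.
ENZYME_KG_PER_YEAR = 1_500_000

PLATFORMS = {
    # name: (upfront capital in USD, operating cost in USD per year)
    "field_corn": (2_000_000, 11_700_000),
    "microbial_fermentation": (100_000_000, 15_000_000),
}


def operating_cost_per_kg(operating_per_year: float,
                          kg_per_year: float = ENZYME_KG_PER_YEAR) -> float:
    """Operating cost per kg of enzyme, ignoring capital investment."""
    return operating_per_year / kg_per_year


def cumulative_cost(capital: float, operating_per_year: float, years: int) -> float:
    """Total undiscounted spend after `years` (assumption: flat annual costs)."""
    return capital + operating_per_year * years


if __name__ == "__main__":
    horizon_years = 10  # hypothetical planning horizon, not from the source
    for name, (capital, opex) in PLATFORMS.items():
        per_kg = operating_cost_per_kg(opex)
        total = cumulative_cost(capital, opex, horizon_years)
        print(f"{name}: {per_kg:.2f} USD/kg operating, "
              f"{total / 1e6:.0f} million USD over {horizon_years} years")
```

Under these simplifying assumptions the field-grown platform is cheaper on both capital and operating cost, so the gap widens with every additional year of production; the calculation does not price in the containment costs discussed in the following sections.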

### THE TROUBLED HISTORY OF TRANSGENE ESCAPE

While molecular farming has the potential to lower the cost of medications and industrially useful compounds, the growth of these technologies is contingent on the containment of the transgenes. Challenges of transgene biocontainment are not just hypothetical; there are two salient examples of the need for effective containment: the ProdiGene and StarLink affairs (Murphy, 2007). ProdiGene produced a transgenic corn that expressed a vaccine for preventing bacteria-induced diarrhea in pigs, and while the vaccine protein was non-toxic to humans, strict exclusion from the human food chain was required (Hileman, 2003). StarLink corn was genetically engineered with a gene for resistance to the herbicide glufosinate and contained a variant of the pest control Bacillus thuringiensis (Bt) protein (Cry9C); it also lacked approval for food use. In 2000, StarLink transgenic corn contaminated millions of tons of non-transgenic corn throughout the United States. Government officials have said StarLink's developer, Aventis CropScience, failed to ensure farmers kept StarLink corn separate from other varieties<sup>3</sup>. The contaminated corn was recalled for disposal, costing Aventis an estimated \$500 million (Murphy, 2007). In 2002, ProdiGene failed to eradicate plants that had seeded from their previous season's transgenic corn crop. This led to the contamination of non-transgenic soybeans. ProdiGene's failure to manage their transgenic corn crop resulted in 12,000 tons of soybean being destroyed. The combined cost to ProdiGene was about \$3.5 million, with an additional US government fine of \$250,000 (Thayer, 2002).

The fallout from the ProdiGene and StarLink affairs was lasting. In response, the molecular farming industry pushed for tighter regulations regarding the approval process for molecular farming crops (Murphy, 2007). In 2003, the Animal and Plant Health Inspection Service (APHIS) of the US Department of Agriculture (USDA) introduced the requirement that crops engineered to produce PMIs be grown under permit. Previously, a GM PMI-producing crop could be cultivated under notification, which expedited the permitting procedure (Federal Register [FR], 2005). A full discussion of the interplay between regulation and molecular farming is beyond the scope of this review. It is worth noting, however, that regulatory hurdles remain a barrier to molecular farming. For example, Syngenta's development of Enogen cost several hundred million dollars, much of which was due to it taking almost 6 years to pass USDA's regulatory review process (Wang and Ma, 2012). It is promising, though, that in 2011 Enogen met USDA's requirements to be fully deregulated. In doing so, Enogen became the first plant genetically engineered for industry to be granted this status (Wang and Ma, 2012). The success of Enogen shows a pathway to the commercialization of a PMI production platform.

<sup>3</sup>https://web.archive.org/web/20070711190925/http://archives.cnn.com/2000/FOOD/news/10/18/conagra.grain.ap/

Ineffective transgene biocontainment has impacted international trade. Japan and South Korea halted imports of corn from the United States during the StarLink corn incident. Exports of wheat to Japan and South Korea were also briefly stopped in 2013, after the GM wheat event MON71800, developed by Monsanto to be glyphosate-tolerant, was found growing in a field. Monsanto paid \$2.1 million to farmers to compensate for the loss of export income and reputational damage, and paid \$250,000 to several wheat growers' associations<sup>4</sup>. In 2016, a sister event, MON71700 (in which the same DNA was inserted into a different genomic location), was found to have contaminated a field in the state of Washington. The 22 plants descended from a field trial conducted by Monsanto from 1998 to 2001<sup>5</sup>. In both cases the reoccurrence of the GM wheat was unexplained. The precedent of a GM crop re-emerging more than a decade after a trial stokes public concern over food safety and biosecurity. Such concerns will continue to impact the adoption and development of plant biotechnologies (Murphy, 2007). In order to foster acceptance of transgenic plant production systems, there must be proper containment and security at all levels of production.

### IMPORTANCE OF BIOCONTAINMENT

There are concerns from the public and from within the scientific community that molecular farming could threaten non-GM agriculture, the environment, and human health. Without adequate biocontainment, neighboring non-GM crops or weeds could receive transgenes, and transgenic seeds could contaminate seed storage (Mallory-Smith and Sanchez Olguin, 2011; Gressel, 2015). Contamination worries many in the food industry who are not involved with molecular farming but could suffer financially, and in terms of public confidence, if their own or any other major edible crop became contaminated (Murphy, 2007). Contamination can impact international trade between countries that have legal restrictions on importing transgenic products (Lu, 2003). There are also environmental concerns stemming from the possibility of crop-to-wild transgene flow. In most cases, the few resulting offspring from crop × wild crosses will be outcompeted due to being less locally adapted than the wild type (Gressel, 2015), although the transfer of herbicide resistance genes to weeds, including invasive species, could increase the difficulty of eradicating them. It is improbable, but a transgene could also spread from an engineered crop to a weed and then from that weed to another crop. In this way, weeds that contain the transgene could act as a reservoir for that transgene, allowing spread to non-GM crops.

<sup>2</sup>https://infiniteenzymes.com/technology-2/

<sup>4</sup>https://time.com/3582953/monsanto-wheat-farming-genetically-modifiedsettlement/

<sup>5</sup>https://monsanto.com/company/media/statements/statement-gmo-wheatplants/

In some cases, molecular farming could pose a risk of humans or animals being harmed through inadvertent exposure to an unsafe level of recombinant protein (Breyer et al., 2012). The majority of PMPs currently in production, such as antibodies, growth hormone, insulin and most other proteins, are expected to have no pharmacological effect when ingested (Goldstein and Thomas, 2004). Instead, the gastro-intestinal tract will degrade most PMPs to harmless peptides or amino acids. However, many exceptions may exist in the future, and some plant pharmaceuticals, such as oral vaccines, are designed to be active when ingested. There is also potential for skin or eye contact with, and inhalation of, the recombinant protein, as well as the potential allergenicity of the plant itself (Breyer et al., 2012). The human health threats are heightened by the fact that a plant product could enter the human food or animal feed chain, an event that is more likely if the transgenic crop is also a food crop, as was seen for ProdiGene.

As well as potentially exposing humans or animals to a harmful compound, contamination can affect the quality of related crops. The North American Millers' Association was concerned that the transgene for amylase expression in Enogen could spread into other corn varieties and result in lower quality tortillas, corn puffs, and bread (Waltz, 2011). The advance of agriculture will likely see new crop varieties generating novel products, such as cotton engineered to be red in color. In order to maintain the phenotypic integrity of transgenic and non-transgenic cultivars, effective biocontainment will be required.

The potential economic, environmental, and health threats from molecular farming can be greatly reduced through controlling the flow of the transgene. It is also important to point out that the level of threat from transgene escape depends on the nature of the contamination. Trace mixing of seed that contains a toxic protein is unlikely to be harmful due to dilution. However, the introgression of a transgene that expresses a toxic protein into a neighboring crop or weed could seriously contaminate human food or animal feed chains. Any contamination, regardless of risk, will nevertheless likely impact public support for GM agriculture.

### TRANSGENE CONTAINMENT TECHNOLOGIES

Gene flow is a process where the frequency of a gene changes in a population and can occur through gametes, an organism or groups of organisms moving from one population to another. The potential for there to be gene flow into or from a crop depends on the crop's pollination strategy, on the size of the crop, seed size and viability, and whether there are compatible species within pollination distance (Mallory-Smith and Sanchez Olguin, 2011). **Figure 1** details the three main ways that transgenes can spread into the environment. Volunteer plants – plants that have self-seeded from a previous season's crop – can contaminate the next season's crop if they are accidentally harvested alongside the intended crop (Michael et al., 2010). Transgenes may also spread in seeds that can be spilled during the harvest and transfer of seed. Lastly, cross-pollination can lead to either transgenes escaping into neighboring plants or introgression from neighboring plants into the transgenic crop (Gressel, 2015). As we are primarily concerned with the movement of genes into another population, pollen transfer is the form of gene flow that is of most concern.
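One common way to make this definition concrete is the one-way migration ("continent-island") model from population genetics, in which a fraction m of the gametes in a recipient population arrives from a donor population each generation. The Python sketch below is illustrative only: the model is a textbook simplification that ignores selection and drift, and the function name and any parameter values passed to it are assumptions rather than values taken from the studies cited here.

```python
def transgene_frequency(p0: float, donor_freq: float, m: float,
                        generations: int) -> list[float]:
    """Track transgene frequency in a recipient population under one-way gene flow.

    Each generation: p_next = (1 - m) * p + m * donor_freq

    p0          -- starting transgene frequency in the recipient population
    donor_freq  -- transgene frequency in the donor (crop) population
    m           -- fraction of gametes arriving from the donor each generation
    """
    if not (0.0 <= p0 <= 1.0 and 0.0 <= donor_freq <= 1.0 and 0.0 <= m <= 1.0):
        raise ValueError("frequencies and migration rate must lie in [0, 1]")
    freqs = [p0]
    for _ in range(generations):
        freqs.append((1.0 - m) * freqs[-1] + m * donor_freq)
    return freqs
```

For example, `transgene_frequency(0.0, 1.0, 0.001, 10)` follows a hypothetical wild population receiving 0.1% of its gametes per generation from a fully transgenic crop. In this model containment corresponds to driving m toward zero, whereas mitigation would act through selection against the transgene, which the sketch deliberately leaves out.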

There are essentially two approaches for minimizing gene flow: containment and mitigation. Containment aims to stop the flow of the gene from the crop, and mitigation focuses on preventing the gene from establishing in a significant proportion of the population (Gressel, 2015). Containment can be physical or biological. Physical containment provides a barrier, such as a greenhouse, filters in the lab or isolation distances in the field. There are also efforts to conduct molecular farming underground, e.g., in unused mines, which provide an even higher degree of physical containment<sup>6</sup>. So far there are no documented cases of physical containment failing in the laboratory or greenhouse (Gressel, 2015). The shortcomings of geographic isolation, by contrast, were shown when transgenes from GM glyphosate-resistant creeping bentgrass, Agrostis stolonifera, were found in non-agronomic bentgrass up to 3.8 km beyond the control area perimeter (Reichman et al., 2006). Given the unreliability of geographic isolation in many situations, it is preferable to avoid the use of crop plants grown for human or animal consumption.

Alternative plant production platforms have been developed to reduce the risk of contamination. Some examples of non-food and non-feed crops include tobacco (N. benthamiana), duckweed (Lemna minor), microalgae (Chlamydomonas reinhardtii), and moss (Physcomitrella patens) (Yao et al., 2015). As can be seen from **Table 1**, tobacco and moss are popular production platforms. The use of these plants prevents introgression of a transgene into a plant used for food or feed. If a crop plant is to be used, crops that can be crossed with weedy relatives, such as the sunflower, Helianthus annuus, should be avoided.

Sound biocontainment and rapid production of recombinant protein can be achieved using a transient expression system, which does not result in a transgene integrated in the germline. One method to establish a transient expression system is agroinfiltration, where the bacterium Agrobacterium tumefaciens, acting as a vector for the gene of interest, is injected or vacuum infiltrated into leaf cells (Whaley et al., 2011). Another approach is to use plant RNA viruses (Yusibov et al., 2006). The two approaches can be combined, with agroinfiltration used to deliver RNA viral vectors into the leaves of a plant. This process, called 'magnifection', combines the transfection efficiency of A. tumefaciens, the post-translational modifications of a plant and the high expression yield obtained with viral vectors (Marillonnet et al., 2005). In all of these approaches the transferred DNA is expressed but not integrated into the germline. The tobacco N. benthamiana is most often used as the production platform due to the ease with which it can be transformed. Compared to the time it takes to establish a stable transgenic plant line (6 months to a year), transient expression systems can produce recombinant protein within 3–5 days (Yao et al., 2015). This is ideal for combating sudden viral epidemics, such as severe acute respiratory syndrome (SARS) or Ebola. Because they do not introduce transgenes into germline tissue, transient expression systems do not risk contaminating food through transgene outflow into non-GM crops or their wild relatives (Huafang Lai and Jake Stahnke, 2013). However, Agrobacterium infiltration is labor intensive, which was a barrier to transient expression providing sufficient supplies of an Ebola vaccine (Yao et al., 2015).

<sup>6</sup>https://www.wired.com/2004/05/drug-farms-forced-underground/

Whole-plant production platforms remain attractive due to their scalability, but for some applications in vitro systems are preferable. Current in vitro technologies include plant–cell suspension and hairy root cultures. Plant–cell suspensions are typically derived from new tissue formed over a plant callus that has been cultivated on solidified media. The clumps that easily break apart can be transferred to liquid media. If a homogenous culture forms, the fermentation of the plant cells can be conducted using techniques similar to those used for fermenting lower eukaryotes (Fischer et al., 1999). Cell suspension cultures offer sound containment and a quick development cycle but are a much less scalable production platform than transgenic plants (Santos et al., 2016). Hairy root cultures are differentiated cultures of transformed roots generated by infection with Agrobacterium rhizogenes (Häkkinen et al., 2014). Like undifferentiated cells, hairy root cultures can be grown in simple defined media, but they have greater genetic stability and are highly scalable. These features make them suitable for producing pharmaceutical proteins at an industrial scale (Guillon et al., 2006). However, in vitro techniques require sophisticated and sterile laboratory settings. If the scalability and low-cost potential of plant production of PMPs or PMIs is to be realized, plants need to be grown in fields.

The higher contamination risk from growing plants in fields can be reduced by genetic containment which may exploit existing reproductive limitations or introduce them via genetic engineering. Many genetic approaches for containing plant transgenes have been investigated including cleistogamy, maternal inheritance, gametic transgene excision, synthetic auxotrophy, total sterility, and genetic use restriction technologies (GURT or Terminator). Other genetic containment technologies in development could be extended to plants, such as engineered genetic incompatibility (EGI), genetic recoding and targeted transgene removal (see **Table 2**). Many of these technologies work well for specific types of plants and can be enhanced by pairing them with other technologies.

### TABLE 1 | Examples of plant made pharmaceuticals.



### TABLE 2 | The important features of genetic biocontainment technologies.


Cleistogamy, where there is self-pollination within a closed flower, is a promising tool to limit gene flow. Currently it suffers from some flowers opening, which allows for cross-pollination. Cleistogamy requires that the plant's flowers contain both male and female parts and that self-fertilization is possible. Crops like rice have such flowers, but plants that have separate male and female flowers, like asparagus and spinach, or unusual flower anatomies, such as corn, are not suited for cleistogamy. It was found that for imidazolinone herbicide resistant rice a few flowers opened, enabling hybridization with weedy rice (Gealy, 2005). To combat this, rice was genetically engineered to enhance the percentage of cleistogamous flowers by incorporating the cleistogamous gene 'superwoman1.' The engineered cultivar, in a variety of plots, had an outcrossing rate of 0.000%, compared to the non-engineered cultivar, which ranged from 0.005 to 0.200% (Ohmori et al., 2012). The potential of cleistogamy is limited for GM food production, as current practices tend to use higher-yielding hybrid rice varieties, which require parental lines that are not cleistogamous (Gressel, 2015). However, this would not be an issue for the molecular farming of high-value compounds, where cleistogamy could be used to restrict pollen-mediated gene flow.

Synthetic auxotrophy works by genetically engineering a strain to depend on an externally supplied compound. The dependence can come from deleting essential genes that are needed, for example, to synthesize amino acids or co-factors that are necessary for crucial biological functions (Moe-Behrens et al., 2013). So far, this approach has found little traction for use in plants. There are isolated cases such as the duckweed Lemna, which has been engineered to be dependent on the addition of isoleucine through inactivating threonine deaminase expression (Nguyen et al., 2012). However, the genetic redundancy that is a common feature of plant genomes increases the difficulty in engineering recessive auxotrophic mutations. For most plants there are likely multiple proteins that catalyze the same reaction, which requires a large number of genetic changes to confer metabolic dependence (Last et al., 1991). The addition of potentially expensive chemicals, in itself a drawback, also requires changes to normal cultivation techniques. Synthetic auxotrophy can also fail due to introgression of genes from non-transgenic plants, which could restore the knocked out metabolic pathway.

Another approach exploits the maternal inheritance of plastids (e.g., chloroplasts). For the vast majority of higher plants, which display maternal inheritance, transgenes located in the plastid genome are unlikely to be transmitted to other plants by pollination (Maliga, 2004). Plastid engineering has therefore been employed to locate the transgene in the plastid genome; however, the advance of plastid engineering has been stymied by poor transformation protocols for plants other than tobacco. Transformation relies on many essential factors unique to the species and sometimes unique to the cultivar (Lu et al., 2013). There must be detailed knowledge of the plastid genome sequence, including the regions between genes that are suitable for transgene integration. There also needs to be an optimized DNA delivery system, as well as effective antibiotic selection and selectable marker genes. For several years the chloroplast genome sequences have been available for monocots, such as wheat and corn, but their chloroplasts have not been transformed due to the engineering complexity (Wani et al., 2015). This approach may also be less effective than envisioned, considering that species that were thought to strictly engage in maternal plastid inheritance still had about 0.4% plastid transmission via pollen (Avni and Edelman, 1991; Svab and Maliga, 2007). There are two additional problems: proteins expressed in the chloroplast undergo different post-translational modifications, meaning that enzyme function might be altered (Grabsztunowicz et al., 2017), and plastid transformation can be laborious and time-consuming (Ruf et al., 2001).

Total sterility offers a sound basis for genetic biocontainment. Several crops are already sterile or have sterile varieties, such as cassava (Manihot esculenta), potatoes (Solanum tuberosum), and banana (Musa acuminata) (Celis et al., 2004; Heslop-Harrison and Schwarzacher, 2007; Sayre et al., 2011). As long as the sterility is not leaky, these crops would be safe candidates for molecular farming. A totally sterile plant can also be engineered by deleting genes required for gamete production (Kwit et al., 2011). The downside is that total sterility requires plants to be vegetatively propagated by tubers, tissue culture, cuttings, or artificial seed (Gressel, 2015). Total sterility could be used with tuber or bulb propagated crops, leafy vegetable crops and forestry. Crops that are harvested for compounds accumulated in seeds, however, would not be candidates for total sterility.

Gametic transgene excision uses a site-specific recombination system to excise a transgene from gametes. Currently, the efficiency of the recombinase is quite low, with 99% excision considered to be high performing (Moon et al., 2011). This level of efficiency is too low to restrict transgene escape; however, it could be used to excise selectable marker genes used in the engineering of transgenic plants (Hu et al., 2013). Farmers are also not able to collect seed containing the transgene for future seasons unless the recombinase can be externally controlled, which alters normal cultivation practices (Ryffel, 2014; Gressel, 2015).
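To give a sense of scale for why 99% excision is insufficient on its own, and why pairing containment methods is attractive, the Python sketch below multiplies per-mechanism "leak" rates quoted in this section: the 1% of gametes left unexcised at 99% efficiency, and the upper outcrossing rate reported for the non-engineered rice cultivar. Treating the mechanisms as independent is a strong simplifying assumption made here purely for illustration, as is the hypothetical count of one million fertilization opportunities; neither comes from the cited studies.

```python
from math import prod


def combined_escape_rate(leak_rates: list[float]) -> float:
    """Probability that a transgene passes every barrier, assuming independence."""
    for rate in leak_rates:
        if not 0.0 <= rate <= 1.0:
            raise ValueError("each leak rate must be a probability in [0, 1]")
    return prod(leak_rates)


def expected_escapes(leak_rates: list[float], events: int) -> float:
    """Expected number of escapes over `events` independent opportunities."""
    return combined_escape_rate(leak_rates) * events


# Rates taken from the text; the event count is a hypothetical illustration.
EXCISION_FAILURE = 0.01    # 99% excision efficiency leaves 1% unexcised
RICE_OUTCROSSING = 0.002   # 0.200%, upper value for the non-engineered cultivar
EVENTS = 1_000_000         # assumed number of fertilization opportunities

if __name__ == "__main__":
    print(expected_escapes([EXCISION_FAILURE], EVENTS))
    print(expected_escapes([EXCISION_FAILURE, RICE_OUTCROSSING], EVENTS))
```

Under these assumptions, excision alone would leave on the order of ten thousand transgene-carrying gametes per million, while stacking it with a low outcrossing rate reduces the expected number by several orders of magnitude. Real barriers are unlikely to be fully independent, so the product should be read as an optimistic bound rather than a prediction.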

Genetic use restriction technologies were originally developed to prevent farmers from infringing on patents by saving seed. They have been some of the most controversial GM biotechnologies due to the widespread perception that they were designed to entrench a multinational corporation seed monopoly (Lombardo, 2014). GURTs use a tightly controlled genetic system to regulate the expression of a target gene. There are typically four components to this genetic system: the target gene, the target gene's promoter, the trait switch and the genetic switch. The target gene needs to be activated by the promoter. In order to prevent leaky expression from unwanted promoter activity a blocker sequence separates the promoter from the target gene. The blocking sequence can in turn be removed through a cascade beginning with an external input, which will be amplified by the genetic switch. The amplified input becomes a biological signal that activates the trait switch. The trait switch usually encodes an enzyme, such as a site-specific recombinase that removes the blocker sequence (Lombardo, 2014). Without the blocker sequence there can then be transcription and expression of the target gene.
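The cascade described above can be summarized as a small logical model. The following Python sketch is a deliberately simplified, boolean abstraction: the class, method and attribute names are hypothetical, and real GURT systems involve quantitative expression levels and leakiness that are not represented here.

```python
from dataclasses import dataclass


@dataclass
class GurtCassette:
    """Boolean abstraction of the GURT components described in the text."""

    # The blocker sequence sits between the promoter and the target gene.
    blocker_present: bool = True

    def genetic_switch(self, external_input: bool) -> bool:
        """Turn the external input into a biological signal (amplification not modeled)."""
        return external_input

    def trait_switch(self, signal: bool) -> None:
        """If activated, express the recombinase that removes the blocker sequence."""
        if signal:
            self.blocker_present = False

    def apply_input(self, external_input: bool) -> None:
        """Run the full cascade: input -> genetic switch -> trait switch."""
        self.trait_switch(self.genetic_switch(external_input))

    @property
    def target_expressed(self) -> bool:
        """The promoter can drive the target gene only once the blocker is gone."""
        return not self.blocker_present


if __name__ == "__main__":
    cassette = GurtCassette()
    print(cassette.target_expressed)   # blocker in place, target gene silent
    cassette.apply_input(True)
    print(cassette.target_expressed)   # blocker excised, target gene expressed
```

One design choice worth noting is that excision is modeled as irreversible: once the blocker has been removed, withdrawing the external input does not silence the target gene again, which mirrors the behavior expected of a recombinase-based switch.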

In part due to public opposition GURTs have never been commercialized. However, there is scope for GURTs to be used for biocontainment. For this, the GURT system would be linked to the transgene, so that when the GURT is activated there is expression of a disrupter gene that drives cell death. Disrupter genes typically encode for cytotoxins such as barnase and ribonuclease A (Mariani et al., 1990; Burgess et al., 2002; Gils et al., 2008; Zhang et al., 2012). There is no evidence that disrupter genes generate products that are toxic to humans or animals. However, it is possible that the potential health risk will add to the already controversial nature of using GURTs (Conner et al., 2003; Gressel, 2010). The other disadvantages to GURT are that it is a more expensive system, requiring exogenous inputs and there is greater difficulty in propagating a GURT crop.

# FUTURE BIOLOGICAL CONTAINMENT TECHNOLOGIES

The biocontainment technologies that have been developed in microbes could in some cases be extended to plants. These technologies include genetic recoding, targeted transgene removal and engineered genetic incompatibility (EGI).

Genetic recoding removes every instance of at least one codon for an amino acid in an organism's genome and replaces it with another. The codon that has been removed can be replaced with a synonymous codon or it can then encode for a non-standard amino acid (NSAA) (Mukai et al., 2017). If an essential gene was recoded to require an amino acid not found in nature this would increase the stringency of an auxotrophy. Further, the genetic recoding could create reproductive isolation and block gene flow with non-recoded organisms due to incompatible genetic codes. Escherichia coli has been recoded so that the UAG stop codon instead incorporated a NSAA in the cores of essential enzymes. This conferred a dependence on synthetic metabolites for proper protein function, such that the bacteria were less capable of mutational escape and metabolic supplementation (Mandell et al., 2015). Following on from this the genetic recoding of plant genomes could confer better biocontainment. Despite the advance of the technology, we are unlikely to see recoding of higher organisms with ease in the near future due to the scale of changes needed in large genomes.

Another strategy could be to precisely remove the engineered genes instead of killing the whole organism. The spread of transgenes from volunteer plants or inadvertent seed dispersal could be mitigated by using a CRISPR-based system to selectively remove the transgene after the desired protein has been produced. In one such method, a genetically encoded device, termed DNAi, responds to a transcriptional input by degrading DNA adjacent to a synthetic CRISPR array. The DNAi system was shown to be non-toxic when carried in E. coli, and when activated it was able to reduce the number of viable cells by 1.9 × 10<sup>−8</sup>, making it one of the most effective switches for programmed cell death (Caliando and Voigt, 2015). This same mechanism could be engineered so that with the addition of a transcriptional input the transgene is degraded. An advantage of this system is that the removal of the transgene applies little selective pressure toward deactivating the genetic machinery, whereas directing whole organism death selects for mutations that lead to an organism's survival.

The aforementioned biological containment technologies, with the exceptions of cleistogamy, genetic recoding, and total sterility, do not prevent the flow of genes into the transgenic plant. This is an important consideration as unwanted gene flow can alter important traits in a genetically engineered organism. In order to restrict gene flow in both directions, plants could be engineered to be genetically incompatible with related plants such that the hybrid is less fit; this is known as underdominance. The model organism Drosophila melanogaster has been engineered such that engineered-WT hybrids display underdominance. This was achieved using a genetic construct that encodes two genes: the first encodes an RNAi knockdown of the WT version of the gene Rpl14; the second is a refactored version of Rpl14 that is not susceptible to RNAi knockdown. When the engineered organism was mated with WT flies there was a marked fitness reduction in the heterozygotes (Reeves et al., 2014). However, in order to be effective for biocontainment the underdominance must result in total sterility or death of the hybrids.

An artificial reproductive barrier, where the hybrids are non-viable, has been engineered in Saccharomyces cerevisiae using EGI. This system utilizes programmable transcriptional activators (PTAs) to overexpress a gene, leading to lethality. Lethality in the engineered organism is avoided by editing the target sequence of the PTA, such that the PTA is unable to bind and overexpress the gene (Maselko et al., 2017). When there is a cross between the WT and the engineered organism, the PTA targets the WT PTA binding sequence and drives lethal levels of gene expression. Attempts have also been made at constructing a synthetic species of D. melanogaster in which an artificial reproductive barrier is engineered; however, the goal of complete genetic isolation was not achieved. The main difficulty proved to be getting strong activation of a lethal gene without the fitness costs associated with broad expression of the transactivating CRISPR machinery (Waters et al., 2018).
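
The logic of the EGI barrier described above can be illustrated with a deliberately simplified sketch. It assumes a diploid hybrid that inherits one haplotype from each parent and treats lethality as all-or-nothing; the class and function names are hypothetical and do not come from Maselko et al. (2017).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Haplotype:
    has_pta: bool     # carries the programmable transcriptional activator (PTA)
    target_site: str  # "wt" (PTA can bind) or "edited" (PTA cannot bind)


def is_viable(first: Haplotype, second: Haplotype) -> bool:
    """Simplified rule: lethal if a PTA is present together with a bindable WT site."""
    pta_present = first.has_pta or second.has_pta
    bindable_site_present = "wt" in (first.target_site, second.target_site)
    return not (pta_present and bindable_site_present)


WILD_TYPE = Haplotype(has_pta=False, target_site="wt")
ENGINEERED = Haplotype(has_pta=True, target_site="edited")

for label, pair in {
    "WT x WT": (WILD_TYPE, WILD_TYPE),
    "engineered x engineered": (ENGINEERED, ENGINEERED),
    "WT x engineered": (WILD_TYPE, ENGINEERED),
}.items():
    print(label, "->", "viable" if is_viable(*pair) else "non-viable")
```

In this toy model only the WT x engineered cross is non-viable, because the hybrid combines the PTA with an unedited binding site. Real systems are leakier than this, as the D. melanogaster attempts indicate.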

There is an inherent versatility to the use of PTAs, so that lethal overexpression of a target gene could theoretically be engineered in any sexually reproducing organism (Maselko et al., 2017). Proof of concept has so far only been established in S. cerevisiae. It is conceivable, however, that EGI could be extended to plants, where it could be used to generate many orthogonal strains of the same parent species which could each be used as production platforms for different compounds. If interbreeding can be prevented then the phenotypic integrity of transgenic cultivars could be protected. EGI could also be used to make synthetic auxotrophy more robust by preventing introgression from neighboring plants, which would otherwise compromise the auxotrophy.

# TRANSGENE MITIGATION TECHNOLOGIES

Even the most stringent containment system can fail. Technologies are therefore needed to reduce the chances of a transgene becoming established after escape. Transgenic mitigation involves linking the transgene to genes that confer a selective disadvantage. Weedy traits such as a propensity toward shattering, bolting, and greater height can be targeted (Gressel, 2015). Transgenic mitigation reduced the reproductive fitness of transgenic-weed oilseed rape hybrids. A dwarfing mitigator gene was linked to a herbicide resistance transgene, which reduced the reproductive fitness of the transgenic-weed hybrid to 0.9% of the competing weed's reproductive fitness (Al-Ahmad and Gressel, 2006). However, there is the potential for the linkage of the mitigator gene to the transgene to be broken through meiotic crossing over. Additionally, there can be mutation of the mitigator gene so that it ceases to confer the deleterious phenotype. Both of these issues can in some part be addressed through linking another mitigator gene to the transgene, such that there are mitigating genes either side of the transgene (Gressel, 2015).

### CONCLUSION

Molecular farming has the potential to lower the cost of medication and industrial enzymes. However, in cases where the recombinant protein is potentially toxic, there are environmental and human health risks. The introgression of the transgene into a neighboring crop or weed may contaminate food or feed supplies. Any contamination event, such as in the high-profile cases of StarLink and ProdiGene, could jeopardize confidence in molecular farming. For these reasons there must be effective containment of transgenes.

There has been considerable progress in the development of biological containment technologies. For some species such as rice, cleistogamy could contain gene flow. For tubers and bulb propagated crops total sterility is practical. But for many species these technologies aren't applicable. There is some promise that technologies like EGI combined with synthetic auxotrophy could contain gene flow. Further work in this area is needed to ensure the safety and widespread adoption of field grown molecular farming crops.

### AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

### FUNDING

This work was funded through the CSIRO Synthetic Biology Future Science Platform and a Macquarie University Research Excellence Scholarship.




**Conflict of Interest:** MM is a co-founder and chief technical officer of NovoClade LLC.

The remaining author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2020 Clark and Maselko. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# An Interdisciplinary Approach to Study the Performance of Second-generation Genetically Modified Crops in Field Trials: A Case Study With Soybean and Wheat Carrying the Sunflower HaHB4 Transcription Factor

### Edited by:

Domenico De Martinis, Energy and Sustainable Economic Development (ENEA), Italy

### Reviewed by:

Daisuke Todaka, The University of Tokyo, Japan

Renata Fuganti-Pagliarini, Embrapa Soybean, Brazil

### \*Correspondence:

María Elena Otegui, otegui@agro.uba.ar

Raquel Lía Chan, rchan@fbcb.unl.edu.ar

### Specialty section:

This article was submitted to Crop and Product Physiology, a section of the journal Frontiers in Plant Science

Received: 21 October 2019; Accepted: 05 February 2020; Published: 06 March 2020

### Citation:

González FG, Rigalli N, Miranda PV, Romagnoli M, Ribichich KF, Trucco F, Portapila M, Otegui ME and Chan RL (2020) An Interdisciplinary Approach to Study the Performance of Second-generation Genetically Modified Crops in Field Trials: A Case Study With Soybean and Wheat Carrying the Sunflower HaHB4 Transcription Factor. Front. Plant Sci. 11:178. doi: 10.3389/fpls.2020.00178

Fernanda Gabriela González<sup>1</sup>, Nicolás Rigalli<sup>2</sup>, Patricia Vivian Miranda<sup>3,4</sup>, Martín Romagnoli<sup>2</sup>, Karina Fabiana Ribichich<sup>5</sup>, Federico Trucco<sup>3</sup>, Margarita Portapila<sup>2</sup>, María Elena Otegui<sup>6\*</sup> and Raquel Lía Chan<sup>5\*</sup>

<sup>1</sup> Estación Experimental Pergamino, INTA, CITNOBA, CONICET-UNNOBA, Pergamino, Argentina, <sup>2</sup> CIFASIS, Universidad Nacional de Rosario—CONICET, Rosario, Argentina, <sup>3</sup> Instituto de Agrobiotecnología Rosario (INDEAR)/BIOCERES, Rosario, Argentina, <sup>4</sup> CONICET Buenos Aires, Argentina, <sup>5</sup> Instituto de Agrobiotecnología del Litoral, Universidad Nacional del Litoral— CONICET, Facultad de Bioquímica y Ciencias Biológicas, Santa Fe, Argentina, <sup>6</sup> CONICET-INTA-FAUBA, Estación Experimental Pergamino, Facultad de Agronomía Universidad de Buenos Aires, Pergamino, Argentina

Research, production, and use of genetically modified (GM) crops have split the world between supporters and opponents. Up to now, this technology has been limited to the control of weeds and pests, whereas the second generation of GM crops is expected to assist farmers through abiotic stress tolerance or improved nutritional features. Aiming to analyze this subject holistically, in this presentation we address an advanced technology for drought-tolerant GM crops, upscaling from molecular details obtained in the laboratory to an extensive network of field trials as well as the impact of the introduction of this innovation into the market. Sunflower has divergent transcription factors, which could be key actors in the drought response, orchestrating several signal transduction pathways and generating an improved performance under water deficit. One such factor, HaHB4, belongs to the homeodomain-leucine zipper family and was first introduced in Arabidopsis. Transformed plants had improved tolerance to water deficits, through the inhibition of ethylene sensitivity and not by stomatal closure. Wheat and soybean plants expressing the HaHB4 gene were obtained and cropped across a wide range of growing conditions, exhibiting enhanced adaptation to drought-prone environments, drought being the most important constraint affecting crop yield worldwide. The performance of wheat and soybean, however, differed slightly across these environments: whereas the improved behavior of GM wheat with respect to controls was less dependent on the temperature regime (cool or warm), differences between GM and wild-type soybeans were remarkably larger in warmer compared to cooler conditions. In both species, these GM crops are good candidates to become market products in the near future. In anticipation of consumers' and other stakeholders' interest, spectral analyses of field crops have been conducted to differentiate these GM crops from wild type and commercial cultivars. In this paper, the potential impact of the release of such market products is discussed, considering the perspectives of different stakeholders.

Keywords: transgenic wheat, transgenic soybean, HaHB4, sunflower transcription factor, drought tolerance, grain yield determination

### INTRODUCTION

The expected increase in food demand by 2050 will not be matched by the necessary increase in the relative rate of yield progress of major grain crops (wheat, maize, rice, and soybean), even considering breakthroughs such as photosynthesis improvement via bioengineering (Hall and Richards, 2013). Additionally, grain yield (GY) losses of 8–43% with respect to present-day yields are estimated for these crops, as climate change is predicted to increase temperatures and the frequency of extreme events such as drought (IPCC, 2014). This scenario will be accompanied by a greatly increased demand for water for direct human use. All of these trends will require technologies aimed at improving and securing crop production with maximum resource-use efficiency and low environmental impact.

The difference between potential and actual GY varies extensively depending on the production area and the unpredictable climate. Water deficits of variable duration and intensity are among the main determinants of mentioned GY losses (Aramburu et al., 2015; Rattalino Edreira et al., 2018), and breeding efforts to increase crop tolerance to abiotic stress represent an environment-friendly avenue to reduce this gap.

Researchers worldwide have worked for decades, applying different breeding strategies to increase crop GY under unfavorable environments (Campos et al., 2004), in a process that traditionally takes a long time from the initial stage in nurseries or gene discovery in labs to final adoption by farmers (Hall and Richards, 2013).

The present study contributes to a better understanding of the possibilities, difficulties and significant time requirements that arise when a transgenic technology developed in a model plant such as Arabidopsis is transferred to evolutionarily distant and economically important crops like wheat or soybean. A successful case is presented here: wheat and soybean transformed with the sunflower transcription factor gene HaHB4 (Helianthus annuus HomeoBox 4). Transformed cultivars of both species are expected to be released to the market soon (probably during 2020), and this success was achieved through common and cooperative efforts of molecular biologists and agronomists from public institutions and private companies, who were able to overcome the additional obstacle usually imposed by epistemological barriers (Sadras and Richards, 2014).

## ALTHOUGH GENETICALLY MODIFIED CROPS HAVE BEEN ADOPTED WORLDWIDE, SECOND GENERATION IS ABSENT IN THE MARKET

Since 1996, genetically modified (GM) crops have been adopted by farmers worldwide because they increase food and feed production efficiently by generating plants with higher GY in reasonably short times. The main advantage of transgenic technologies is the possibility to overcome sexual incompatibilities between plants and species barriers allowing the introduction of genes from unrelated organisms such as bacteria, fungi or other plants and also from viruses. Genetically modified organisms (GMOs), however, triggered controversies both in adopting and non-adopting countries, although stringent regulatory processes for food/feed and environmental safety were implemented and applied. Furthermore, several countries have established mandatory GMO labeling whereas voluntary labeling is preferred by others (Kamle et al., 2017). Most of the recently developed commercial GM crops exhibit herbicide tolerance, insect resistance or both traits stacked.

The second generation of GM crops was projected to mitigate abiotic stress effects. However, such crops are not commercially available so far, for several reasons. Firstly, most evaluated events failed to translate benefits observed in controlled environments to field conditions (Passioura, 2012). Additionally, a long process is needed to release a GM product to the market, mostly because of regulatory requirements that stem from the negative public perception of GMOs (Blancke et al., 2015; Fernbach et al., 2019). Huge investments are needed to meet such requirements, and this has often limited attempts to advance GMOs at different stages of development.

An exception to the rule is the maize hybrid expressing a bacterial RNA chaperone that was released for use in a limited, drought-prone region of the United States (Castiglioni et al., 2008). Other drought-tolerant GMOs were developed but not released, such as sugarcane expressing a betaine gene and exhibiting augmented sugar content under water deficits (GM Approval Database, 2019).

### UPSCALING DROUGHT TOLERANCE FROM THE POT TO THE CROP

The advent of molecular genetics brought a pronounced increase in the number of studies involving plant transformation aimed at improving crop performance under water deficit conditions. The initial excitement was soon followed by striking evidence of serious difficulties in scaling up from individual plants grown in pots to communal plants in the field (Passioura, 2012). Lack of success has usually been linked to two weaknesses. One was the lack of a clear understanding of the benefits/drawbacks of a gene/trait at the crop rather than at the single plant level. For instance, compensations that usually take place when moving from plants to crops are of high importance (Pedró et al., 2012). Also, breeders and agronomists do not usually deal with growing conditions for which survival traits may represent an advantage (i.e. very low yielding environments). Traits of value are expected to represent an actual benefit in GY under stressful conditions with no penalty in the high yielding ones. The second weakness was poor knowledge of the variability in drought scenarios (i.e. opportunity, extent and intensity of drought) and their frequency (Chapman et al., 2000) in what breeders describe as the target population of environments (Cooper et al., 2005). Such variability usually receives little if any attention from molecular biologists (Passioura, 2006).

The water budget of an environment depends upon rainfall distribution and soil characteristics, which combined with evaporative demand, regulate the capacity of plants to hold maximum transpiration or reduce it (Sadras and Milroy, 1996). Reduced transpiration leads to the occurrence of water deficits of variable intensity and duration (Chapman et al., 2000; Connor et al., 2011). Rainfall distribution, together with temperature patterns set the limits to crop choice by farmers in rainfed systems, basically between the monsoon climate type (prevalence of summer crops) and the Mediterranean climate type (prevalence of winter crops). There are, however, humid and sub-humid areas where total rainfall allows year-round cropping systems but many times with large intra-seasonal variability if there is an occurrence of drought (Harrington and Tow, 2011). Simultaneously to climate, the soil type (i.e. texture) and its condition (e.g. compaction) affect total plant-available soil water storage as well as the amount that is readily available to plants (Passioura, 1991; Dardanelli et al., 2004).

Assuming cycle duration and its partitioning between vegetative and reproductive phases have been optimized to the water budget of each environment (Passioura, 2006), variable conditions experienced by the soil–canopy–atmosphere continuum along the cycle may pose an additional challenge when breeding crops for drought-prone regions. Passioura (2006) proposed to consider three main issues when evaluating a trait for these regions. The trait should increase (i) water use by transpiration (WUt, in mm) of the limited water supply, (ii) transpired water use efficiency for biomass production (WUEt: biomass produced per unit of water transpired, in kg ha<sup>−1</sup> mm<sup>−1</sup>), and/or (iii) biomass allocation into harvestable products, namely harvest index in grain crops (HI: grain biomass/total biomass). These recommendations are based on our understanding of the physiological determination of GY at a crop level (Eq. 1):

$$\text{GY (kg }\text{ha}^{-1}\text{)}=\text{WUt}\times\text{WUEt}\times\text{HI}\tag{1}$$

Therefore, the ability of a gene/trait to cope with water constraints should be analyzed within the conceptual framework of Eq. 1 and the characteristics of the target environment (Table 1). The latter will settle whether a trait is favorable or unfavorable for a breeding program.
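
As a minimal sketch of how Eq. 1 can be used to reason about a trait, the snippet below computes GY from its three components and compares a baseline with a case in which only WUt increases. All numerical values are hypothetical and were chosen purely to show the arithmetic and the units; the function and variable names are illustrative.

```python
def grain_yield(wut_mm: float, wuet_kg_ha_mm: float, harvest_index: float) -> float:
    """Eq. 1: GY (kg ha^-1) = WUt (mm) x WUEt (kg ha^-1 mm^-1) x HI (dimensionless)."""
    if not 0.0 <= harvest_index <= 1.0:
        raise ValueError("harvest index must lie between 0 and 1")
    return wut_mm * wuet_kg_ha_mm * harvest_index


# Hypothetical crop: 300 mm transpired, 40 kg ha^-1 mm^-1, HI of 0.40.
baseline = grain_yield(wut_mm=300.0, wuet_kg_ha_mm=40.0, harvest_index=0.40)

# Same crop with a trait that raises transpired water by 10% and nothing else.
more_water_use = grain_yield(wut_mm=330.0, wuet_kg_ha_mm=40.0, harvest_index=0.40)

gain_pct = 100.0 * (more_water_use - baseline) / baseline
print(f"baseline: {baseline:.0f} kg/ha, with trait: {more_water_use:.0f} kg/ha "
      f"({gain_pct:.1f}% gain)")
```

Because the three terms are multiplicative, a proportional change in any single component translates into the same proportional change in GY, provided the other two components are not penalized.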

### THE STORY OF HaHB4, FROM THE BENCH TO THE FIELD

It is well known and documented that some plant species exhibit a high tolerance to certain abiotic stress factors whereas others are more susceptible to them (Boscaiu et al., 2012). Moreover, it is possible to find different varieties of the same species with differential tolerance to a stress factor, for which traits conferring tolerance have been pyramided over centuries, first by farmers and subsequently by professional crop breeders. For example, there are rice lines tolerant to drought, extreme temperatures or salinity (https://www.irri.org/climate-change-ready-rice). Among crops, the sunflower is a species exhibiting broad adaptation (Debaeke et al., 2017), a characteristic for which responsible genes have been identified. Unfortunately, genetic tools such as characterized mutants are not available and the complex sunflower genome was revealed only very recently (Badouin et al., 2017). Due to the complexity of the molecular networks deployed when plants sense stressing conditions, transcription factors (TFs), as master switches, were good candidates to start the research.

TFs are proteins able to recognize and bind specific short DNA sequences present in their target regulatory regions, i.e. promoters, enhancers, introns, etc. These proteins are particularly abundant in the plant kingdom, representing about 6% of total proteins (Riechmann, 2002; Ribichich et al., 2014). Plant TFs have been classified in families, subfamilies, and subgroups especially according to their DNA binding domains. However, other features are also important for such classification including gene structure, the presence of other motifs and domains as well as their role in plant development. Many TF families are shared between animals and plants and among them, 19 are more expanded in the plant kingdom suggesting a frequent adaptive response, besides a higher genome duplication rate (Shiu et al., 2005).

Adapted from Passioura (2006); Reynolds et al. (2007), Tardieu (2012) and Sadras and Richards (2014).

Among TF families, there is the superfamily of homeodomain (HD) containing proteins. The HD was defined as a conserved 60-amino-acid sequence that folds into three alpha helices joined by a loop and a turn (reviewed in Viola and González, 2016). This domain was discovered (and named as HD) in mutants of Drosophila melanogaster exhibiting the ectopic expression or mutation of an HD-encoding gene which caused a homeotic effect, i.e. the change of a body segment by another. Examples of these TFs are Antennapedia and Bithorax (Gehring et al., 1994). In plants, HD TFs have not been assigned homeotic functions, but they play many roles in development, hormone signaling and the response to environmental factors (Viola and González, 2016).

The HD superfamily is divided into several families, among them the HD-Zip (homeodomain-leucine zipper) family, which is subdivided into four subfamilies, named I to IV, according to their different structures and roles. Even though the HD and the leucine zipper form part of TFs in other kingdoms, their association in a single protein is exclusive to plants, and this characteristic is shared by the four subfamilies (Ariel et al., 2007). Among these four subfamilies, members of the so-called HD-Zip I subfamily were initially associated with the abiotic stress response (Ariel et al., 2007). Subsequently, several works [reviewed by Perotti et al. (2017)] described the role of particular members in developmental events not necessarily associated with stress and also in biotic responses. There are 17 HD-Zip I members in the model plant Arabidopsis, a number that varies among species (Perotti et al., 2017). HD-Zip I TFs have been identified in all the species whose genomes have been sequenced; however, only a small portion has been functionally characterized (Perotti et al., 2017). Phylogenetic trees resolved these proteins from different species into six clades (Arce et al., 2011). Coming back to the sunflower, it is noteworthy that this species has several divergent HD-Zip I TFs that cannot be clustered in the trees constructed with proteins from model species or crops (Arce et al., 2011). Among these divergent TFs, HaHB4 has an abnormally short carboxy-terminus and a small overall size. Taking only its HD-Zip domain, the closest Arabidopsis members to HaHB4 are AtHB7 and AtHB12, which have been shown to participate as positive regulators in ABA-dependent drought and salinity responses (Olsson et al., 2004; Valdés et al., 2012; Ré et al., 2014). The overexpression of AtHB7 conferred drought tolerance in Arabidopsis and tomato plants (Olsson et al., 2004; Mishra et al., 2012; Ré et al., 2014).

At the beginning of the research reviewed here, when the Arabidopsis genome and the functions of AtHB7, AtHB12 and other members of the HD-Zip I family were still unknown, the strategy to reveal the HaHB4 function was to study its binding specificity in vitro and its expression pattern in sunflower (Palena et al., 1999; Gago et al., 2002). The next step was to obtain Arabidopsis plants overexpressing this TF (Dezar et al., 2005). Expression studies indicated that HaHB4 is induced by water deficit and ABA (Gago et al., 2002). Arabidopsis transgenic plants, transformed with the sunflower TF, showed a drought-tolerant phenotype (Dezar et al., 2005). A deeper investigation of the mechanism triggered by this gene to confer drought tolerance indicated that it did not involve stomatal closure (which leads to drought tolerance but is usually accompanied by a yield penalty in some environments) but a delay in senescence via the inhibition of ethylene receptors (Manavella et al., 2006). Other signal transduction pathways are also regulated by this TF, such as jasmonic acid enhancement, which leads to herbivory defense, and inhibition of photosynthesis-related genes during darkness (Manavella et al., 2008a; Manavella et al., 2008b). However, the most important discovery was that Arabidopsis plants became, in a certain sense, 'myopic' to water deficit; they continued to grow when the stress was moderate and thus the impact on productivity was reduced with respect to control plants that exhibited stomatal closure. In addition, when the plants were subjected to severe stress (not watered for 10–20 days), survival rates were much higher for GM plants expressing HaHB4 than for wild-type controls (Dezar et al., 2005). Such a drought-tolerant phenotype was observed for many transgenic Arabidopsis plants expressing a variety of plant genes. However, the described trend did not hold when the same plants were grown under moderate water deficit (20–40% reduction of rosette area; Skirycz et al., 2011), a condition for which no clear trend was detected in yield penalty between GM and non-GM genotypes. By contrast, HaHB4 transgenics usually outyielded the wild-type controls across a wide range of field-tested growing conditions (Figure 1).

These observations led us to transform other species, particularly crops such as soybean and wheat, with constructs able to express HaHB4 (González et al., 2019; Ribichich et al., 2020). Such GM crops expressing HaHB4 outyielded their wild-type counterparts in a network of field trials that included a broad range of growing conditions, particularly in water balance and air temperature during critical reproductive stages (González et al., 2019; Ribichich et al., 2020). Yield data were supported by positive trends in its main physiological determinants (total biomass and biomass partitioning) as well as in floret fertility and grain numbers.

### WHEAT AND SOYBEAN HaHB4: CROP PERFORMANCE AND YIELD IMPROVEMENT

Wheat and soybean expressing the cDNA corresponding to the sunflower HaHB4 gene were tested in 37 and 27 field

FIGURE 1 | Relative grain yield response of transgenic wheat and soybean lines across environments. For each species (triangles for soybean and squares for wheat), symbols represent the combination of (i) the difference in mean temperature of each site respect to the mean across environments (y axis), and (ii) the relative water balance (RWB) of each site (x axis), being RWB=(Rainfall+Irrigation-PET)/PET (PET: potential evapotranspiration). Variation in relative grain yield (RGY) was computed as RGY = (GYtg-GYwt)/GYwt (GYtg: grain yield transgenic; GYwt: grain yield wild-type) and expressed in percent (values next to symbols). Different colors represent cases with (i) RGY ≥ 5% (GYtg > GYwt), in bolded non-black colors that identified the corresponding environmental group, (ii) RGY ≤ −5% (GYwt > GYtg), in bolded black, and (iii) −5% < RGY < 5% (GYtg = GYwt), in plain black.

experiments, respectively (Supplementary Table 1; González et al., 2019; Ribichich et al., 2020) 1 . Genetic constructs used to transform crops shared HaHB4 cDNA but not the promoters which were the UBI for wheat and the own HaHB4 promoter for soybean. The parental wild-type cultivars (Cadenza for wheat and Williams 82 for soybean) together with the GM lines (IND-00412-7 for wheat and b10H for soybean) were sown in 13 or 14 sites during several years covering wide latitudinal (ca. 27°25'S to 39°50'S) and longitudinal (ca. 57°40'W to 65°28'W) ranges. Both crops experienced large differences in water balance as well as in mean and maximum temperatures (Supplementary Table 1). Considering all the range of tested environments (1,000–9,300 and 1,500–4,500 kg ha−<sup>1</sup> average yield for wheat and soybean, respectively), the presence of HaHB4 in the GM lines increased yield by 6% in wheat and by 4% in soybean, with no significant effect in crop phenology. Such a response is outstanding for commercial purposes because it allowed the incorporation of HaHB4 to modern cultivars without altering the crop cycle, which has been already optimized to the target breeding area. When only the dry environments (i.e. negative water balance) were considered, the mean yield benefit from HaHB4 increased to 16% in wheat and to 8.6% in soybean (Figure 1). Moreover, yield improvement was even larger (20% for wheat and 11% for soybean) when the dry environment was associated with warm mean temperatures (>20 and >22°C, respectively), while it remained important (12% and 5% for wheat and soybean, respectively) for the dry-cool condition (Figure 1). Differences in yield were always associated with grain number (GN) produced per unit area for both crops. This trend was partially compensated by a decrease in individual grain weight (GW) in soybean, whereas no clear trade-off effect was registered in this grain yield component for wheat. 
This contrasting response of GW to the increase in GN agrees with differences in the source-sink balance experienced by each species during grain filling (Borrás et al., 2004). Such balance recognizes species-specific (e.g., plasticity in the establishment of maximum seed volume; soybean > wheat) as well as environmental (e.g., irradiance availability during grain filling; wheat > soybean) constraints.
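The two indices defined in the Figure 1 caption, RWB and RGY, together with the ±5% grouping used there, can be sketched as follows. This is only a minimal illustration: the function names and the site values in the example are hypothetical and are not taken from the field trials.

```python
def relative_water_balance(rainfall_mm, irrigation_mm, pet_mm):
    """RWB = (Rainfall + Irrigation - PET) / PET, as in the Figure 1 caption."""
    if pet_mm <= 0:
        raise ValueError("PET must be positive")
    return (rainfall_mm + irrigation_mm - pet_mm) / pet_mm


def relative_grain_yield(gy_tg, gy_wt):
    """RGY in percent: 100 * (GYtg - GYwt) / GYwt."""
    if gy_wt <= 0:
        raise ValueError("wild-type grain yield must be positive")
    return 100.0 * (gy_tg - gy_wt) / gy_wt


def classify_rgy(rgy_percent, threshold=5.0):
    """Group a site using the +/-5% limits of the Figure 1 caption."""
    if rgy_percent >= threshold:
        return "GYtg > GYwt"
    if rgy_percent <= -threshold:
        return "GYwt > GYtg"
    return "GYtg = GYwt"


# Hypothetical dry site (illustrative numbers only, not trial data).
rwb = relative_water_balance(rainfall_mm=300, irrigation_mm=0, pet_mm=600)
rgy = relative_grain_yield(gy_tg=2320, gy_wt=2000)
print(f"RWB = {rwb:.2f}, RGY = {rgy:.1f}% ({classify_rgy(rgy)})")
```

A negative RWB marks a dry environment in the sense used above, since rainfall plus irrigation falls short of potential evapotranspiration.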

<sup>1</sup> This entire section is based on González et al. (2019) and Ribichich et al. (2020).

To improve our understanding of HaHB4 effects on the ecophysiological determination of GY, detailed measurements were performed in controlled field experiments. The simplest method to study yield determination is to evaluate the total biomass produced along the crop cycle and the proportion of that biomass allocated to grains, as proposed in Eq. 1 and summarized in Figure 2. In both crops, the expression of HaHB4 caused an increase in total biomass with no change in harvest index (HI). As HaHB4 had no impact on crop phenology, the observed increase in total biomass could be attributed to increased crop growth rate, particularly during those periods that are critical for the determination of the main driver of grain yield (i.e., GN). This period spans (i) from the start of stem elongation, at ca. 20 days before anthesis, to grain set at the beginning of grain filling in wheat (Fischer, 1975; Fischer, 1985; Kirby, 1988), and (ii) from pod formation to the beginning of grain filling in soybean (Board and Qiang, 1995; Jiang and Egli, 1995). The crop growth rate of the wheat GM line exceeded that of the wild-type parental line by 68% during the critical period. This trend was in line with the increase in fertile florets per plant observed in GM lines, suggesting improved floret survival (González et al., 2011). The fertile florets per spike and the number of spikes per plant together determine the fertile florets per plant; the former was similar (tiller spike) or higher (main stem spike), and the latter consistently higher, in the GM line compared to the wild type. For soybean, solar radiation interception and leaf photosynthesis were measured, both of which contribute to the determination of crop growth rate (Muchow et al., 1990). These two traits were higher for the GM line, the former during the entire critical period and the latter during grain filling. Improved light interception during the critical period resulted in more pods and branches per plant, whereas enhanced photosynthesis during grain filling was consistent with the clear visual observation of delayed senescence (Figure 3). The described stay-green improvement, which was also observed in the model plant Arabidopsis (Manavella et al., 2006), probably prevented a complete trade-off between increased grain numbers and final individual grain weight, which can be expected when seed growth takes place under the sharp decrease in irradiance that usually occurs in autumn (Borrás et al., 2004).

FIGURE 3 | GM soybean exhibits delayed senescence compared with its wild-type control. Upper panel: schematic representation of the soybean life cycle. Lower panel: illustrative picture of one of the field trials performed comparing the wild-type genotype (right) with the transgenic HaHB4 one (left).

When water availability is limiting growth and yield, water use (WU) and water use efficiency (WUE) are the main physiological determinants of crop performance (Eq. 1). In the case of wheat, the average WUE (estimated as the yield produced per unit of rainfall) of the 37 field experiments was 9.4% greater in the GM line than in the wild type, and the increase reached 14.2% for environments with less than 300 mm rainfall. For soybean, detailed measurements of crop evapotranspiration along the cycle in plots exposed to contrasting water regimes (WW: well-watered; WD: water deficit) showed that the GM line used more water under both conditions, with the difference being even larger under irrigation (17.3 and 27.2% increase in water used for WD and WW, respectively). The enhanced water use of the GM line could not be confirmed in a greenhouse experiment, where both cultivars had almost identical water use in pots held at contrasting water regimes (field capacity and 60% field capacity). Such a response is not surprising under the severe root confinement usually experienced in pot experiments, which do not allow for correct comparisons of this type of trait. Consistent with the mentioned differences in water use in the field, the hypocotyl diameter and xylem area were always larger in the GM line than in the wild-type soybean line, traits that have been associated with increased water conductivity and water use (Richards and Passioura, 1981). The fact that water use was reduced under water deficit, even in the GM line, suggests that some degree of stomatal closure may have occurred, reducing water loss but with little impact on CO<sub>2</sub> exchange (Liu et al., 2005). This response is in line with the increased WUE (≥22%) for biomass and yield production of the GM line when exposed to water deficit, and with the augmented photosynthetic rate observed in this germplasm during grain filling (discussed in the previous paragraph).
The results obtained in wheat and soybean crops are promising, showing that HaHB4 may help to mitigate yield reductions in drought-prone environments.
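The rainfall-based WUE estimate described above for wheat can be sketched in a few lines. The function names and the yield and rainfall figures are hypothetical, chosen only so that the example reproduces a 14.2% difference; they are not values from the trials.

```python
def water_use_efficiency(grain_yield_kg_ha, water_mm):
    """WUE as grain yield per unit of water (kg ha^-1 mm^-1).

    Here the water term is seasonal rainfall, mirroring the estimate used
    for wheat in the text; crop evapotranspiration could be used instead.
    """
    if water_mm <= 0:
        raise ValueError("water supply must be positive")
    return grain_yield_kg_ha / water_mm


def percent_change(gm_value, wt_value):
    """Relative difference of the GM line over the wild type, in percent."""
    return 100.0 * (gm_value - wt_value) / wt_value


# Hypothetical dry site with 250 mm of rainfall (illustrative numbers only).
rainfall_mm = 250
wue_wt = water_use_efficiency(2000, rainfall_mm)   # wild type
wue_gm = water_use_efficiency(2284, rainfall_mm)   # GM line
wue_gain = percent_change(wue_gm, wue_wt)
print(f"WUE wild type = {wue_wt:.2f}, GM = {wue_gm:.2f}, gain = {wue_gain:.1f}%")
```

Because both genotypes at a site share the same rainfall, a rainfall-based WUE gain is numerically equal to the relative yield gain at that site; the two only diverge when actual water use is measured per genotype, as was done for soybean.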

### THE LONG REGULATORY PROCESS

Developing new technology is a long journey, from the hypothesis to its verification in a model plant, to its subsequent transfer into agronomically relevant crops and, when successful, the selection of the best candidate fulfilling the expected features (Figure 4). However, when all these stages are completed, another hard challenge starts. Since a precautionary approach is often favored for new technologies, a detailed safety assessment is required before their consumption or introduction into the environment.

Until approvals are obtained, these products are "regulated", which means that they must be kept under strict control to guarantee no accidental release into the environment until their safety is proven. The regulatory stage is a mandatory process during which extensive information covering different aspects of the new product must be submitted to the respective authorities.

Focusing on a GM crop, the final objective of the regulatory phase is to establish that the new plant is like the conventional version except for the trait that was intentionally introduced by the genetic modification. The assumption under this comparative approach is that, if the conventional plant is considered safe, the genetically modified version would be equally safe. Behind this comparative approach, which includes the evaluation of different crop aspects (agronomic, reproductive, environmental, compositional, nutritional, etc.), there is a long and thorough process that involves measuring more than 80 parameters for which equivalence to the non-GM counterpart needs to be verified (Ayala et al., 2019; González et al., 2019).

Complementing the comparative approach, there is a set of distinctive (new) features that must be characterized. These include:

• Studying the new expression product. Any new molecule introduced into the crop by the genetic modification (usually a protein) must be studied in depth. In the case of a protein, its function, biochemical profile, putative allergenicity, toxicity, and digestibility, among other attributes, must be known.


Field trials including the GMO cultivars are conducted under different agronomic conditions. Simultaneously, the non-GM parental line and several commercial varieties are grown to control for any effect of the genetic modification and to provide a reference range of natural variability, respectively. A wide set of agronomic parameters that define the crop characteristics are measured during the life cycle, and samples of different tissues are taken for further analysis.

Particular attention is paid to any parameter that may reveal a tendency of the GMO to become a weed or be more invasive than its wild-type counterpart. That is why reproductive physiology, persistence in the environment, and sensitivity to stresses, diseases, and pests are analyzed.

All the described studies are completed by the developer, since they involve the use of material protected by intellectual property rights and require a great investment of money, time, and other resources. The results are presented to the regulatory authorities of different countries and evaluated by experts in different areas. A post-submission period of communication between regulators and the developer usually follows, during which additional information or even new studies may be requested. The conclusions from these scientific analyses are then considered by governmental authorities, which weigh these evidence-based conclusions together with many other local interests.

Regulatory field trials are usually conducted in the main areas where the crop is cultivated. However, countries like China, Brazil, Japan, Colombia, Bolivia, South Africa, and the USA require local trials even when the transportability of the data has been proven (Garcia-Alonso et al., 2014).

Destinations where a new GMO is submitted for approval include those where the crop will be cultivated and/or where its products will be shipped. The information submitted for approval is usually that related to the use of the new GMO in the country of destination (information required for cultivation is different from that associated with consumption or use of imported products), although some countries/regions are extremely cautious and do not follow this rule.

The regulatory road is usually tortuous. Although several countries share the essence of a science-based safety evaluation process, there are still dissimilar requirements among regulatory authorities. Besides, there is a lack of mutual recognition of safety assessments, which leads to redundant evaluations and asynchronous approvals that significantly impair the commercialization of commodities (Stein and Rodríguez-Cerezo, 2010). In particular, some countries or regions are politically decisive players in this field. Specifically, the European Union embodies the stringency of a regulatory process requiring data not necessarily related to science or intended use. In turn, Chinese regulatory authorities run their own set of experiments to verify the data already generated by the developer.

One distinctive feature of the HB4® technology (commercial name for HaHB4 introduced in different crops) is that the new expression product is a plant TF. Because of this singularity, the technology has faced additional scrutiny related to putative effects on non-target genes. Precedents on the extreme specificity of TFs (an absolute requirement for the development of any individual), together with the evidence from the different HaHB4-GM crops, do not support such speculation.

As expected for a TF, the levels of HaHB4 in the natural plant (sunflower) as well as in the GM crops expressing it are extremely low (in the range of nanograms per gram of dry weight; Alloatti et al., 2017). They are so low that the protein can hardly be detected in the plant and, consequently, in its byproducts. If any concern could be raised by the expression of a foreign protein in a crop, in the case of HaHB4 it would be even lower than for other proteins.

Independently of the uniqueness of HaHB4, we are familiar with this protein, since sunflower has been part of animal and human diets for centuries. Besides, HaHB4 is similar to proteins already present in animals and plants, including ones that also work as TFs.

Of the two HaHB4 crops close to reaching the market, HB4® wheat has faced some baseless negative reactions. This is not unexpected considering the anti-GMO movements. However, it is sadly remarkable that those in charge of evaluating the (environmental/commercial) benefits or the safety of new technology let this putative future public perception determine their decisions (Fernbach et al., 2019). This reaction is supported by the statement that this would be the first GM wheat. However, this is not true. There is a precedent for GM wheat, which completed the approval process in four different countries though it was never commercialized (GM Approval Database, 2019). Concerning this precedent, another false statement is often made: that this event was rejected by the regulatory authorities. The truth is that the developers shelved the glyphosate-tolerant wheat in 2004 amid market concern about rejection from foreign buyers (Ingwersen, 2019). Besides the two countries where a withdrawal took place (the United States and Australia), this former GM wheat is approved for food and feed use in two other countries (New Zealand and Colombia; GM Approval Database, 2019).

This single, interrupted attempt to introduce GM wheat into the market sustains the feeling that this crop has been kept aside from the improvements that genetic modification may provide (Wulff and Dhugga, 2018; Asseng et al., 2019). Despite the public's distaste for GM foods, especially in Europe, many genetically engineered products have already reached the market (GM Approval Database, 2019). In addition to public resistance to GM crops in general, wheat faces its own particular challenges, since twenty percent of humans' calories and about the same percentage of protein intake come from wheat. In Argentina, 80% of the cultivated wheat is exported to Brazil. If buyers will not accept GM wheat, farmers will not grow it. According to a news release survey we conducted, based on articles selected through specific keywords in both national and international media, stakeholders related to wheat exports speculate about potential market reactions if GM wheat is approved in Argentina, fearing a loss of customers if they cannot deliver wheat that matches non-GM specifications. To obtain accurate and credible information for identifying GM wheat, in the following section we perform a classification to distinguish between GM wheat and non-GM wheat, and between different "events" of GM wheat.

Another negative perception targeting GM wheat is that we are talking about human food and the direct consumption of a GMO. However, GM-derived products are already present in the food chain. For example, glyphosate-tolerant soybean MON-ØØ4Ø32-6 has been on the market for more than 20 years, and soybean flour is regularly used as a supplement in several food products. Similarly, insect-resistant corn MON-ØØ81Ø-6 has been widely cultivated and consumed since it was approved more than a couple of decades ago (GM Approval Database, 2019). Moreover, some GM crops are already consumed without prior processing (plum, cassava, apple). So, is there a real difference in approving this GMO or is it just a matter of perception?

Checking the history of GMOs, we find that even in cases where there are no commercial interests behind a GM technology and its modification addresses a health issue, as in the case of vitamin A-fortified golden rice (Dubock, 2014; Moghissi et al., 2018), lack of funding and anti-GMO movements can delay (or even prevent) a benefit from reaching those who could be assisted by it.

### TRANSGENIC CROPS CAN BE DIFFERENTIATED ON-FARM FROM THE NON-TRANSGENIC ONES BY SPECTRAL ANALYSES

Remote sensing techniques, such as spectrometry, are increasingly used for plant phenotyping. The spectrum of energy reflected by the plant is closely associated with absorption at certain wavelengths that are linked to specific characteristics or plant conditions. Spectrometers can acquire detailed information regarding the electromagnetic spectrum in a short time, making this technology ideal for assessing genotypes within a few hours. This would enable the estimation of multiple morpho-physiological and physicochemical traits, which would be otherwise impossible to evaluate due to the time and cost involved (Garriga et al., 2017).

For the estimation of plant traits, most previous studies have mainly resorted to the use of Vegetation Indices, while less attention was paid to the analysis of the full spectrum. The use of reflectance data and machine learning algorithms for plant phenotyping purposes using the full spectrum has been only recently addressed by the scientific community (Chlingaryan et al., 2018; El-Hendawy et al., 2019; Meacham-Hensold et al., 2019).

The capability of a spectrometer for the characterization of soybean, maize, and wheat in field experiments has already been explored by Rigalli et al. (2018). The objective of that work was to select different wavelength intervals of the spectral reflectance curve (within the 632–1,125 nm range) as features for on-farm classification using machine learning methods. Two different classifications were presented: species selection and growth stage identification. An accuracy of 92% was reached for species classification, while 99% was obtained for stage classification. Besides, a new index was proposed that outperformed the established vegetation indices under analysis, which showed the potential advantage of using this type of device. This indicated that a collection of field spectral data could be more representative for plant phenotyping than the information given by single vegetation indices.
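To make the contrast between a single vegetation index and wavelength-interval features concrete, the sketch below computes both from one reflectance curve. It is a simplified illustration under stated assumptions: the widely used NDVI, (NIR − Red)/(NIR + Red), stands in for "a vegetation index"; the band limits, the interval edges, and the step-shaped synthetic spectrum are hypothetical and are not the intervals or the index selected by Rigalli et al. (2018).

```python
def band_mean(wavelengths_nm, reflectance, lo_nm, hi_nm):
    """Mean reflectance over the half-open wavelength interval [lo_nm, hi_nm)."""
    values = [r for w, r in zip(wavelengths_nm, reflectance) if lo_nm <= w < hi_nm]
    if not values:
        raise ValueError("no samples inside the requested interval")
    return sum(values) / len(values)


def ndvi(wavelengths_nm, reflectance, red=(632, 690), nir=(760, 900)):
    """NDVI from band means; the red and NIR limits are assumed values."""
    red_mean = band_mean(wavelengths_nm, reflectance, *red)
    nir_mean = band_mean(wavelengths_nm, reflectance, *nir)
    return (nir_mean - red_mean) / (nir_mean + red_mean)


def interval_features(wavelengths_nm, reflectance, edges_nm):
    """One mean-reflectance feature per consecutive pair of interval edges."""
    return [
        band_mean(wavelengths_nm, reflectance, lo, hi)
        for lo, hi in zip(edges_nm[:-1], edges_nm[1:])
    ]


# Synthetic curve on a 1 nm grid over 632-1,125 nm: low reflectance in the
# red, high beyond a step-like red edge at 700 nm (illustrative only).
wavelengths = list(range(632, 1126))
reflectance = [0.05 if w < 700 else 0.45 for w in wavelengths]

index = ndvi(wavelengths, reflectance)
features = interval_features(wavelengths, reflectance, [632, 700, 900, 1126])
print(f"NDVI = {index:.2f}; interval features = {[round(f, 2) for f in features]}")
```

An index collapses the curve into one number, whereas the interval features keep several descriptors of its shape that a classifier can weigh separately.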

In this section, we present spectral analyses of field-grown wheat and soybean crops through field-collected full-spectrum data, in an attempt to differentiate GM genotypes from their wild type and commercial cultivars (Figures 5 and 6). Thirty spectral reflectance curves (ten per plot, with three replications) were collected for each genotype. Each dataset contained spectra from two genotypes, giving 60 spectra per dataset to feed the machine learning algorithm. Details of the methods (ANN: Artificial Neural Networks; SVM: Support Vector Machine; RF: Random Forest) can be found in the supplementary material (Supplementary File 1), whereas details of the experiments are summarized in Supplementary Table 1.
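The structure of one such dataset (30 spectra for each of two genotypes, 60 in total) and the reporting of accuracy as mean ± standard deviation can be sketched as follows. Several assumptions should be kept in mind: the spectra are synthetic, a nearest-centroid rule is used as a dependency-free stand-in for the ANN, SVM, and RF classifiers described in Supplementary File 1, and the five-fold split is an illustrative choice rather than the validation scheme actually used.

```python
import random
import statistics


def make_spectra(n, base, noise, n_bands, rng):
    """Synthetic reflectance curves: a flat baseline plus Gaussian noise."""
    return [[base + rng.gauss(0.0, noise) for _ in range(n_bands)] for _ in range(n)]


def fit_centroids(spectra, labels):
    """Mean spectrum per class (stand-in for training ANN/SVM/RF)."""
    centroids = {}
    for label in sorted(set(labels)):
        rows = [s for s, l in zip(spectra, labels) if l == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def predict(centroids, spectrum):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    def sq_dist(label):
        return sum((a - b) ** 2 for a, b in zip(centroids[label], spectrum))
    return min(centroids, key=sq_dist)


def kfold_accuracy(spectra, labels, k, rng):
    """Mean and standard deviation of accuracy over k held-out folds."""
    order = list(range(len(spectra)))
    rng.shuffle(order)
    folds = [order[i::k] for i in range(k)]
    scores = []
    for fold in folds:
        held_out = set(fold)
        train_idx = [i for i in order if i not in held_out]
        centroids = fit_centroids(
            [spectra[i] for i in train_idx], [labels[i] for i in train_idx]
        )
        hits = sum(predict(centroids, spectra[i]) == labels[i] for i in fold)
        scores.append(hits / len(fold))
    return statistics.mean(scores), statistics.stdev(scores)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
# Ten curves per plot x three plots = 30 spectra per genotype (synthetic values).
gm = make_spectra(30, base=0.42, noise=0.05, n_bands=20, rng=rng)
wt = make_spectra(30, base=0.40, noise=0.05, n_bands=20, rng=rng)
spectra = gm + wt
labels = ["GM"] * 30 + ["WT"] * 30

mean_acc, sd_acc = kfold_accuracy(spectra, labels, k=5, rng=rng)
print(f"accuracy = {100 * mean_acc:.0f} ± {100 * sd_acc:.0f}%")
```

With only 60 spectra per dataset, each held-out fold is small, so fold-to-fold accuracy can vary noticeably; this is one reason for reporting a standard deviation alongside the mean.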

### Transgenic Versus Wild Type Identification in Soybean and Wheat Genotypes

The best discrimination between GM and wild-type soybean cultivars was achieved with the ANN algorithm (Table 2), which reached values ≥70% in six out of 13 cases. Regarding the analyzed environments, the best performance was obtained for the water deficit condition (Environment II). Overall, the maximum standard deviation as a measure of uncertainty in the classification was 20%. In the case of wheat (Table 3), the collected data belonged to a single environment and included two GM lines and the wild type at different growth stages. In this case, the precision of the classification increased when phenological differences between genotypes were reduced (bolded data in Table 3). The highest result for the GM/wild-type comparison (IND-00412-7(HB4)/Cadenza) was obtained at 130 days after sowing, when both genotypes were at the boot stage. The outstanding precision value reached at this stage for wheat (96 ± 4% for ANN and 99 ± 8% for SVM) outperformed those registered for soybean (Table 2). This high precision allowed us to conclusively distinguish the presence of a GM wheat cultivar from its parental wild type through spectral analysis.

FIGURE 5 | Spectral reflectance curve datasets for soybean. Transgenic versus wild-type and transgenic versus commercial. Solid lines represent typical spectral reflectance curves. Shaded regions represent the data range for each genotype. Green regions represent the data range of the non-transgenic genotypes. Blue regions with dashed border lines represent the dataset range of the transgenic genotypes. Spectra were obtained as described in Supplementary File 1 whereas details of experiments are in Supplementary Table 1 and in Ribichich et al. (2020).

FIGURE 6 | Spectral reflectance curve datasets for wheat. Transgenic versus wild-type and transgenic versus commercial. Solid lines represent typical spectral reflectance curves. Shaded regions represent the data range for each genotype. Green regions represent the data range of the non-transgenic genotypes. Blue regions with dashed border lines represent the dataset range of the transgenic genotypes. Spectra were obtained as described in Supplementary File 1 whereas details of experiments are in Supplementary Table 1 and in González et al. (2019). Transgenic is cultivar IND-00412-7 (HB), wild-type is cultivar Cadenza (CD), and commercial cultivar is Baguette Premium 11 (BP11).

TABLE 2 | Classification accuracy of three machine learning algorithms for discrimination between soybean genotypes.


DAS, days after sowing; ANN, Artificial Neural Networks; SVM, Support Vector Machine; RF, Random Forest. Growth stages according to Fehr and Caviness (1977). Tested soybean genotypes were b10H (GM) and W82 (wild-type parental). Results expressed as b10H/W82. Bolded data indicate results equal or higher than 70%.

### Identification of Transgenic Versus Commercial Genotypes in Soybean and Wheat

Using the same approach, transgenic (b10H, OECD nomenclature IND-00412-5) vs. commercial (NS3228) soybean genotypes were evaluated across three environments (Table 4). According to these results, ANN again obtained the best performance, reaching values ≥70% in nine out of 13 cases. SVM also reached good results, showing values ≥70% in eight out of 13 cases. Classification accuracy across all environments was >70% in at least three developmental stages. Also, spectral reflectance data assessed at R5 (start of seed growth) showed classification accuracy values ≥79% for all analyzed environments. The performance improved under water deficit (Environment II) and high temperature (Environment III), reaching 82–86% and 84% (across methods), respectively. Standard deviations did not go beyond 16%.

TABLE 3 | Classification accuracy of three machine learning algorithms for discrimination between wheat genotypes.


<sup>a</sup>1N, first node visible; Ant, anthesis; BS, boot stage. Other abbreviations as in Table 2.

Wheat genotypes used in this analysis were (i) IND-00412-7 (GM) and wild-type Cadenza (CD), and (ii) IND-1015 (GM) and CD. The growth stage is indicated as the nearest stage plus or minus calendar days from this stage. Bolded data indicate small variation between growth stages of the compared genotypes.

\* IND-1015 comes from the isogenic transgenic HaHB4 event obtained in the Cadenza background (IND-00412-7), then introgressed into a pre-commercial advanced line.

TABLE 4 | Classification accuracy of three machine learning algorithms for discrimination between transgenic and commercial soybean genotypes.


<sup>a</sup>Abbreviations as in Table 2.

Soybean genotypes tested in this analysis were HB (GM) and NS3228 (commercial variety). Results expressed as HB/NS3228 (bolded data indicate ≥0.70).

As for wheat varieties, we confirmed that a reduced phenological difference increases the discrimination capability between transgenic and non-transgenic cultivars. The accuracy of ANN was always the highest (>80%), with standard deviations lower than 18%. Results for SVM were very similar to those obtained by ANN, with standard deviations of similar magnitude (Table 5).

### FUTURE PERSPECTIVES AND CONCLUDING REMARKS

An estimated one-quarter of greenhouse gas emissions are associated with anthropogenic activities linked, directly or indirectly, to agriculture. At the same time, increases in frequency and intensity of extreme weather have adversely affected food security and ecosystems, contributing to desertification and land degradation in many regions (IPCC, 2014). While climate change will likely impact a crop's yield and nutritional value, decreased agricultural outputs will fail to meet demands as population increases. Consequently, agriculture faces a major challenge: to enhance the resilience of global food systems and at the same time move towards carbon neutrality.

The unprecedented challenge of preserving our global environment today means we can no longer afford to increase agricultural production at the expense of environmental stability. This scenario leaves humanity with basically three avenues to reconcile agricultural productivity with environmental sustainability: reduce food waste, shift towards less meat-intensive diets in the developed world, and use existing resources more sustainably.

Although second-generation GM crops have yet to reach global agriculture, they may contribute significantly to help us use existing resources more sustainably. An HaHB4-derived event for soybean has already been approved for cultivation in major agricultural territories like the United States, Brazil, and Argentina, with regulatory processes well advanced in other important geographies, such as China (developers' public information). A similar wheat event is also being considered. This technology is expected to be planted on thousands of hectares during the 2019-2020 crop cycle, with the potential to exceed one million hectares within two cycles after that, subject to farmers' acceptance among other factors.

TABLE 5 | Classification accuracy of three machine learning algorithms for discrimination between transgenic and commercial wheat genotypes.


<sup>a</sup>Abbreviations as in Tables 2 and 3.

Tested wheat genotypes were HB412 (GM) and commercial varieties Baguette Premium 11 (BP11) and Klein Pegaso (KP). The growth stage is indicated as the nearest stage plus or minus calendar days from this stage. Bolded data indicate small variation between growth stages of the compared genotypes.

Drought-tolerant wheat and soy crops, such as those described in this study, may yield more per unit of water used by plants. This resilience may favor water-demanding double-cropping schemes that would otherwise be uneconomic. Sustainable intensification allowed by second-generation GM crops will result in improved carbon fixation, while less land is required to sustain current production outputs. These benefits may be of importance to a broader consumer audience, increasingly upset with our collective inability to preserve our terrestrial environment. Demonstrating these benefits at scale may generate an opportunity to reframe the perception of GM crops, which derives from farmer-centric first-generation GMOs. In doing so, pressure could be mounted to streamline and synchronize global regulatory systems, making the process more affordable to a broader group of scientists and technology developers.

### DATA AVAILABILITY STATEMENT

The datasets generated for this study are available on request to the corresponding authors.

### REFERENCES


### AUTHOR CONTRIBUTIONS

NR, MR and MP performed spectral analyses. PM and FT wrote about the regulatory process. FG, KR and MO wrote about ecophysiological aspects whereas RC described HaHB4 story. RC and MO conceived and designed the manuscript.

### FUNDING

This work was supported by Agencia Nacional de Promoción Científica y Tecnológica, PICT 2015 2671.

### ACKNOWLEDGMENTS

NR is a CONICET Ph.D. fellow; FG, KR, MR, MO and RC are CONICET Career members. MP is Professor at the National University of Rosario.

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2020.00178/full#supplementary-material


expressing the sunflower transcription factor HaHB4. J. Exp. Bot. doi: 10.1093/jxb/eraa064


Conflict of Interest: PM and FT belong to INDEAR.

The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2020 González, Rigalli, Miranda, Romagnoli, Ribichich, Trucco, Portapila, Otegui and Chan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Advancing Agricultural Production With Machine Learning Analytics: Yield Determinants for California's Almond Orchards

Yufang Jin<sup>1</sup> \*, Bin Chen<sup>1</sup> , Bruce D. Lampinen<sup>2</sup> and Patrick H. Brown<sup>2</sup>

<sup>1</sup> Department of Land, Air and Water Resources, University of California, Davis, Davis, CA, United States, <sup>2</sup> Department of Plant Sciences, University of California, Davis, Davis, CA, United States

Agricultural productivity is subject to various stressors, including abiotic and biotic threats, many of which are exacerbated by a changing climate, thereby affecting long-term sustainability. The productivity of tree crops, such as almond orchards, is particularly complex. Understanding and mitigating these threats requires the collection of large multi-layer datasets, and advanced analytics are critical to integrate these highly heterogeneous datasets and generate insights about the key constraints on yields at tree and field scales. Here we used a machine learning approach to investigate the determinants of almond yield variation in California's almond orchards, based on a unique 10-year dataset of field measurements of light interception and almond yield along with meteorological data. We found that overall the maximum almond yield was highly dependent on light interception, with each one percent increase in light interception resulting in an increase of 57.9 lbs/acre in the potential yield. Light interception was highest for mature sites with higher long-term mean spring incoming solar radiation (SRAD), and lowest for younger orchards when March maximum temperature was lower than 19°C. However, at any given level of light interception, actual yield often falls significantly below full yield potential, driven mostly by tree age, temperature profiles in June and winter, summer mean daily maximum vapor pressure deficit (VPDmax), and SRAD. Utilizing a full random forest model, 82% (±1%) of yield variation could be explained when using a sixfold cross validation, with an RMSE of 480 ± 9 lbs/acre. When excluding light interception from the predictors, overall orchard characteristics (such as age, location, and tree density) and inclusive meteorological variables could still explain 78% of yield variation.
The model analysis also showed that warmer winter conditions often limited mature orchards from reaching maximum yield potential and summer VPDmax beyond 40 hPa significantly limited the yield. Our findings through the machine learning approach improved our understanding of the complex interaction between climate, canopy light interception, and almond nut production, and demonstrated a relatively robust predictability of almond yield. This will ultimately benefit data-driven climate adaptation and orchard nutrient management approaches.

Edited by:

Edward Rybicki, University of Cape Town, South Africa

### Reviewed by:

Valerio Cristofori, Tuscia University, Italy Dirceu Mattos Jr., Instituto Agronômico de Campinas (IAC), Brazil

\*Correspondence: Yufang Jin yujin@ucdavis.edu

### Specialty section:

This article was submitted to Crop and Product Physiology, a section of the journal Frontiers in Plant Science

Received: 18 December 2019 Accepted: 26 February 2020 Published: 13 March 2020

### Citation:

Jin Y, Chen B, Lampinen BD and Brown PH (2020) Advancing Agricultural Production With Machine Learning Analytics: Yield Determinants for California's Almond Orchards. Front. Plant Sci. 11:290. doi: 10.3389/fpls.2020.00290

Keywords: Prunus dulcis, yield gap, artificial intelligence, big data, light interception, nutrient management

### INTRODUCTION


Global food and fiber demand has been projected to double by mid-century, driven mostly by increasing population and nutrition needs (Tilman et al., 2011; Davis et al., 2013). However, agricultural production has been shown to be vulnerable to multiple stresses, including warming, droughts and floods, extreme weather variability (Rosenzweig et al., 2001; Reynolds and Tuberosa, 2008; Funk and Brown, 2009; Lesk et al., 2016), and degrading soils and water (Elliott et al., 2014). Growers face the grand challenges of increasing food production while minimizing environmental disruption, and improving the resilience of agricultural systems under changing climates (National Academies of Sciences Engineering Medicine, 2019). Optimizing food systems requires a new approach that integrates existing datasets for new insights about yield determinants, and resolves the complex and interconnected physical and biological processes affecting yield across different scales. Recent technological advances in artificial intelligence provide promising tools to understand the constraints on potential yield and to interpret and predict the variation of yield across space and time by harnessing many unique yet under-utilized datasets.

California's almond acreage has expanded rapidly in recent decades, from 283,280 hectares in 2005 to 538,232 hectares in 2017 (USDA-NASS, 2018), due to the increasing demand for almonds in domestic and international markets. Almond has become the second leading agricultural commodity in California, with a total farm gate value of 5.6 billion US dollars in 2017 (California Department of Food and Agriculture, 2017). California produces about 80% of the world's almonds and 100% of U.S. commercial almond production. More than 95% of almond acreage is irrigated, and growers rely heavily on surface irrigation deliveries and on groundwater when surface water is limited, as occurred during the recent prolonged 2013–2017 drought in California (Faunt et al., 2016). Climate change, including warming and extreme weather, is another threat to almond production. Climatic conditions projected for the middle to end of the 21st century are expected to threaten the long-term viability of the state's almond production (Luedeling et al., 2009). To optimize yield and ensure the almond industry remains economically viable and environmentally sustainable (Carletto et al., 2015; Tombesi et al., 2017), it is essential to understand key yield determinants and develop appropriate agricultural adaptation and management strategies.

Groundwater quality in California has also been degraded due to nitrogen leaching from agricultural fields (Burow et al., 2013; Baram et al., 2016). Faced with this serious challenge, the state of California has implemented legislatively mandated nitrogen (N) management strategies for all almond growers statewide to meet the goal of minimizing nitrogen losses to the environment. To optimize N management and ensure regulatory compliance, almond growers must now apply N in accordance with the yield estimated for each orchard in early spring, taking into account N available from all sources (e.g., fertilizer, composts and manures, and irrigation water nitrogen). Accurate yield prediction is thus critically important to provide individual growers with the information required to manage inputs and resources, to schedule on-farm activities, and to manage harvest and marketing agreements.

Almond yield varies by year and by location; however, the environmental and biophysical factors that underlie these differences are not well understood and have never been systematically characterized. Almond production is known to be highly dependent on a number of factors (Tombesi et al., 2010, 2017; Zarate-Valdez et al., 2015) including (a) biophysical attributes such as tree age, leaf area, tree vigor, and bloom intensity, (b) environmental conditions such as chilling and heat requirements, soil nutrition, and bee foraging activity, and (c) cropping history. To date, a detailed comprehensive assessment of each of these factors and a yield prediction algorithm have not been successfully achieved, especially at a finer spatial scale.

Among the variables that have been shown to impact yield in almond, canopy interception of photosynthetically active radiation (PAR) is directly related to the maximum potential yield of almonds (Zarate-Valdez et al., 2015). Lampinen et al. (2012) reported that the maximum sustainable yield in the most productive commercial almond orchards is 56 kernel kg/ha per unit PAR intercepted by the canopy. Percent light interception at the orchard level is determined by canopy structure, e.g., total leaf area and health at the individual tree level, as well as row and tree spacing, while the location of the orchard (latitude) and cloud fraction affect the total amount of PAR incident on the canopy. Management activities such as cultivar selection, tree spacing, pruning practices, nutrition, and irrigation also have direct impacts on canopy interception and thus yield. As almond is a perennial crop, the multi-year photosynthetic accumulation and allocation to reproductive and vegetative organs from previous years (carry-over effect), as well as spur frequency, also affect its yield.

Climate, such as temperature and water availability, is known to have an important role in crop growth and flowering, thus influencing yield variation (Kerr et al., 2018; Pathak et al., 2018). A few prior studies have used relatively simple statistical analysis to understand how temperature and precipitation affected almond yield in California, but were largely limited by the spatial scale, e.g., from county to state levels (Lobell et al., 2007; Lobell and Field, 2011), and temporal coverage, resulting in relatively small sample sizes for analysis (e.g., from tens to hundreds). At the scale of an individual plant, growth models developed by DeJong (2019), knowledge of the role of flower number in yield potential (Tombesi et al., 2017), and modeled carbon budgets can all be integrated into a yield prediction model. However, these mechanistic approaches have not been systematically applied at any significant scale.

Moreover, nut production of almond trees is also highly dependent on bee pollination. Most almond cultivars are self-sterile, and two or more cultivars are usually interplanted (Connell, 2000). Bee foraging activity is thus a crucial determinant of the final yield. In addition to being dependent on environmental variables such as temperature, solar radiation, and wind, bee activity is highly reliant on the timing and intensity of flowering, which in turn is also highly affected by weather conditions. Understanding these complicated impacts of environmental factors on almond nut production is therefore rather challenging, especially at the individual field level, requiring a large spatial and temporal data set and more advanced analytical algorithms.

To address these issues and develop a yield prediction model and a descriptor of key yield determinants for almond, we have obtained a 10-year collection of plant and field level biological measurements, management practices, and yield records from 33 locations across the main growing regions of California. Using an advanced machine learning algorithm, we integrated these data with two meteorological datasets to investigate the environmental, biological, and management factors that determine yield variability of almond. Specifically, we aim to answer the following scientific questions: (i) What are the limiting factors that affect yield at a given level of light interception? (ii) Is it possible to predict light interception with orchard age and environmental variables? and (iii) What are the overall impacts of environmental variables on actual yield when controlling for both light interception and the yield gap at a given light interception? An improved understanding of these questions is expected to guide and optimize the lifecycle management of almond production. There is considerable commercial interest in the ability to predict yield and identify production constraints effectively and, as a consequence, the models and information developed in this paper will also be useful to optimize management and hence sustainability.

### MATERIALS AND METHODS

### Study Area

Our study focused on California's Central Valley, one of the most productive agricultural areas in the world. We have a 10-year collection of field measurements and yield records over a total of 33 individual almond orchards containing 7864 individual experimental plots (**Figure 1**). This region experiences a Mediterranean climate characterized by hot and dry summers and mild and wet winters. Typically, the rainless summer provides ample sunshine for almond growth and limits disease pressures. The cool and wet winter replenishes the soils and the reservoirs in bordering mountainous areas; these reservoirs and groundwater resources provide water for irrigation during the dry season.

### Field Measurements

We collected canopy light interception and yield data across 33 almond orchards that included a total of 7864 experimental plots, spanning the almond-producing areas of the Sacramento and San Joaquin valleys of California, from 2009 to 2018 (Lampinen et al., 2012). The consistent practice of sample collection, supported by the Almond Board of California, was designed to evaluate and understand almond production characteristics and drivers from the single-tree to the orchard scale, for the purpose of improving almond orchard management. For each plot, trees were randomly sampled over a full row length ranging from 50 to 150 individuals for canopy light interception measurement during the May to August growing season. A mobile platform (MLB hereafter), consisting of a series of 18 ceptometer segments mounted on a Kawasaki Mule utility vehicle, was used to measure PAR below the canopy on both sides of the almond trees (PAR<sub>below</sub>). Simultaneously, a fixed light sensor recorded the full-sun incoming PAR above the canopy (PAR<sub>above</sub>). All PAR measurements were conducted at solar noon (±1 h), and the light interception was calculated as the fractional PAR (fPAR) intercepted by the canopy:

$$\mathrm{LI} = f\mathrm{PAR} = 1 - \frac{\mathrm{PAR}_{\mathrm{below}}}{\mathrm{PAR}_{\mathrm{above}}} \tag{1}$$

For each individual experimental plot, average fPAR values of individual trees were calculated to represent the plot-level light interception.
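The calculation in Eq. 1 and the plot-level averaging can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' processing code; the function names and the example readings are hypothetical.

```python
from statistics import mean


def fpar(par_below: float, par_above: float) -> float:
    """Fractional PAR intercepted by the canopy, following Eq. 1."""
    if par_above <= 0:
        raise ValueError("PAR above the canopy must be positive")
    return 1.0 - par_below / par_above


def plot_light_interception(readings: list[tuple[float, float]]) -> float:
    """Average tree-level fPAR over one plot.

    `readings` holds one (PAR_below, PAR_above) pair per sampled tree.
    """
    return mean(fpar(below, above) for below, above in readings)


# Hypothetical readings for three trees, in arbitrary but consistent PAR units.
example_plot = [(400.0, 2000.0), (500.0, 2000.0), (600.0, 2000.0)]
print(round(plot_light_interception(example_plot), 3))
```

Averaging the tree-level ratios, rather than taking the ratio of averaged PAR values, mirrors the wording above ("average fPAR values of individual trees"); the two choices can differ slightly when PAR above the canopy varies between readings.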

Almond trees were harvested by shaking with a mechanical shaker and the nuts were collected after letting them dry on the ground for about 1 week. Fresh fruit weight was recorded for each individual experimental tree, and a 2 kg sample was used for dry fruit weight (hull plus shell plus kernel) and dry kernel yield (i.e., the yield value used in this study). For each experimental plot, we also recorded its specific orchard site, geographic location (latitude and longitude), planting year, cultivar composition, row and tree spacing (**Table 1**).

### Climate and Weather Data

We used monthly climate records from the Parameter-elevation Regressions on Independent Slopes Model (PRISM) dataset (Daly et al., 2008), including monthly mean values of daily precipitation, daily maximum/minimum/mean temperature, and daily maximum vapor pressure deficit (VPDmax) (**Table 1**). PRISM uses weather station observations, a digital elevation model (DEM), and other spatial datasets to extrapolate the observations from weather stations to ∼4-km gridded estimates of monthly climatic variables over the United States (Daly et al., 2008, 2015).

We used daily weather data at 1-km resolution from the Daymet Version 3 product to quantify incoming shortwave radiation flux density (SRAD) at the surface and the duration of the daylight period (Dayl) (Thornton et al., 2017). We further derived the total number of extreme hot days for each month (HotDays). For each calendar month, the threshold was set as the upper 10-percentile of daily maximum temperature (Tmax) from 2009 to 2018, based on the daily Daymet Tmax product. If the daily Tmax on a given day exceeded the threshold of the corresponding month, the day was identified as a relatively hot day. All the monthly variables (except for HotDays) from 2009 to 2018 were further aggregated to derive the 10-year mean climatology at both seasonal (i.e., spring, summer, fall, and winter) and annual scales. Climate from both the current year and preceding years was also explored in our analysis.
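The HotDays derivation described above can be illustrated with a short Python sketch. It assumes the threshold is the 90th percentile of all daily Tmax values pooled for one calendar month over the reference period; the percentile interpolation rule, the function names, and the sample temperatures are assumptions made for illustration.

```python
from statistics import quantiles


def hot_day_threshold(tmax_all_years: list[float]) -> float:
    """Upper 10-percentile (90th percentile) of daily Tmax for one calendar month.

    `tmax_all_years` pools every daily Tmax value for that month across the
    reference period. The interpolation rule is an assumption of this sketch.
    """
    deciles = quantiles(tmax_all_years, n=10, method="inclusive")
    return deciles[-1]  # last of the nine cut points is the 90th percentile


def count_hot_days(tmax_one_month: list[float], threshold: float) -> int:
    """Number of days in a single month-year whose Tmax exceeds the threshold."""
    return sum(1 for t in tmax_one_month if t > threshold)


# Hypothetical pooled June Tmax values (deg C) and one June to evaluate.
pooled_june = [28.0, 29.5, 30.0, 31.0, 32.5, 33.0, 34.0, 35.5, 36.0, 38.0, 40.0]
threshold = hot_day_threshold(pooled_june)
print(threshold, count_hot_days([33.0, 37.5, 39.0, 30.0], threshold))
```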

### Yield Potential

southern study sites from 2009 to 2018. Also shown are the distributions of (C) cultivars and (D) orchard age for all plots at the sampling years.

Higher light interception usually leads to higher yields, but the yield also varies significantly with other environmental stressors (Lobell et al., 2007; Tombesi et al., 2010; Zhang et al., 2019). To understand the maximum yield potential that almond could reach at a given light interception, we grouped all plot-year samples by the associated light interception with an interval of 5%, and selected the upper 10-percentile samples within each light interception bin as a proxy for the yield potential. The light interception and its corresponding yield were then averaged over the subsamples for each group to model the upper bound of the yield at a given light interception percentage. A linear regression model was built with the intercept set to zero. We conducted this analysis for all plots (n = 7864) and for a subset of plots (n = 5581) containing the most dominant cultivar, Non-pareil.
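The binning and through-origin fit described above can be sketched as follows. This is a minimal Python illustration, not the authors' code: taking the "upper 10-percentile" as the top 10% of samples by count in each bin (rounded down, with at least one sample) is an assumption, and the function name and sample values are hypothetical.

```python
import math
from collections import defaultdict
from statistics import mean


def fit_yield_potential(samples, bin_width=5.0, top_fraction=0.10):
    """Slope of the through-origin line fitted to the top-yielding samples.

    `samples` is a list of (light_interception_percent, yield) pairs.
    Returns the yield potential per percent of light interception.
    """
    bins = defaultdict(list)
    for li, y in samples:
        bins[math.floor(li / bin_width)].append((li, y))

    bin_means = []
    for members in bins.values():
        # Keep the highest-yielding fraction of each bin (at least one sample).
        members.sort(key=lambda pair: pair[1], reverse=True)
        keep = max(1, int(top_fraction * len(members)))
        top = members[:keep]
        bin_means.append((mean(li for li, _ in top), mean(y for _, y in top)))

    # Least-squares slope with the intercept fixed at zero.
    sxy = sum(x * y for x, y in bin_means)
    sxx = sum(x * x for x, _ in bin_means)
    return sxy / sxx


# Hypothetical samples: (LI %, yield in lbs/acre).
samples = [(10, 500), (12, 100), (30, 1500), (32, 200), (50, 2500), (52, 300)]
print(fit_yield_potential(samples))
```

With the intercept fixed at zero, the least-squares slope reduces to the sum of x·y divided by the sum of x², so no regression library is needed.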

### Environmental Stressors for Yield Gap

To further understand the factors that constrained the almond trees from reaching the maximum yield under a given level of light interception, we normalized the original yield by the modeled yield potentials, as follows:

$$y_n = \frac{y_o}{y_p} \tag{2}$$

where y<sub>o</sub> is the original yield, y<sub>p</sub> is the modeled yield potential, and y<sub>n</sub> is the final normalized yield, typically ranging from 0 to 1 (with very few samples beyond 1). Samples with y<sub>n</sub> less than 1 indicate productivity below the yield potential. The deviation of y<sub>n</sub> from 1 can therefore be used as a proxy for the yield gap.
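As a worked illustration of Eq. 2, consider a hypothetical plot (not taken from the dataset) with 70% light interception and a measured yield of 2800 lbs/acre. Assuming the through-origin slope of 57.9 lbs/acre per percent of light interception reported in the Results for all cultivars, the yield potential, normalized yield, and yield gap would be approximately:

$$y_p = 57.9 \times 70 = 4053\ \text{lbs/acre}, \qquad y_n = \frac{2800}{4053} \approx 0.69, \qquad 1 - y_n \approx 0.31$$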

We used the random forest machine learning approach to model and analyze the complex relationship between the normalized yield and a suite of meteorological variables, in order to understand which environmental stressors limit the yield at a given light interception, and how. Random forest is an ensemble learning technique that improves on the classification and regression tree method by combining a large set of decision trees (Liaw and Wiener, 2002; Belgiu and Drăguţ, 2016; Jeong et al., 2016). In random forest regression, each tree is built using a deterministic algorithm by selecting a random set of variables and a random sample from the training dataset. Specifically, the "randomForest" package within the R software environment was used in this study<sup>1</sup>.

### TABLE 1 | Summary of input variables in this study.

Meteorological variables (averages over 12 individual months and 4 seasons from daily values)

The Pearson's correlation and its significance between each individual variable and the production are shown here. For time-varying meteorological variables, the range of the statistics for each variable among monthly and seasonal parameters is included. <sup>1</sup>Monthly and seasonal variables were mean values of daily maximum, minimum, and mean averaged over each month or season.

Conceptually, monthly and seasonal meteorological variables, during both the current year and the preceding year, may pose stresses at different stages of plant growth, including flowering, leaf out, and fruit setting (Tombesi et al., 2010, 2017). Although a large set of explanatory variables is not an obstacle to the functioning of the random forest model, highly correlated meteorological variables may hinder the interpretation of the modeling results (Liaw and Wiener, 2002). We first used Pearson's correlation coefficient (r) to investigate how each individual independent variable was correlated with the yield gap, and how the individual weather variables were correlated with each other among different time periods, thereby providing the basis for selecting a subset of more significant meteorological variables for building the model. In this study, we selected representative variables that are highly correlated with yield gap (i.e., r > 0.15) and less cross-correlated with other variables within the same category (i.e., r < 0.50) (**Supplementary Figures S1, S2** and **Supplementary Table S1**).
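One way to implement this kind of correlation screening is sketched below in Python. Several details are assumptions rather than the procedure used in the study: correlations are compared by absolute value, candidates are visited greedily from the strongest to the weakest correlation with the target, and the variable names and values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def select_variables(candidates, target, min_target_r=0.15, max_cross_r=0.50):
    """Greedy screening of candidate predictors.

    `candidates` maps a variable name to its list of values. Variables are
    visited from strongest to weakest |r| with the target and kept only if
    their |r| with every already-kept variable stays below `max_cross_r`.
    """
    strength = {name: abs(pearson_r(vals, target)) for name, vals in candidates.items()}
    kept = []
    for name in sorted(strength, key=strength.get, reverse=True):
        if strength[name] <= min_target_r:
            continue
        if all(abs(pearson_r(candidates[name], candidates[k])) < max_cross_r for k in kept):
            kept.append(name)
    return kept


# Hypothetical candidates: july_tmax nearly duplicates june_tmax.
candidates = {
    "june_tmax": [1, 2, 3, 4, 5, 6],
    "july_tmax": [1, 2, 3, 4, 5, 7],
    "april_srad": [1, -1, 1, -1, 1, -1],
}
yield_gap = [3, 0, 5, 2, 7, 4]
print(select_variables(candidates, yield_gap))
```

In this toy example the near-duplicate variable is dropped because it is strongly cross-correlated with a variable that was already kept.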

With random forest modeling, we ranked the variable importance based on how much the modeling accuracy decreased, i.e., the increase in mean-square-error (IncMSE) of predictions, when a particular variable was excluded from the whole suite of input variables for model building (Grömping, 2009). The IncMSE of predictions, estimated with an out-of-bag cross validation and expressed as a percentage relative to the full model, is a robust and informative metric: higher values indicate that the corresponding variable is more important for yield prediction.
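The idea behind an increase-in-MSE importance score can be illustrated with a permutation-based sketch in Python. This is a simplification and an assumption, not the study's procedure: instead of the per-tree out-of-bag samples used inside a random forest, one feature column is shuffled on a generic evaluation set and the resulting change in MSE is reported as a percentage of the baseline. The model, data, and function names are hypothetical.

```python
import random


def mse(predict, rows, targets):
    """Mean squared error of `predict` over the given rows."""
    return sum((predict(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def percent_inc_mse(predict, rows, targets, feature, seed=0):
    """Percent increase in MSE after shuffling one feature column.

    `predict` is any fitted model exposed as a function of one row (a list of
    feature values). The baseline MSE must be non-zero.
    """
    baseline = mse(predict, rows, targets)
    shuffled_column = [r[feature] for r in rows]
    random.Random(seed).shuffle(shuffled_column)
    permuted_rows = [
        r[:feature] + [v] + r[feature + 1:] for r, v in zip(rows, shuffled_column)
    ]
    return 100.0 * (mse(predict, permuted_rows, targets) - baseline) / baseline


# Hypothetical toy data: the target depends on feature 0 only.
rows = [[float(i), float(i % 3)] for i in range(20)]
targets = [2.0 * r[0] + 0.5 for r in rows]


def toy_model(row):
    """Stand-in for a fitted model; it ignores feature 1 entirely."""
    return 2.0 * row[0]


for j in range(2):
    print(j, percent_inc_mse(toy_model, rows, targets, feature=j))
```

A feature the model never uses leaves the error unchanged when shuffled, so its score is zero, while shuffling an informative feature raises the error.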

We further used partial dependence plots to understand how each of these variables affected the yield (Welling et al., 2016). Intuitively, partial dependence plots show the dependence between the target response and a set of explanatory features, marginalizing over the values of all other features. We can interpret the partial dependence as the expected target response as a function of the explanatory features.
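The marginalization described above can be written compactly: for each value of the feature of interest, that value is imposed on every sample and the model predictions are averaged. The Python sketch below is illustrative only; `predict` stands in for any fitted model, and the toy model and data are hypothetical.

```python
def partial_dependence(predict, rows, feature, grid):
    """Average prediction with `feature` forced to each grid value in turn."""
    curve = []
    for value in grid:
        forced = [r[:feature] + [value] + r[feature + 1:] for r in rows]
        curve.append(sum(predict(r) for r in forced) / len(forced))
    return curve


def toy_model(row):
    """Hypothetical fitted model: linear in both features."""
    return 3.0 * row[0] + row[1]


rows = [[0.0, 1.0], [0.0, 3.0]]
print(partial_dependence(toy_model, rows, feature=0, grid=[0.0, 1.0, 2.0]))
```

For this additive toy model the curve is a straight line whose slope equals the coefficient of the chosen feature; for a random forest the same procedure can reveal thresholds and plateaus.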

To further examine what conditions or combinations of conditions are associated with relatively higher or much lower normalized yield, we used the regression tree model (i.e., "rpart" package within R environment<sup>2</sup> ) to identify decision rules between explanatory variables and the target response that can best differentiate yield gaps, i.e., representative splitting nodes. We chose the decision tree with the highest predictive accuracy as the most representative tree in this study.
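A regression tree is grown by repeatedly choosing the split that most reduces the within-node sum of squared errors. A single splitting step on one predictor can be sketched in Python as below; this illustrates the general idea rather than the rpart implementation, and the function names and sample values are hypothetical.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)


def best_split(x, y):
    """Threshold on one predictor that minimizes the children's total SSE.

    Returns (threshold, total_sse); samples with x < threshold go left.
    Candidate thresholds are midpoints between consecutive distinct x values.
    """
    pairs = sorted(zip(x, y))
    best = (None, sse(list(y)))
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue
        threshold = (pairs[i][0] + pairs[i - 1][0]) / 2
        left = [v for _, v in pairs[:i]]
        right = [v for _, v in pairs[i:]]
        total = sse(left) + sse(right)
        if total < best[1]:
            best = (threshold, total)
    return best


# Hypothetical winter Tmean (deg C) against normalized yield.
winter_tmean = [8.5, 9.0, 9.8, 10.6, 11.0, 11.4]
normalized_yield = [0.90, 0.85, 0.95, 0.30, 0.25, 0.35]
print(best_split(winter_tmean, normalized_yield))
```

Applying the same search recursively to each child node, over all predictors, yields decision rules of the kind reported for the representative tree.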

### Determinants of Light Interception

As a dominant influential variable, light interception (or percentage of absorbed PAR) reflected the combined effects of canopy density, structure, and health status, which were in turn associated with tree age, row and tree spacing at the plot level, and meteorological conditions that affected tree physiology and development (Zarate-Valdez et al., 2015). To understand the dominant factors that affected light interception, we also analyzed the relationship between the light interception percentage and a suite of layers (including orchard characteristics, and current and preceding meteorological variables), using a random forest model. Non-pareil was used as an example for this analysis (n = 5581), to exclude the potential confounding effects of different cultivars.

<sup>1</sup>https://cran.r-project.org/web/packages/randomForest/

<sup>2</sup>https://cran.r-project.org/web/packages/rpart

FIGURE 2 | Almond yield vs. light interception percentages (LI) for (A) all samples (n = 7864) and (C) Non-pareil samples (n = 5581) in the experimental plots. Color represents the density of the samples. The mean values (red circles) of the upper 10-percentile samples (black crosses) within each 5% of LI interval were used for modeling the potential yield, with regression lines shown in red dashed lines in (A,C). Also shown are the corresponding normalized yields (actual yield divided by the corresponding modeled yield potential) for (B) all cultivars and for (D) Non-pareil.

### Drivers for Overall Almond Yield

Besides affecting the light interception via tree growth and health, environmental variables may also affect flower phenology, bee activities, pollination, fruit set, and production. To further examine the complex relationships between yield and biological and environmental controls, we built overall random forest models to predict almond yield at the plot level, each driven by one of four sets of independent variables. Specifically, these included (A) biological variables including measured light interception percentage and cultivar composition (**Table 1**), (B) biological variables and full meteorological variables (**Supplementary Figure S4**), (C) biological variables and selected meteorological variables, and (D) biological variables excluding light interception, together with the full meteorological variables. Model performance was evaluated and compared with a sixfold cross validation. The root mean square error (RMSE) and R-squared (R<sup>2</sup>) were used to quantify the models' accuracy. We also calculated the ratio of performance to interquartile distance (RPIQ), which accounts for both the prediction error and the variation of observed values, and is therefore more objective than the RMSE and easier to compare among models (Bellon-Maurel et al., 2010). A greater RPIQ represents a stronger predictive capacity of the model (Bellon-Maurel et al., 2010).
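The three accuracy metrics can be computed as in the Python sketch below. Two choices are assumptions of this sketch: R<sup>2</sup> is taken as the coefficient of determination (1 − SS<sub>res</sub>/SS<sub>tot</sub>), and the quartiles of the observed values use linear interpolation. The function names and example values are hypothetical.

```python
import math
from statistics import quantiles


def rmse(observed, predicted):
    """Root mean square error."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rpiq(observed, predicted):
    """Interquartile range of the observations divided by the RMSE."""
    q1, _, q3 = quantiles(observed, n=4, method="inclusive")
    return (q3 - q1) / rmse(observed, predicted)


# Hypothetical observed and predicted yields (arbitrary units).
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [2.0, 3.0, 4.0, 5.0, 6.0]
print(rmse(observed, predicted), r_squared(observed, predicted), rpiq(observed, predicted))
```

Because RPIQ scales the error by the spread of the observations, it can be compared across datasets with different yield ranges, whereas RMSE cannot.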

### RESULTS

### Controls on Almond Yield Potential

Overall, almond yield depended strongly on light interception (**Figure 2A**), as shown by the Pearson's correlation coefficient of 0.60 (p < 0.001) between the recorded yield and measured light interception percentage across all sample plots. Yield increased from 467.4 ± 432.6 lbs/acre to 2907.6 ± 1084.2 lbs/acre when LI increased from below 30% to above 70%. Across each 5% interval of light interception, we found a very strong linear relationship between the maximum yield, as represented by the upper 10-percentile samples, and the light interception (**Figures 2A,C**). The yield potential predicted by the linear regression model agreed well with the observations, with an R<sup>2</sup> of 0.95, when all cultivars were considered. In particular, we found that a one percent increase in light interception led to an increase of 57.9 lbs/acre in the potential yield, as shown by the slope of the regression model (**Figure 2A**). Similar results were found when the analysis was restricted to the cultivar Nonpareil (R<sup>2</sup> = 0.94, slope = 57.7 lbs/acre per LI unit), further supporting that the yield potential was dominated by the light interception (**Figure 2C**).

### Determinants on Almond Yield Gap

Actual almond nut production was found to vary significantly at a given level of light interception (**Figure 2A**), even for the same cultivar (**Figure 2C**). For example, Non-pareil trees had yields ranging from 2278 lbs/acre (lower quantile) to 3267 lbs/acre (upper quantile), and averaged 2790 ± 781 lbs/acre, when LI was between 70% and 75% (**Figure 2C**). Across all plots, the majority of almond samples did not reach the yield potential (i.e., red dashed line) for any given light interception percentage (**Figures 2B,D**).

The random forest analysis, as described in section "Yield Potential," showed that the variation of the yield gap, i.e., 1 minus the actual yield normalized by the potential yield at the corresponding light interception, was mostly driven by tree age, mean June daily Tmax, winter Tmean, SRAD, and mean summer daily VPDmax, among orchard characteristics and climate variables (**Figure 3**). Mature orchards (>5 years old) tended to have a lower yield gap than younger orchards for the same amount of light interception and climate (**Figure 4A**). The partial dependence plots also showed that almond yield dropped significantly below the yield potential when the average winter temperature was higher than 10°C and April SRAD was lower than 450 W m<sup>−2</sup> (**Figures 4C,D**). Daily Tmax averaged in June, daily SRAD averaged over the previous September, and daily VPDmax averaged in summer had a more gradual impact rather than a significant thresholding effect (**Figures 4B,E,F**).

A representative decision tree further supported that samples close to potential yield (i.e., normalized yield > 0.90) were associated with mature orchards (i.e., age > 5), winter Tmean < 10.22°C, and April SRAD > 478.8 W m<sup>−2</sup> (n = 223) (**Figure 5**). The nodes with the largest yield gap, e.g., with a normalized yield of 0.27 (n = 596), were found among mature orchards when winter Tmean was greater than 10.22°C and mean June daily Tmax was lower than 34°C; another group of plots with large yield gaps (0.38, n = 520) was associated with young orchards, winter Tmean lower than 10.22°C, and June Tmax lower than 32.19°C.

### Determinants for Light Interception

The random forest model explained 82% of variation in light interception (**Supplementary Figure S3**), for Non-pareil (n = 5581), when using field-based orchard characteristics and the full set of meteorological variables as input. Age was the most important variable in determining light interception as expected, according to variable importance (**Figure 6**), and as shown by the high correlation (r = 0.63, p < 0.001) across all samples. The partial dependence plots further showed that light interception increased significantly with tree age until 7 years old and then plateaued (**Figure 7**). Mean fall daily Tmin in the previous year, long-term mean annual SRAD, February Tmin, summer Tmax, and long-term summer Dayl also affected current year light interception. Fall Tmin lower than 10.5°C (**Figure 7B**) and long-term annual mean SRAD lower than 380 W m<sup>−2</sup> (**Figure 7B** and **Supplementary Figure S5A**) reduced light interception. We also found that other long-term mean climatic variables such as summer Dayl, mean winter daily Tmax, and mean summer daily Tmin had an important role, probably because they affected the general tree growth.

A representative decision tree further revealed that light interception in trees < 7 years old was influenced by a different set of determinant variables than in trees older than 7 years. In trees younger than 7 years, the lowest light interception nodes were associated with mean March daily Tmax < 19.1°C (**Figure 8**). In orchards > 7 years old, the highest light interception nodes were associated with long-term annual mean SRAD > 378.8 W m<sup>−2</sup> (**Figure 8**), and the majority of them were distributed in the middle to southern Central Valley (**Supplementary Figure S5B**). For young orchards, the node with the highest LI (59%) consisted of samples distributed from the northern to the middle and southern Central Valley over various years (**Supplementary Figure S6A**). For mature orchards, the node with the lowest LI (53%) consisted of 2013 samples clustered in the middle Central Valley (**Supplementary Figure S6B**).

### Overall Yield Prediction and Determinants

The prediction results showed that all models were able to explain more than 78% of yield variation (**Figure 9**), much higher than the linear yield prediction based only on field-measured light interception (R<sup>2</sup> = 0.36) and the RF-based prediction using field-measured light interception and orchard age (R<sup>2</sup> = 0.60). For example, when adding other orchard characteristics such as age and location (i.e., latitude and longitude), model (A) had an R<sup>2</sup> of 0.79 ± 0.01, an RMSE of 530.64 ± 11.77 lbs/acre, and an RPIQ of 3.12 ± 0.09, based on the random forest modeling with a sixfold cross validation. By further adding the whole suite of meteorological variables, the full model achieved more robust and higher accuracy, as shown by a higher R<sup>2</sup> (0.82 ± 0.01), a lower RMSE (480 ± 9 lbs/acre), and a higher RPIQ (3.45 ± 0.17). After removing highly correlated meteorological variables, the reduced model with selected meteorological variables (**Supplementary Figure S7** and **Supplementary Table S2**) had an accuracy similar to that of the full model (**Table 2**).

long-term annual mean shortwave radiation flux density (SRAD) for Non-pareil, ordered by variable importance.

When light interception was excluded, the overall orchard characteristics (such as location, age, and tree density) and environmental variables (Model D) could explain 78% of yield variation across samples, similar to model (A), which uses all orchard characteristics plus tree-level light interception.

Based on the model with field biological and selected meteorological variables, we found that cultivar, light interception, and age were most important in determining overall almond yield (**Figure 10**). The key meteorological variables that ranked relatively important were mean summer daily VPDmax, mean winter daily Tmin, April SRAD, and summer Tmean (**Figure 10**).

The partial dependence analysis further showed yield differences across almond cultivars (**Figure 11A**). Among the most popular almond cultivars, Aldrich (Cultivar ID: 2), Monterey (Cultivar ID: 17), and Non-pareil (Cultivar ID: 18) had higher yields than Butte (Cultivar ID: 5) and Carmel (Cultivar ID: 7), with everything else being equal. Yield increased linearly with light interception, but dropped rapidly when the light interception was higher than approximately 82% (**Figure 11B**). Tree age was found to play an important role mostly during the young stage (i.e., 1–6 years) of almond growth (**Figure 11C**); the impact of tree age was quite stable after reaching maturity, but yield decreased for plots over 20 years of age. The contribution from April SRAD to the yield remained stable from 250 to 470 W m<sup>−2</sup>, but increased rapidly beyond that threshold. In contrast, mean summer daily VPDmax limited the yield when it was higher than 40 hPa (**Figure 11D**). The additional meteorological variables such as mean winter daily Tmin had a slightly negative impact on the yield (**Figures 11E,F**).

Among mature orchards only (i.e., tree ages from 7 to 18) (n = 4337), variable importance and partial dependence plots showed that light interception was the dominant control on the yield (**Supplementary Figures S8, S9**), and that the yield varied considerably across different cultivars. Plots with Monterey (2601 ± 458 lbs/acre) (n = 191) and Nonpareil (2401 ± 1152 lbs/acre) (n = 3293) were more productive than others (2052 ± 1110 lbs/acre, **Supplementary Figure S10**). The identified impacts from other meteorological variables were similar to those derived from the scenario using all almond samples (**Supplementary Figure S9**).

TABLE 2 | Yield prediction performance of random forest models driven by three sets of independent variables, based on the sixfold cross validation.

using full model but excluding light interception. Red dashed line denotes the 1-to-1 line, and the gray solid line denotes the linear regression trend.


### CONCLUSION AND DISCUSSION


Tree crops have rather complex processes in terms of nut production, involving the physiology of tree growth, flowering phenology, bee activity, etc. (Connell, 2000; Zarate-Valdez et al., 2015; Tombesi et al., 2017; Chen et al., 2019a,b). A large dataset across environmental gradients, coupled with more advanced data analytics such as artificial intelligence (AI) including machine learning (ML) algorithms, is needed to understand the constraints on the yield gap at the plot and field scales (National Academies of Sciences Engineering Medicine, 2019). This study made use of a unique dataset of field measurements of light interception and almond yield records in California's almond orchards. We used random forest, a widely used ML approach, for interpreting and predicting the variations in almond nut production. Our modeling experiments showed that the full random forest model explained about 82% (±1%) of yield variation using a sixfold cross validation, with an RMSE of 480 ± 9 lbs/acre. The RF-based prediction using only field-measured light interception and orchard age had an R<sup>2</sup> of 0.60; when light interception was excluded, the overall orchard characteristics (such as location, age, and tree density) and environmental variables could still explain 78% of yield variation across samples. Cultivar, light interception, and age were most important in determining overall almond yield. Various climate variables were also found to play important roles in yield variation.

Seasonal weather conditions during both the current year and the previous year were found to affect plant physiology and thus nut production from year to year at the field scale. Long-term climate, on the other hand, determines the spatial variation in almond yield at the regional scale. Our results showed that, at a given level of light interception, the departure of the actual almond nut production from the potential yield varied significantly, driven mostly by temperature in June and winter, mean summer daily VPDmax, and incoming solar radiation (SRAD), in addition to tree age. Warmer winters, for example, prevented mature orchards from reaching the maximum yield. On the other hand, light interception was higher for mature sites with higher long-term mean SRAD, and lowest for younger orchards when March maximum temperature was lower than 19°C. For the overall almond yield, we also found that summer VPDmax limited the yield when it was beyond 40 hPa, and that warmer daily Tmin also reduced the yield.

Further studies are needed to examine the effects of extreme weather stressors such as heatwaves on plant growth. We did find that the number of extreme hot days had a negative impact on nut production; for example, extreme hot days in June of either the preceding or the concurrent year had a considerable negative impact on yield (r = −0.31 and r = −0.21, **Supplementary Figures S1, S7**). However, when all other variables were considered together, extreme hot days did not rank among the top six environmental controls, probably because these heat threats could be partially reflected by other monthly climatic variables such as VPD and temperature.

Our results showed that light interception was the predominant control on almond yield. Overall, almond yield was highly dependent on light interception; for example, a one percent increase in light interception led to an increase of 57.9 lbs/acre in the potential yield. The mobile platform (MLB) has been used to measure light interception at the tree and plot level. Recent advances in UAV technology make it possible to measure the energy reflected by plants at the meter or sub-meter scale (Johansen et al., 2018; Tewes and Schellberg, 2018), and to estimate plant biomass (Bendig et al., 2015; Liu et al., 2019), therefore providing another cost-effective way to map light interception at the field scale. Moreover, satellite observations with higher spatial and temporal resolutions have become increasingly available in recent years, such as PlanetScope imagery at 3 m. Optical observations in the RGB and NIR bands have long been used to monitor plant growth and photosynthesis (Zhang et al., 2003; Chen et al., 2019a). An important next step is to calibrate the relationship between field-measured light interception and optical remote sensing observations from UAVs and drones, and then map light interception at a larger scale.
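The reported sensitivity can be turned into a quick back-of-the-envelope calculation. The sketch below assumes a strictly linear response with the slope quoted above (57.9 lbs/acre per one percent of light interception); the function names and the example orchard values are hypothetical, and no intercept is assumed because only the slope is given here.

```python
LBS_PER_ACRE_PER_PERCENT = 57.9  # slope of potential yield vs. light interception (from the text)


def potential_yield_gain(delta_light_pct: float) -> float:
    """Change in potential yield (lbs/acre) for a change in light interception (percentage points)."""
    return LBS_PER_ACRE_PER_PERCENT * delta_light_pct


def yield_gap(potential_lbs: float, actual_lbs: float) -> float:
    """Departure of actual yield from potential yield (lbs/acre)."""
    return potential_lbs - actual_lbs


# Hypothetical orchard whose canopy grows from 60% to 70% light interception.
gain = potential_yield_gain(70.0 - 60.0)
print(f"potential yield gain: {gain:.0f} lbs/acre")

# Hypothetical potential and actual yields, used only to show the gap calculation.
print(f"yield gap: {yield_gap(3000.0, 2400.0):.0f} lbs/acre")
```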

In addition to their impact on plant growth, weather conditions also affect the timing and intensity of bloom and bee activity in February and March, and therefore nut production later in the season. The bloom information derived from high-resolution remote sensing observations (Chen et al., 2019b) can be integrated into the yield modeling. Yield is also largely impacted by growers' management practices, including irrigation, nutrient, and canopy management such as pruning, weed management, and pest and disease control. The development of a large, consistent database of location-specific historic yield, orchard characteristics including row orientation, and management history is critical for future studies.

Machine learning approaches are expected to enhance both explanatory power and predictive capability by bringing various big datasets together. A data-driven yield model based on advanced machine learning analytics will allow researchers to query the causes and effects of location and year on productivity and to test current theories of the determinants of yield, a critical step in the development of improved sustainability practices. The capability to predict the yield response to weather and climate, as shown by this study, is also expected to help growers adapt their management practices for plant protection under a changing climate.

### DATA AVAILABILITY STATEMENT

Publicly available datasets were generated in this study. This data can be found here: http://www.prism.oregonstate.edu/ and https://daymet.ornl.gov/.

### AUTHOR CONTRIBUTIONS

YJ and PB conceived the project idea. BL collected the field data. BC compiled the rest of the spatial data and built the models. BC and YJ performed the data analysis and wrote the manuscript.

All authors reviewed and edited the manuscript, and agreed with the submission.

### FUNDING

This work was supported by a project (SCB16036) funded by the USDA California Department of Food and Agriculture (CDFA) Specialty Crop Block Grant Program. Research was conducted under agricultural experimental station projects CA-D-PLS-2016-H to PB and CA-D-LAW-2296-H to YJ.

### ACKNOWLEDGMENTS

The authors would like to acknowledge BL's field crew for collecting the data and the participating growers for their support on the experimental plots.

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2020.00290/full#supplementary-material



**Conflict of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2020 Jin, Chen, Lampinen and Brown. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Emerging Technologies in Algal Biotechnology: Toward the Establishment of a Sustainable, Algae-Based Bioeconomy

Michele Fabris<sup>1,2</sup>\*, Raffaela M. Abbriano<sup>1</sup>, Mathieu Pernice<sup>1</sup>, Donna L. Sutherland<sup>1</sup>, Audrey S. Commault<sup>1</sup>, Christopher C. Hall<sup>1</sup>, Leen Labeeuw<sup>1</sup>, Janice I. McCauley<sup>1</sup>, Unnikrishnan Kuzhiuparambil<sup>1</sup>, Parijat Ray<sup>1</sup>, Tim Kahlke<sup>1</sup> and Peter J. Ralph<sup>1</sup>

<sup>1</sup> Climate Change Cluster (C3), University of Technology Sydney, Ultimo, NSW, Australia, <sup>2</sup> CSIRO Synthetic Biology Future Science Platform, Brisbane, QLD, Australia

### Edited by:

Edward Rybicki, University of Cape Town, South Africa

### Reviewed by:

Philip Thomas Pienkos, National Renewable Energy Laboratory (DOE), United States; Maria Stockenreiter, Ludwig Maximilian University of Munich, Germany

> \*Correspondence: Michele Fabris michele.fabris@uts.edu.au

### Specialty section:

This article was submitted to Plant Biotechnology, a section of the journal Frontiers in Plant Science

Received: 29 November 2019 Accepted: 24 February 2020 Published: 17 March 2020

### Citation:

Fabris M, Abbriano RM, Pernice M, Sutherland DL, Commault AS, Hall CC, Labeeuw L, McCauley JI, Kuzhiuparambil U, Ray P, Kahlke T and Ralph PJ (2020) Emerging Technologies in Algal Biotechnology: Toward the Establishment of a Sustainable, Algae-Based Bioeconomy. Front. Plant Sci. 11:279. doi: 10.3389/fpls.2020.00279

Mankind has recognized the value of land plants as renewable sources of food, medicine, and materials for millennia. Throughout human history, agricultural methods were continuously modified and improved to meet the changing needs of civilization. Today, our rapidly growing population requires further innovation to address the practical limitations and serious environmental concerns associated with current industrial and agricultural practices. Microalgae are a diverse group of unicellular photosynthetic organisms that are emerging as next-generation resources with the potential to address urgent industrial and agricultural demands. The extensive biological diversity of algae can be leveraged to produce a wealth of valuable bioproducts, either naturally or via genetic manipulation. Microalgae additionally possess a set of intrinsic advantages, such as low production costs, no requirement for arable land, and the capacity to grow rapidly in both large-scale outdoor systems and scalable, fully contained photobioreactors. Here, we review technical advancements, novel fields of application, and products in the field of algal biotechnology to illustrate how algae could present high-tech, low-cost, and environmentally friendly solutions to many current and future needs of our society. We discuss how emerging technologies such as synthetic biology, high-throughput phenomics, and the application of internet of things (IoT) automation to algal manufacturing technology can advance the understanding of algal biology and, ultimately, drive the establishment of an algal-based bioeconomy.

Keywords: microalgae, synthetic biology, phenomics, industry 4.0, bioproducts, food, bioremediation, feedstock

# INTRODUCTION

By 2050, it is estimated that the world population will exceed 10 billion people (United Nations, 2019). Agriculture is already nearly maximally exploited, most arable land is already in use, and issues such as climate change and urban expansion pose important challenges to the future of agriculture (Foley et al., 2011). Simply increasing the intensity of agriculture, farming, fishing, and fossil oil extraction will not be sufficient to meet future demands. Rising global temperatures, extreme weather, changing climatic patterns, and loss of cultivable land will require drastic changes in current agrotechnology (Wurtzel et al., 2019) to minimize environmental impact through sustainable sourcing of commodities such as food, bioproducts, and bulk chemicals. Implementation of high-tech engineering and molecular genetics approaches, in the forms of phenomics and genetic engineering, has effectively improved the productivity, cost-effectiveness, and environmental impact of agricultural crops such as soy, corn, wheat, and rice (Mir et al., 2019). At the same time, plant-derived alternatives for animal-based foods such as meat and dairy, and for commodities derived from petroleum such as plastics, are being developed (Zhu et al., 2016). Despite the clear advantages of these solutions, the use of food crops to replace less sustainable manufacturing practices will eventually contribute to increased agricultural demand and face the same challenges that have characterized the "fuel vs. food debate." Therefore, new solutions and additional resources are required to meet the increasing demands.

Photosynthetic microalgae are microbes that have colonized every habitat on Earth and exhibit extraordinary biological diversity, estimated at more than 200,000 species (Guiry, 2012), which reflects an enormous range of ecological adaptations. Unlike other microbes often exploited for biobased manufacturing, such as yeast and bacteria, phototrophic algae have the advantage of using sunlight to fix atmospheric carbon, reducing their reliance on sugars for fermentation. Naturally thriving in environments with intermittent and scarce nutrient availability, many species of microalgae have evolved efficient metabolic adaptations to grow rapidly under favorable conditions (Smetacek, 1999; Litchman, 2007). As a result, algae often have a higher photosynthetic efficiency than plants (Bhola et al., 2014), which translates into a higher capacity to generate biomass (Benedetti et al., 2018).

When grown at large scale – in either a pond or photobioreactor – microalgae are more water-efficient than crop plants (Demirbas, 2009) and can be cultivated on non-arable land with minimal use of freshwater (Demirbas, 2009), or even grow in seawater or wastewater. Thus, many geographical areas that are not suitable or sufficiently fertile for crop cultivation could be effectively used for large-scale algal cultivation. Many algal species are naturally efficient producers of carbohydrates, lipids, proteins, and pigments, as well as a range of commercial secondary metabolites that are currently sourced from conventional agriculture (Koyande et al., 2019). Microalgae are also emerging as next-generation, cell-sized biofactories for the sustainable manufacturing of a myriad of products (Rasala and Mayfield, 2015; Vavitsas et al., 2018), following the example of established microbial platforms such as yeasts and bacteria. In this respect, microalgal biofactories have the potential to be less expensive and more sustainable platforms that may be naturally predisposed to produce certain plant-derived products (Vavitsas et al., 2018).

Currently, algae are used for a relatively small number of industrial applications. Recent works have described in detail the transition of focus from algal-based bioenergy to high-value bioproducts, and the model of algae-based biorefineries (Laurens et al., 2017). In this review, we describe how recent landmark achievements have demonstrated the untapped commercial potential of algae-based applications. Specifically, we outline how cutting-edge technology developments such as automation, synthetic biology, and phenomics can leverage the already naturally promising capabilities of microalgae in the coming years. By highlighting recent key achievements and unsolved knowledge gaps in the field – both in terms of technology advancements and applications – we describe the future development of microalgae as a next-generation, low-cost, sustainable, scalable, and high-productivity crop system. We anticipate that this will help generate an algal-based bioeconomy, which will contribute to solutions to the imminent challenges caused by our growing society.

# TECHNOLOGY DEVELOPMENT

While crop plants have been bred and selected for millennia to isolate specific traits and to obtain highly productive strains, all present microalgal species are effectively environmental isolates. To maximize productivity and increase the industrial potential of microalgae, it is key to optimize both the organism and environment that supports its growth. In the following sections, we describe how this can be achieved through the latest technology developments in algal cultivation and harvesting, automation, phenotyping, and synthetic biology (**Figure 1**).

### Algal Cultivation

One of the most attractive intrinsic features of many algal species is that they are capable of rapidly and inexpensively generating large amounts of biomass compared to plants (Brennan and Owende, 2010). In nature, microalgae are capable of reaching high biomass concentrations under eutrophic conditions but, from a mass culture point of view, even these concentrations are not sufficient. In the past decade, there has been a large body of research focused on optimizing conditions that maximally promote algal growth rates, or elicit enhanced production of a specific product, under artificial growth conditions. However, one of the biggest limitations in algal mass cultivation is creating a cost-effective production system. In this regard, a diverse range of algal cultivation techniques can offer differing levels of control over the growth and product yield, with different associated capital and operating costs.

Several factors can limit microalgal growth in mass culture, including light availability, temperature, and pH, as well as both the concentration and ratio of the major nutrients carbon, nitrogen, and phosphorus (Sutherland et al., 2015). Some algae are capable of growing autotrophically as well as mixo- or heterotrophically, which allows them to avoid light limitation constraints in dense culture but requires the addition of organic carbon sources. As with any supplementation, adding organic carbon to the growth medium increases material input costs, but may achieve higher cell densities (Venkata Mohan et al., 2015). In principle, algae have the same basic requirements as plants, in that they need biologically available nitrogen and phosphorus, as well as trace nutrients (e.g. sulfur, calcium, iron, silicon), and management of the pH levels to maximize nutrient availability (White and Ryan, 2015). The water source can affect what nutrients need to be additionally supplied, while water availability and recovery are key in determining what algal species can be selected. Algae can grow on different water sources, such as marine, fresh, or waste water. Wastewater is naturally rich in nutrients, but has additional contaminants that could cause culture crashes. There is a large diversity of marine algae and ready availability of seawater. However, seawater requires the addition of fertilizers and, in open systems, is subject to evaporation, which alters the salinity and makes monitoring necessary. Fresh water may also require additional nutrients, and its use may increase the strain on water supplies in water-scarce regions. Recycling the water can help reduce these issues and improve economic viability. The overall water requirement is still lower than that of traditional plant-based crops (Rawat et al., 2013), making algae desirable alternatives for cultivation.

**Figure 1c** (caption fragment): … computer (PLC), receives information and logs its operation to a database computer. The database collects data from a network of plug-and-play sensors, which inform a digital twin simulation of the facility. The digital twin predicts the future demand and yield of the algae culture and updates the controller to optimize the process to match the predicted demand.

Traditionally, microalgae have been grown in simple open ponds (Becker, 1994), but research and technological advances over the past several decades have led to a diversity of high-productivity bioreactor designs. Large-scale autotrophic algal production designs accommodate suspended or attached growth in either open or closed systems, or a hybrid of these, reviewed extensively elsewhere (Ugwu et al., 2008; Brennan and Owende, 2010; Harun et al., 2010; Christenson and Sims, 2011; Olivieri et al., 2014).

Besides stagnant ponds, the cheapest option for large-scale microalgal production is a shallow open raceway pond design that includes basic mixing. Compared to other photobioreactor (PBR) designs, raceway ponds also have lower energy requirements and lower capital and operating costs, and can be built at a large scale (Brennan and Owende, 2010; Borowitzka and Vonshak, 2017). However, they generally have the lowest areal productivity (<10 g m<sup>−2</sup> d<sup>−1</sup> compared to >20 g m<sup>−2</sup> d<sup>−1</sup> in some PBRs) (De Vree et al., 2015). Some advances have been made in improving productivity by modifying the design of open systems, such as high rate algal ponds (HRAP), where additional baffles or more complex geometries improve the overall mixing pattern and ensure algae remain in the illuminated part of the water column (Christenson and Sims, 2011; Craggs et al., 2012). Companies worldwide are investing in this system; in Hawaii, for example, it has generated over US\$ 10 million in gross profits from biomass grown in open ponds (Maeda et al., 2018).

Another improvement in outdoor open cultivation is the use of bioreactors for attached growth, such as algae turf scrubbers (ATS) or motorized wheels with biofilm growth (Wang and Lan, 2018). Novel biofilm-based algal cultivation in particular has seen increased research in recent years, in part due to the higher harvested solid content (10–20% compared to <0.02% for suspended systems), which leads to lower harvesting costs. While biofilm growth is not suitable for all algal species, and can lead to complex mixed algae-bacteria communities (and is therefore less suited to high-value, single products), it has been investigated for wastewater remediation (Gross et al., 2015) (section "Algal Biodegradation of Emerging Contaminants"). Closed suspended growth PBRs, such as flat-plate, tubular, or bag reactors, offer increased operating control, better mixing, and less chance of contamination compared to open systems and are suitable for genetically modified organisms, but also have significantly higher capital and operational costs (Gupta et al., 2015). Closed systems can be operated using artificial light (at increased cost), which can be tailored to the specific algae to increase productivity (Schulze et al., 2014; Glemser et al., 2016). In addition, genetically modified organisms (GMOs) may be subject to regulatory limitations that prevent them from being grown in open systems, where they can be released into the wild. As such, closed systems may become increasingly common in the future. Despite these advantages, most current production is done in open pond systems (on the order of thousands of tonnes per annum) for products such as biofuels, animal feed, and nutraceuticals, while closed systems (hundreds of tonnes per annum) are used primarily for high-value products (Posten, 2009; Borowitzka and Vonshak, 2017).

Various techno-economic assessments have reviewed the feasibility of large-scale algal production (Laurens et al., 2017). The capital cost of an open pond can range from ∼US\$ 6/m<sup>2</sup> (Craggs et al., 2012) to US\$ 50/m<sup>2</sup> (Huntley et al., 2015), while enclosed PBR systems can cost 3 to 30 times more (Panis and Carreon, 2016). Operating costs, in turn, can vary from US\$ 0.8/kg dry weight (DW) up to US\$ 8/kg DW for various systems and applications. Biofuels in particular have received the most attention but are not yet price competitive, with production costs of ∼US\$ 3/L compared to <US\$ 1/L for fuel produced from fossil oil (Sun et al., 2011; Laurens et al., 2017; Roles et al., 2020). Areal or volumetric productivity is generally one of the largest uncertainties as well as one of the main drivers of success (Jonker and Faaij, 2013; Chauton et al., 2015; Panis and Carreon, 2016; Hoffman et al., 2017). Strain selection is key to optimizing productivity for the final product. This could require designing novel strains through genetic engineering or synthetic biology (section "Synthetic Biology") that have been thoroughly profiled and selected using a phenomics approach (section "Phenomics"). On the cost side of the equation, dewatering is currently one of the major expenses in algal processing, accounting for up to 20–30% of the final cost, as reviewed extensively by Christenson and Sims (2011), Milledge and Heaven (2013), Gerardo et al. (2015) and Fasaei et al. (2018).
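To give a feel for the magnitudes involved, the cost ranges quoted above can be combined in a short worked calculation for a hypothetical one-hectare facility. This is only an illustrative sketch: the facility size, the productivity of 10 g m<sup>−2</sup> d<sup>−1</sup> (the upper bound quoted for open ponds), year-round operation, and the choice to apply the 3× and 30× PBR multipliers to the low and high pond costs, respectively, are all assumptions made here rather than values taken from the cited assessments.

```python
HECTARE_M2 = 10_000  # hypothetical facility size


def cost_range(area_m2, low_per_m2, high_per_m2):
    """Return (low, high) capital cost in US$ for a given area."""
    return area_m2 * low_per_m2, area_m2 * high_per_m2


def annual_biomass_kg(area_m2, g_per_m2_day, operating_days=365):
    """Dry biomass per year in kg, assuming constant areal productivity."""
    return area_m2 * g_per_m2_day * operating_days / 1000.0


# Open pond capital cost: US$ 6 to 50 per m2 (range quoted in the text).
pond_capex = cost_range(HECTARE_M2, 6.0, 50.0)

# Enclosed PBR: 3 to 30 times more; widest interval under the assumption above.
pbr_capex = (3 * pond_capex[0], 30 * pond_capex[1])

# Operating cost: US$ 0.8 to 8 per kg DW, at an assumed 10 g m-2 d-1.
biomass_kg = annual_biomass_kg(HECTARE_M2, 10.0)
opex = (0.8 * biomass_kg, 8.0 * biomass_kg)

print(f"open pond capex: US$ {pond_capex[0]:,.0f} to {pond_capex[1]:,.0f}")
print(f"PBR capex:       US$ {pbr_capex[0]:,.0f} to {pbr_capex[1]:,.0f}")
print(f"biomass:         {biomass_kg:,.0f} kg DW per year")
print(f"annual opex:     US$ {opex[0]:,.0f} to {opex[1]:,.0f}")
```

The width of these intervals shows why areal productivity and system choice dominate the uncertainty in techno-economic estimates.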

Algal suspensions are generally very dilute; therefore increasing the biomass content in the cultivation stage can substantially reduce costs, which is a major advantage for closed and attached growth PBRs compared to open ponds (Fasaei et al., 2018). Advances have been made in harvesting technology, by employing for example cross-flow filtration (Gerardo et al., 2014), cheaper flocculants ('T Lam et al., 2018; Nguyen et al., 2019), bio-flocculants (Ummalyma et al., 2017), microfluidics at lab scale (Kim et al., 2018), and novel techniques such as pulsed electric field, ultrasound, and electroflocculation, that have yet to be demonstrated at industrial scale (Milledge and Heaven, 2013; 'T Lam et al., 2018). However, while some harvesting processes can reduce the energy costs – for example filtration has a lower energy requirement compared to centrifugation – they can lead to a higher operating cost (e.g. filtration is subject to membrane fouling) (Bilad et al., 2014). Flocculants, chemicals added to cause algal cells to aggregate, can be inexpensive and have a long history in wastewater treatment, but can be hard to recover and affect downstream processing and media recycling (Milledge and Heaven, 2013). To address these short-falls, some systems combine multiple harvesting steps (e.g. flocculation combined with dissolved air flotation to remove the aggregates) (Pragya et al., 2013), while others are looking to bypass the harvesting step completely by modifying the algae to secrete the compound of interest (Christenson and Sims, 2011). The final harvesting step depends on the required product, the type of algae, and the specific cultivation strategies. As such, there is no one-size-fits-all solution to harvesting of the algal cultures.

In many cases, the final products can fit into existing industrial processes, for example transesterification for biofuels and extraction of high-value products (Greenwell et al., 2010; Khanra et al., 2018). There is also ongoing research for improvements in extraction of the algal products, such as by using supercritical extraction, pressure or microwave assisted extraction, ionic liquids, novel (less toxic) solvents, enzyme assisted extraction, or aqueous biphasic systems (Kadam et al., 2013; Kumar et al., 2015; Chew et al., 2017; 'T Lam et al., 2018; Khanra et al., 2018). Many of these novel and "green" extraction processes are dependent on the desired product, and are still only in use at the lab or pilot phase; getting them to an industrial scale would require significant investment and further research (Michalak and Chojnacka, 2015).

There is currently no single "best practice" method to cultivate algae, especially at scale. Final design of the system is dependent on the final product, the geographical location, as well as local resources available (e.g. accessibility to water, to CO2, and to waste streams). Modeling and lab scale experiments have suggested novel innovations in designing and operating the process, but the final consideration is the cost: some processes may have a larger up-front capital cost, but reduce the overall operating cost (e.g. HRAP ponds), while others may have very low capital costs but affect further downstream processing (e.g. chemical flocculation). As such, due consideration for the overall cost will guide the final design and operation of the system.

### Industry 4.0 Approach to Algal Biorefineries

Regardless of how the biomass is produced, performing the downstream processing in an integrated biorefinery that allows the greatest number of products and co-products to be extracted (**Figure 2**), with the least amount of residual waste, will ensure the maximum return on investment. Industry 4.0 is an advanced manufacturing approach based on machine-to-machine communication technologies, also known as "the Internet of Things," or IoT (Atzori et al., 2010), whereby automation, sensors, and machine learning create self-adapting manufacturing processes able to adjust in real time to changes in the process itself (Kagermann et al., 2011; Kagermann et al., 2013). In a microalgal biorefinery, this means that not only can the algal cultivation and harvesting system be automated to reduce operating costs, but a network of plug-and-play IoT sensors could allow the operators to monitor algal growth and productivity in real time (**Figure 1c**; Whitmore et al., 2015). The concept of Industry 4.0 goes a step further by building a simulation, or digital twin, of the facility and the algal culture from the sensor data. The simulation can make real-time predictions of future cellular yield and adjust operations to meet expected product demand and to reduce waste (**Figure 1c**; Tuegel et al., 2011; Uhlemann et al., 2017; Tao et al., 2018). For example, a fully realized Industry 4.0 microalgal biorefinery (**Figures 1c**, **2**) would link the controlled cellular yield of specific components with automated serial downstream extraction of several co-products, driven by current demand rather than by traditional linear production and stockpiling that awaits demand (Kagermann et al., 2013). Biorefineries can be located at regional hubs to service surrounding producers.
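The sense-predict-adjust loop described above can be sketched in a few lines. The example below is a toy illustration under strong assumptions, not a description of any existing facility: the digital twin is reduced to a discrete logistic growth model, the controller only chooses a harvest fraction, and every name and parameter value (`DigitalTwin`, `plan_harvest`, growth rate, capacity, sensor readings, demand) is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DigitalTwin:
    """Toy logistic-growth model of the culture; all parameters are hypothetical."""
    growth_rate_per_day: float = 0.5  # assumed specific growth rate
    capacity_g_per_l: float = 2.0     # assumed maximum biomass density
    biomass_g_per_l: float = 0.2      # current state estimate

    def assimilate(self, sensor_biomass_g_per_l: float) -> None:
        """Overwrite the state estimate with the latest sensor reading."""
        self.biomass_g_per_l = sensor_biomass_g_per_l

    def predict(self, days: int) -> float:
        """Project biomass density forward using daily logistic growth steps."""
        x = self.biomass_g_per_l
        for _ in range(days):
            x += self.growth_rate_per_day * x * (1.0 - x / self.capacity_g_per_l)
        return x


def plan_harvest(twin: DigitalTwin, demand_g_per_l: float, horizon_days: int = 3) -> float:
    """Fraction of the culture to harvest at the horizon so that supply tracks demand."""
    predicted = twin.predict(horizon_days)
    if predicted <= 0.0:
        return 0.0
    return min(1.0, demand_g_per_l / predicted)


twin = DigitalTwin()
for reading in [0.25, 0.40, 0.65]:  # hypothetical daily sensor readings (g/L)
    twin.assimilate(reading)
    fraction = plan_harvest(twin, demand_g_per_l=0.5)
    print(f"biomass={reading:.2f} g/L -> plan to harvest {fraction:.0%} in 3 days")
```

A real implementation would replace the logistic model with a calibrated process simulation and feed the plan back to the programmable controller, but the structure of the loop would be similar.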

### Phenomics

Phenomics is defined as "the acquisition of high-dimensional phenotypic data in an organism-wide scale" (Houle et al., 2010). Algal phenomics is currently at a very early stage of development; however, it holds great potential in microalgal agriculture for food security (section "Food and Nutraceuticals"), bioproducts sourcing (sections "Food and Nutraceuticals," "Feedstocks," "High-Value Products," and "Biopolymers, Bioplastics, and Bulk Chemicals"), bioremediation (section "Algal Biodegradation of Emerging Contaminants"), and carbon sequestration. By creating a database of G × E = P interactions [where G is the genome, E the environment(s), and P the phenotype(s)] for a given algal species, researchers can screen natural and artificial diversity for the combination of gene alleles that will combine essential phenotypes (Furbank and Tester, 2011), such as fast growth and high product yield.
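As a minimal sketch of what querying such a G × E = P database could look like, the example below stores phenotype measurements keyed by (genotype, environment) and screens for genotypes that combine fast growth with high product yield. The strain names, environments, phenotype values, thresholds, and function names are all invented for illustration; a real phenomics database would hold far more dimensions and use established ontologies.

```python
from typing import Dict, List, Tuple

Phenotypes = Dict[str, float]
PhenotypeDB = Dict[Tuple[str, str], Phenotypes]  # (genotype, environment) -> phenotypes

# Made-up measurements: growth rate (1/day) and product yield (g per g biomass).
db: PhenotypeDB = {
    ("strain_A", "low_CO2"):  {"growth_rate": 0.9, "product_yield": 0.08},
    ("strain_A", "high_CO2"): {"growth_rate": 1.4, "product_yield": 0.07},
    ("strain_B", "low_CO2"):  {"growth_rate": 0.4, "product_yield": 0.12},
    ("strain_B", "high_CO2"): {"growth_rate": 1.5, "product_yield": 0.11},
    ("strain_C", "low_CO2"):  {"growth_rate": 1.0, "product_yield": 0.02},
    ("strain_C", "high_CO2"): {"growth_rate": 1.6, "product_yield": 0.03},
}


def screen(db: PhenotypeDB, environment: str, min_growth: float, min_yield: float) -> List[str]:
    """Genotypes meeting both phenotype thresholds in one environment."""
    return sorted(
        genotype
        for (genotype, env), p in db.items()
        if env == environment
        and p["growth_rate"] >= min_growth
        and p["product_yield"] >= min_yield
    )


def robust(db: PhenotypeDB, min_growth: float, min_yield: float) -> List[str]:
    """Genotypes meeting both thresholds in every environment in the database."""
    environments = sorted({env for _, env in db})
    hits = [set(screen(db, env, min_growth, min_yield)) for env in environments]
    return sorted(set.intersection(*hits)) if hits else []


print(screen(db, "high_CO2", min_growth=1.2, min_yield=0.05))
print(robust(db, min_growth=0.8, min_yield=0.05))
```

Screening across all environments, as in `robust`, reflects the point that a useful production strain must express the desired phenotypes under the conditions it will actually encounter.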

Recent advancements from the field of plant phenomics highlight the potential impact of phenomics techniques and technologies in microalgae. For example, a recent phenomics-based study on Arabidopsis thaliana yielded a mutant exhibiting both increased pathogen defense and photosynthetic growth, breaking the supposed trade-off between growth and defense that was a near-dogma among plant researchers (Campos et al., 2016; Cruz et al., 2016). In microalgae, such a phenomics approach could create equally dramatic combinations of useful phenotypes, such as strains that use quorum sensing (Das et al., 2019) to trigger autoflocculation (González-Fernández and Ballesteros, 2013) and induction of product synthesis (e.g. carotenoids) (Gong and Bassi, 2016) only when the culture reaches harvest density.

A major limitation to present-day microalgal phenomics is the lack of searchable phenomics databases. Plant and yeast researchers, for example, can design or even perform in silico experiments using The Arabidopsis Information Resource (TAIR) (Lamesch et al., 2011) and PROPHECY (Fernandez-Ricaud et al., 2005; Fernandez-Ricaud et al., 2016) databases, respectively. Such tools accelerate research by showing how different genes can be related by a shared phenotype (Ohyama et al., 2008), or even differentiate the functions of seemingly redundant gene copies (Yadav et al., 2007). To bring this power to the field of algal research, there will need to be investments in building a comprehensive phenotypic database. The field of plant phenomics has already created the standards for data sharing, knowledge retrieval, and ontology annotation (Oellrich et al., 2015; Munir and Sheraz Anjum, 2018; Neveu et al., 2019), which can be adapted to algae. The data analysis tools presently used for model microbes (e.g. yeast) can also be applied to microalgae to measure phenotype data from high-throughput algae culture formats (agar plates, microplates, etc.) using standard microbiology sensors such as fluorescence and absorption spectrophotometers (Fernandez-Ricaud et al., 2005; Fernandez-Ricaud et al., 2016), hyperspectral cameras (Roitsch et al., 2019), and flow cytometers (Cagnon et al., 2013). Even morphological phenotypes can be automatically digitized via machine learning (ML) approaches such as image processing with Support Vector Machines (SVMs) and Convolutional Neural Networks (CNNs), as demonstrated in Mohanty et al. (2016) and Sladojevic et al. (2016).
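As one concrete example of turning raw high-throughput readings into a phenotype value, the sketch below estimates a specific growth rate from a time series of optical density (absorbance) readings, as might be collected from a single microplate well. It assumes the culture is in exponential phase over the whole series, so that ln(OD) is linear in time; the function name, the readings, and the growth rate used to generate them are hypothetical, and this is not the algorithm of any particular database or tool cited above.

```python
import math


def specific_growth_rate(times_h, optical_density):
    """Least-squares slope of ln(OD) versus time, in 1/h (assumes exponential growth)."""
    if len(times_h) != len(optical_density) or len(times_h) < 2:
        raise ValueError("need at least two paired time/OD readings")
    log_od = [math.log(od) for od in optical_density]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_l = sum(log_od) / n
    sxy = sum((t - mean_t) * (l - mean_l) for t, l in zip(times_h, log_od))
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    return sxy / sxx


# Synthetic readings generated with a known growth rate of 0.1 per hour.
times = [0, 6, 12, 18, 24]
od = [0.05 * math.exp(0.1 * t) for t in times]

mu = specific_growth_rate(times, od)
doubling_time_h = math.log(2) / mu
print(f"growth rate = {mu:.3f} 1/h, doubling time = {doubling_time_h:.1f} h")
```

Applied to every well of every plate, a calculation of this kind yields one entry of the phenotype vector for each genotype and environment.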

One challenge for developing microalgal phenomics databases will be choosing the growth environments that cause the microalgae to display a range of phenotypes based on their genetic predispositions. For example, much research has been done using high and low concentrations of carbon dioxide to learn about the roles of genes in photosynthesis (Suzuki et al., 1999; Vance and Spalding, 2005; Duanmu et al., 2009), but elucidating phenotypes related to stress responses and repair cycles can be more challenging (Cruz et al., 2016; Tietz et al., 2017). Poorly chosen algal phenomics reference environments could result in insufficient segregation of phenotypes (Thomas, 1993), or worse, the environments might be so different from large-scale cultivation as to render the measured phenotypes misleading and irrelevant to large-scale enterprises (Rawat et al., 2013).

Another obstacle to algal phenomics is the limited capacity to manipulate the genetics of many non-model microalgal species. Many microalgae have complex life cycles (Graham et al., 2009) and their genomes are often large and highly repetitive, defying typical shotgun sequencing techniques for genome sequencing and assembly (Paajanen et al., 2017). However, third-generation methods that allow long-read sequencing – such as Nanopore and PacBio sequencing – are breaking the log-jam. For example, PacBio sequencing has proven capable of mapping transgene integration sites in plants (Liu et al., 2019). These tools will be key in mapping genotypes of mutant libraries. Novel gene editing technologies such as CRISPR-Cas9 are increasingly used to create large, genome-wide knock-out libraries in important crops such as tomatoes and rice (Jacobs et al., 2017; Meng et al., 2017). In the future, a combination of these gene-editing methods with third-generation sequencing technologies will enable cost- and time-effective creation and mapping of knock-out libraries of important microalgal species. With a reference genome in hand, linking a phenotype to the relevant gene alleles traditionally relies on statistical analysis of the progeny after cross-breeding two individuals or populations with different phenotypes, for example through Quantitative Trait Loci (QTL) mapping. With each successive generation, the genomic regions responsible for the phenotypes can be narrowed down until one has a testable list of candidate gene loci (van Bezouw et al., 2019). Given that many microalgae either do not breed at all or do so only under often unknown environmental conditions, novel approaches must be developed to fully utilize the power of phenomic mutant screens.

Once these challenges have been solved, algal phenomics will have a major impact on algal biotechnology by enabling the development of microalgae as new bioproducts and pharmaceutical workhorses (section "High-Value Products"). In this case, the yield of the desired product (or a suitable proxy) is treated as one phenotype in a phenomics search for both productivity and reliability among engineered strains. In addition, the maturation of algal phenomics tools will likely change the very mindset of algal biotechnology researchers. Presently, algal biotech researchers pick a single strain that synthesizes the product of interest, and then try to optimize the culture environment to improve productivity, often resulting in expensive and complicated PBR designs (Vasumathi et al., 2012; Wang et al., 2012; Melnicki et al., 2013; Lucker et al., 2014). In the era of algal phenomics, researchers will instead define their production culture environment and optimize the algae to that environment, much as a plant breeder would do for field crops (Donald, 1968; Jordan et al., 2011).

### Synthetic Biology

Synthetic biology applies engineering principles to the rational design of living organisms. Within this discipline, a biological system is viewed as a collection of characterized genetic parts that can be modified and reassembled to alter existing functions or to build them de novo in alternative host organisms. Genetic designs are revised through iterations of a design-build-test-learn cycle to achieve optimized metabolic configurations for biotechnological applications (Khalil and Collins, 2010; Nielsen and Keasling, 2016). Synthetic biology applied to microalgae will combine this powerful new approach with the benefits of a photosynthetic microbial host to generate novel production strains tailored to suit future environmental challenges (**Figure 1a**).

Tools for the genetic engineering of microalgae are evolving rapidly, enabled by the increased availability of sequenced genomes across multiple algal lineages. The sequencing of microalgal genomes has facilitated genetic tool development in the green alga Chlamydomonas reinhardtii (Mussgnug, 2015), the stramenopiles Phaeodactylum tricornutum (Huang and Daboussi, 2017) and Nannochloropsis gaditana (Poliner et al., 2018), and the cyanobacterium Synechocystis sp. PCC 6803 (Hagemann and Hess, 2018), among others. In addition, methods for genetic transformation of microalgae have been optimized for many species and include natural transformation, electroporation, bead beating, biolistic transformation, and conjugative plasmid transfer (Qin et al., 2012).

Genomic data from these species have facilitated the identification of native genetic elements necessary for genetic engineering and successful transformation. Several constitutive and inducible endogenous promoter/terminator pairs have been demonstrated to effectively express transgenes in model species (Wang et al., 2012; Ramarajan et al., 2019), including bidirectional promoters for gene stacking or co-expression with a selectable marker (Poliner et al., 2018). In addition, heterologous or synthetic promoter sequences have been characterized (Berla et al., 2013; Zhou et al., 2014; Scranton et al., 2016). Other regulatory effectors, such as ligand-binding riboswitches, have been identified and developed as tools to regulate gene expression in cyanobacteria and C. reinhardtii (Moulin et al., 2013; Nakahira et al., 2013). Additional characterization of sequences that regulate transcription will be crucial to transition from stepwise genetic engineering to targeting multiple sites, introducing multigene pathways, or building independent synthetic circuits. In addition to sequences that modulate transcription, the molecular toolkit in microalgae also includes a useful suite of selectable markers, reporter genes, protein tags, and peptide sequences for ribosomal skipping or protein localization (Vavitsas et al., 2019). To standardize these commonly used genetic parts and to facilitate collaboration, the scientific community has adopted Type IIS restriction endonuclease cloning systems. This approach allows for efficient modular assembly of complex plasmids from a library of domesticated parts, and is being widely implemented in several models, including in plants (Patron et al., 2015). Suites of parts specific to microalgae have been developed to be compatible with a common syntax to benefit from existing part registries. Type IIS cloning systems specific to microalgae include the MoClo toolkit for C. reinhardtii (Crozet et al., 2018), CyanoGate for cyanobacteria (Vasudevan et al., 2019), and uLoop for diatoms (Pollak et al., 2019).
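The modular logic of Type IIS assembly can be illustrated with a minimal Python sketch. It assumes that each domesticated part is released with a defined 4-nt overhang at either end and that parts ligate only where overhangs match; the part names, sequences, and overhangs below are hypothetical placeholders chosen for illustration, not entries from any of the toolkits cited above.

```python
from typing import NamedTuple


class Part(NamedTuple):
    name: str
    left: str   # 4-nt overhang exposed upstream after Type IIS digestion
    right: str  # 4-nt overhang exposed downstream
    seq: str    # part sequence between the two overhangs


def assemble(parts, start, end):
    """Chain parts by matching overhangs, from `start` to `end`.

    Returns the part order and the assembled sequence. Raises ValueError
    if the chain is broken, ambiguous, or leaves parts unused.
    """
    by_left = {}
    for part in parts:
        if part.left in by_left:
            raise ValueError(f"two parts share the left overhang {part.left}")
        by_left[part.left] = part

    order, construct, site = [], "", start
    while site != end:
        if site not in by_left:
            raise ValueError(f"no unused part begins with overhang {site}")
        part = by_left.pop(site)  # each part is used exactly once
        order.append(part.name)
        construct += part.left + part.seq
        site = part.right
    if by_left:
        raise ValueError(f"unused parts: {sorted(p.name for p in by_left.values())}")
    return order, construct + end


# Hypothetical parts, deliberately listed out of order.
example_parts = [
    Part("cds", "AATG", "GCTT", "GCCAAA"),
    Part("terminator", "GCTT", "CGCT", "TTTTTT"),
    Part("promoter", "GGAG", "AATG", "TTGACA"),
]
order, construct = assemble(example_parts, start="GGAG", end="CGCT")
print(order)
print(construct)
```

Because the order is dictated entirely by the overhangs, the same function assembles the parts correctly regardless of the order in which they are supplied, which is the property that makes a shared syntax useful across part registries.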

Multiple molecular techniques are available to modify native gene expression or target specific areas of the genome in microalgae. Gene knockdown by introduction of antisense RNA, artificial small RNAs, and CRISPRi has been implemented in multiple systems (De Riso et al., 2009; Zhao et al., 2009; Yao et al., 2016; Wei et al., 2017; Sun et al., 2018). Site-specific genetic manipulation by homologous recombination (HR) is routine in cyanobacteria (Zang et al., 2007), the chloroplast genome of C. reinhardtii (Esland et al., 2018), and the nuclear genome of Nannochloropsis (Kilian et al., 2011). In contrast, HR occurs at a low frequency in the nuclear genomes of C. reinhardtii and P. tricornutum, but can be induced in the presence of double-strand DNA breaks made by targeted endonucleases, enabling targeted gene knock-out and/or knock-in (Shin et al., 2016; Greiner et al., 2017; Kroth et al., 2018). Although technology for precision genome editing, including zinc-finger nucleases, transcription activator-like effector nucleases (TALENs), and CRISPR/Cas9, has been reported in many microalgae (Sizova et al., 2013; Weyman et al., 2015; Li et al., 2016; Nymark et al., 2016; Ajjawi et al., 2017), several challenges related to targeting, efficiency, and toxicity remain to be fully overcome. Strategies to circumvent these issues include transient Cas9 expression (Guzmán-Zapata et al., 2019), direct ribonucleoprotein (RNP) delivery (Baek et al., 2016; Shin et al., 2016) and use of Cas variants (Ungerer and Pakrasi, 2016). Marker-free and multiplex gene knock-out remains a challenge in some microalgae, although the use of multiple sgRNAs to multiplex genome editing targets has been shown to be feasible in diatoms (Serif et al., 2018).
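As a concrete illustration of the targeting step in CRISPR/Cas9 editing, the sketch below scans a DNA sequence for candidate target sites. It assumes the commonly used Streptococcus pyogenes Cas9, which requires an NGG protospacer-adjacent motif (PAM) immediately downstream of a roughly 20-nt protospacer. The function names are hypothetical, and the sketch ignores everything a real design tool must consider (off-target matches, GC content, chromatin context, and species-specific efficiency).

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_targets(seq, protospacer_len=20):
    """List candidate SpCas9 sites as (strand, start, protospacer, PAM).

    `start` is the 0-based offset of the protospacer on the scanned strand;
    for the '-' strand this is an offset into the reverse complement.
    """
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", revcomp(seq))):
        # The last valid start leaves room for the protospacer plus a 3-nt PAM.
        for i in range(len(s) - protospacer_len - 2):
            pam = s[i + protospacer_len : i + protospacer_len + 3]
            if pam[1:] == "GG":  # NGG: any base followed by two guanines
                hits.append((strand, i, s[i : i + protospacer_len], pam))
    return hits


# Toy input: a 20-nt stretch followed by a TGG PAM on the forward strand.
for hit in find_targets("A" * 20 + "TGG"):
    print(hit)
```

Multiplexing, as described above for diatoms, would correspond to selecting several such sites (one per target gene) and expressing the matching sgRNAs together.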

Despite the rapid advances in the genetic tools available in microalgae, the field trails behind other established chassis microorganisms such as Escherichia coli and Saccharomyces cerevisiae. These model systems benefit from decades of intense study, resulting in diverse suites of characterized genetic parts and tools, known metabolic features, and well-annotated genomes. Approaches such as protein engineering and directed evolution that have been effectively implemented in these traditional hosts (Abatemarco et al., 2013) could also be applied in microalgae to hasten their development as chassis organisms.

Advances in synthetic biology are also enabling the design of entire microbial genomes (Hutchison et al., 2016; Richardson et al., 2017). While still on the horizon for eukaryotic algae, the development of self-replicating episomes in diatoms (Karas et al., 2015) has demonstrated that a synthetic sequence can be faithfully maintained in the diatom nucleus without integration into the native genome. This innovation is a step toward the design and assembly of independent, artificial chromosomes in microalgae. Reconstruction of a native P. tricornutum chromosome has already been demonstrated in yeast (Karas et al., 2013), and it is possible that a similar approach could be used to construct completely refactored chromosomal sequences.

Currently, genetic engineers are limited by the number of designs that they can feasibly assemble and test. However, it is anticipated that increased integration of computational design and automation with biology will rapidly shift this paradigm. Computational modeling can be used to predict non-intuitive approaches to optimize metabolic flux through heterologous pathways, as was demonstrated by the optimization of terpenoid production in cyanobacteria (Lin et al., 2017). Novel biological designs or complex combinatorial libraries can be rapidly assembled and evaluated in automated, high-throughput biofoundries, which are attracting investment from research institutions across the globe (Hillson et al., 2019). To evaluate clone libraries at scale, strain development must also be accompanied by improved technology for small molecule detection, such as the development of novel biosensors, as well as advancements in multi-dimensional phenotyping (section "Phenomics").
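The scale of the problem can be made concrete with a short sketch: the number of possible constructs grows multiplicatively with the number of interchangeable parts per position. The part names and library sizes below are invented for illustration and do not describe any published library.

```python
from itertools import product
from math import prod

# Hypothetical part library: each position lists interchangeable parts.
library = {
    "promoter": ["P1", "P2", "P3"],
    "utr": ["U1", "U2"],
    "cds_variant": ["C1", "C2", "C3", "C4"],
    "terminator": ["T1", "T2"],
}


def design_space_size(lib):
    """Number of distinct constructs: the product of the options per position."""
    return prod(len(options) for options in lib.values())


def enumerate_designs(lib):
    """Yield every combination as a {position: part} dictionary."""
    positions = list(lib)
    for combo in product(*(lib[p] for p in positions)):
        yield dict(zip(positions, combo))


print(design_space_size(library))
print(next(enumerate_designs(library)))
```

Even this small single-gene library yields dozens of constructs, and a multigene pathway multiplies the per-gene counts together, which is why automated assembly and high-throughput screening become necessary.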

Synthetic biology is not limited to the production of existing natural compounds, since the deconstruction of biology into its basic genetic components permits systems to be redesigned free from pre-existing constraints. An exciting avenue of synthetic biology will be the creation of novel, new-to-nature compounds with potential new functions and applications (Moses et al., 2014; Arendt et al., 2016; Luo et al., 2019). Synthetic biology can also be leveraged to improve agricultural outcomes for the cultivation of microalgae, including optimization of photosynthetic efficiency and improving carbon utilization (Gimpel et al., 2013; Erb and Zarzycki, 2016). For example, scenarios for the synthetic redesign of more efficient photosynthetic carbon fixation have been computationally predicted (Bar-Even et al., 2010). Given these advancements, the application of synthetic biology to microalgae has enormous potential to reinvent conventional animal and plant-based industries (e.g. food, high-value products, and chemicals) through innovations to minimize cost and environmental impact.

# APPLICATIONS

### Food and Nutraceuticals

Increasing the current capacity of microalgae to provide a source of nutrients, minerals, trace elements and other bioactive compounds is an active area of research that establishes a precedent for the development of new health products (Plaza et al., 2008; Lordan et al., 2011; Wells et al., 2017; Barkia et al., 2019). The microalgal industry has yet to reach its full potential, with an estimated global net worth of \$US1-1.5 billion (Pulz and Gross, 2004). Due to a history of safe production and consumption, the cyanobacteria Spirulina sp., along with the green algae Chlorella sp. and C. reinhardtii, are internationally regarded as "generally recognized as safe" (GRAS), a certification legislated under the United States Food and Drug Administration (FDA, 2019). Other certified GRAS species include the green algae Haematococcus sp. and Dunaliella sp. (FDA, 2019).

There are many commercial food markets that can be occupied by ingredients and products derived from microalgal biomass. For example, microalgal biomass can be a source of bulk protein, carbohydrates, and lipids (Koyande et al., 2019). Microalgal protein is a particularly promising avenue to contribute to the future of sustainable agriculture. Currently, the majority of global protein intake is attributed to higher plants (Billen et al., 2014; Henchion et al., 2017; Caporgno and Mathys, 2018), but plants require large amounts of arable land and water, as well as the use of herbicides and fungicides (Dahman et al., 2019). Algal-sourced protein can be a sustainable alternative to soy-based protein, due to its higher protein content and favorable amino acid profile, making it a high-quality protein for human nutrition (Spolaore et al., 2006; Kent et al., 2015). Recent studies show promising results with regard to improved physico-chemical and nutritional properties of Spirulina protein blends (Grahl et al., 2018; Palanisamy et al., 2019).

Some microalgae species are also a source of bioactive secondary metabolites that may ameliorate disease symptoms or causes, such as inflammation (Montero-Lobato et al., 2018), or provide protection against neurodegenerative diseases (Olasehinde et al., 2017). Dried algal biomass from GRAS-certified species is most commonly consumed as a powder. It is already marketed as a dietary supplement to improve health and is often added to other foods, such as blended beverages (Vigani et al., 2015). Powdered dietary supplements have been assessed in a number of clinical trials with some promising outcomes (Merchant, 2001; Karkos et al., 2011). For example, both Spirulina sp. and Chlorella sp. have been shown clinically to positively affect lipid profiles and various immune variables, and to have antioxidant capacities (Mao et al., 2005; Park et al., 2008; Ryu et al., 2014; Kim et al., 2016; Garcia et al., 2017).

With an increased public preference for naturally sourced food additives, microalgae pigments offer an appealing alternative to synthetic pigments. Naturally derived pigments are a group of compounds that are inherently bioactive. They act as radical scavengers and can reduce oxidative damage (Singh et al., 2005), and therefore have appeal as dietary supplements or fortifying ingredients to promote human health (Stahl and Sies, 2005). This is in contrast to synthetic pigments, some of which are raising increasing concerns regarding their toxicities and subsequent adverse health effects, whilst also not providing any nutritional value (Oplatowska-Stachowiak and Elliott, 2017).

Microalgae are major producers of pigments such as fat-soluble chlorophylls, carotenoids (carotenes and xanthophylls) and water-soluble phycobilins, e.g. phycocyanin (Begum et al., 2016). Haematococcus sp. and Dunaliella sp. are two species that can accumulate significant levels of bioactive pigment molecules such as astaxanthin (Guerin et al., 2003) and β-carotene (Tafreshi and Shariati, 2009), respectively. Astaxanthin is notable for its brilliant red color that brightens the flesh of seafood (Kidd, 2011). Humans do not synthesize astaxanthin, and dietary intake is almost exclusively via seafood (Kidd, 2011). Astaxanthin is presently mostly produced synthetically. Current production costs of microalgal-derived astaxanthin are still higher than those of synthetic astaxanthin (EUR 1540/kg and EUR 880/kg, respectively) (Panis and Carreon, 2016), although studies have estimated that these costs could theoretically be reduced to US\$ 500 – US\$ 800/kg (Li et al., 2011). Another common commercial pigment from microalgae is phycocyanin, derived from Spirulina sp. (Vigani et al., 2015). This deep blue pigment is utilized as a natural food colorant for food items such as chewing gum, ice sherbets, popsicles, candies, soft drinks, dairy products, and jellies (Begum et al., 2016).

Phytosterols are compounds often used as food supplements and cholesterol-lowering agents and are currently extracted from plants with suboptimal yields due to a complex extraction process (Ras et al., 2014). The sterol composition of algae is extremely diverse and comprises molecules typically synthesized by plants (e.g. brassicasterol and stigmasterol), animals (e.g. cholesterol) and fungi (e.g. ergosterol) (Rampen et al., 2010; Miller et al., 2012; Fabris et al., 2014; Lu et al., 2014), as well as novel and uncharacterized triterpenoids (Commault et al., 2019) and sterols, such as gymnodinosterol and brevesterol (Giner et al., 2003). Stigmasterol in particular is commonly used as a cholesterol-lowering agent in food supplements (Batta et al., 2006). Several algal species naturally produce equal or greater amounts of phytosterols than plants, which usually contain in the range of 0.025–0.4% of plant dry biomass (Piironen et al., 2003). Therefore diatoms and other algal groups have the potential to be an alternative, low-cost, and more sustainable source of phytosterols (Ahmed et al., 2015; Jaramillo-Madrid et al., 2019). For example, the model diatom P. tricornutum produces up to 0.32% d.w. of phytosterols and the haptophyte Pavlova lutheri can accumulate phytosterols up to 5.1% d.w. (Ahmed et al., 2015).

Essential polyunsaturated fatty acids (PUFAs), such as eicosapentaenoic acid (EPA) and docosahexaenoic acid (DHA), play crucial roles in human health. DHA is necessary for neural development (Swanson et al., 2012) and is routinely utilized in infant formulas, fortified food and beverages and dietary supplements (Ratledge, 2004). Presently, most DHA and EPA are supplied by wild-catch and captive-based fisheries (Lenihan-Geels et al., 2013). As primary producers of essential PUFAs in nature, with DHA concentrations up to 50% of total biomass (Ratledge, 2004), microalgae represent a promising and more sustainable alternative (Ryckebosch et al., 2014). Presently, the production costs associated with microalgal-derived EPA/DHA reach US\$ 40/kg EPA + DHA, but technological advancements could possibly lower this to ∼US\$ 10/kg EPA + DHA, which is competitive when compared to fish oil (∼US\$ 8/kg EPA + DHA) (Chauton et al., 2015).
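The cost figures quoted above from Chauton et al. (2015) can be put side by side with a few lines of arithmetic. The sketch below simply restates those three numbers; the helper names are hypothetical, and the calculation makes no allowance for differences in product quality, scale, or year of estimate.

```python
# Costs in US$ per kg EPA + DHA, as quoted in the text (Chauton et al., 2015).
MICROALGAL_CURRENT = 40.0
MICROALGAL_PROJECTED = 10.0
FISH_OIL = 8.0


def cost_ratio(cost, benchmark):
    """How many times more expensive `cost` is than `benchmark`."""
    return cost / benchmark


def required_reduction(current, target):
    """Fractional cost reduction needed to move from `current` to `target`."""
    return 1.0 - target / current


print(cost_ratio(MICROALGAL_CURRENT, FISH_OIL))
print(cost_ratio(MICROALGAL_PROJECTED, FISH_OIL))
print(required_reduction(MICROALGAL_CURRENT, MICROALGAL_PROJECTED))
print(required_reduction(MICROALGAL_CURRENT, FISH_OIL))
```

On these figures, microalgal EPA/DHA currently costs five times as much as fish oil; reaching the projected cost requires a 75% reduction, which would leave it 25% above fish oil, while full parity would require a reduction of about 80%.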

Currently, three different commercial fermentation processes are used to produce DHA, each utilizing different microorganisms (Ratledge, 2004). Martek Biosciences Corp (United States) led the infant formula DHA market until 2011, when it was acquired by Dutch State Mines (DSM). DSM utilizes the dinoflagellate microalga Crypthecodinium cohnii, which accumulates DHA up to 60% of the total fatty acids fraction (Jacobsen et al., 2013), for use in infant formula, and Schizochytrium sp., a heterotrophic protist which can yield about 40% (w/w) of DHA (Ratledge et al., 2010). Oil from Schizochytrium sp. has traditionally been used for improving animal feeds, but there is also a market push toward human nutritional supplements (Ratledge and Cohen, 2008). The thraustochytrid Ulkenia sp. utilized by Nutrinova GmbH (Germany) produces up to 46% (w/w) DHA. In contrast to phototrophic algae, thraustochytrids are grown heterotrophically in stainless steel fermenters using complex organic substances, including by-products from other processes (e.g. sugars, organic acids), as a sole carbon and energy source (Chang et al., 2014). The oils obtained from Schizochytrium and Ulkenia are defined as "novel foods" in the European Union (EFSA Panel on Dietetic Products, Nutrition and Allergies [NDA], 2014), and in 2017 Food Standards Australia New Zealand (FSANZ, 2019) approved Schizochytrium-derived DHA-rich oil for use in infant formula products (Schedule 25, Permitted novel foods). Since 2017, several GRAS notices have been approved for Schizochytrium-derived oils by the Food and Drug Administration (FDA), indicating a growing momentum in the utilization of microbe-derived oils.

In contrast to DHA, much less progress has been made in the utilization of phototrophic microalgae for developing an alternative to fish oil for EPA and other important fatty acids. High-quality EPA is found in marine microalgae across a number of classes including Bacillariophyceae (diatoms), Chlorophyceae, Chrysophyceae, Cryptophyceae, Eustigmatophyceae and Prasinophyceae (Wen and Chen, 2003). Although EPA production to date has focused on photoautotrophic growth, it is not yet economical; emerging production and processing technologies may nevertheless enhance EPA production in microalgae sufficiently to achieve market viability (Vazhappilly and Chen, 1998).

Whilst the microalgal industry does currently contribute biomass for food and nutrition, its scope is limited to a handful of algae and applications. Progress toward a greater utilization of algal products faces numerous technological challenges. Extensive screening and biological evaluation are needed to optimize production of specific metabolites and gain an understanding of how algal dietary value is affected by geographical region and growth season (Wells et al., 2017). Emerging technologies such as phenomics (section "Phenomics") are useful to survey multiple quantitative traits and to provide feedback on culture optimization. Furthermore, progress in cultivation technology (section "Algal Cultivation") and bioprocessing is needed to ensure such processes are economically viable and can compete with traditional and synthetic sources.

Further challenges include compliance with legislation, cost of production, and consumer perception. The latter will necessarily need to address issues regarding the association between microalgae and toxic cyanobacterial blooms, and their portrayal by the media. New algal strains without a documented history of safe consumption must be assessed and approved as "novel food" under EU and AUS legislative regulations (Sidari and Tofalo, 2019) or obtain a Food and Drug Administration (FDA) GRAS certification (Caporgno and Mathys, 2018), to be considered as future agricultural sources for food and nutrition. Factors such as health and nutritional benefits, taste, safety, freshness, and sustainability may encourage adoption of such products. Barriers such as lack of knowledge and familiarity must also be recognized before consumer confidence can be achieved. Future research will need to validate health benefits scientifically via robust clinical in vivo studies (i.e. randomized controlled trials), as well as directing efforts toward positive reinforcement between new microalgal-derived products and existing bio-products to help overcome negative consumer perceptions.

Thus, with efforts being made in emerging technologies such as phenomics and bioprocessing, microalgae are anticipated to become a promising future agricultural crop that caters to the increasing demands of future human and animal nutrition and for other high-value ingredients.

### Feedstocks

While the potential of algae as a next generation of biological resources is still emerging, some well-established industrial sectors are already routinely using them as feedstock. Among these, the aquaculture industry has used algae for the production of "aquafeed" for decades (Hemaiswarya et al., 2011). The rapid growth rates and balanced nutritional value of microalgae are ideal for aquafeed, and aquaculture production facilities commonly utilize microalgae either directly as live feed or indirectly as algal meal, consisting of the residual biomass left after extraction of lipids (Borowitzka, 1999).

The recent demand for algal meal is mainly driven by increasing consumer demand for more sustainable food products. Currently, a major proportion of conventional agriculture and aquaculture utilizes fishmeal, the crude flour obtained after cooking, drying, and grinding fish parts, which has a high protein and PUFA content and a relatively low production cost (approx. \$US1,500 per ton, source<sup>1</sup>). Fishmeal has been used historically as a feed for farmed seafood, poultry, and pigs, and even as a fertilizer (Hardy and Tacon, 2002). However, fishmeal is now widely recognized as unsustainable, as its production is largely based on by-catch, leading to depletion of ecosystems and the collapse of local fisheries. Therefore, more sustainable ingredients are increasingly considered as alternatives to fishmeal, including soybean meal (Alvarez et al., 2007), cottonseed meal, insect meal, legumes, and algae (Hardy and Tacon, 2002). While algal feed represents one of the most promising alternatives to fishmeal, because of its low land, freshwater and carbon footprints (Kim et al., 2019), matching its production with the low cost and large scale of conventional fishmeal production (6 to 7 million metric tons per annum) has proven challenging. In this respect, future developments in large-scale culture systems, as detailed in section "Algal Cultivation," but also in new business models incorporating multiple products (**Figure 2**), could help to solve these challenges and to achieve the potential of algal meal as an emerging feedstock. For example, large feeding trials have shown that, even after extraction of the vast majority of its PUFA content (which can be sold separately as a high-value nutraceutical, section "Food and Nutraceuticals"), the residual biomass of Nannochloropsis has great potential as aquafeed for Atlantic salmon, common carp and whiteleg shrimp (Kiron et al., 2012). This clearly demonstrates the potential of algal meal, especially when integrated in a multi-product biorefinery business model, as an emerging and viable feedstock (**Figure 2**).

The oldest and most natural application of algae for production of feedstock is as direct live feed. Indeed, algae have been widely used as direct live feed during juvenile stages of abalone, crustaceans, fish species and bivalves for decades (Benemann, 1992). Among these, bivalve hatcheries require more microalgal production than any other form of aquaculture, because bivalves are obligate filter-feeders throughout their entire life (Guedes and Malcata, 2012). Consequently, mass production of microalgae can account for >30% of a bivalve hatchery's operating costs, making it a major financial consideration for this sector of the aquaculture industry (Guedes and Malcata, 2012). Approximately 20 algal species were identified in the 1980s as the most suitable live feed for the aquaculture industry (Laing, 1987). Among these selected species, the genera Chaetoceros, Tisochrysis, Pavlova and Tetraselmis are considered some of the most suitable for the rearing of bivalves, with their specific size being one of the most important attributes (Guedes and Malcata, 2012). For example, Tisochrysis lutea (3–7.5 µm) is utilized throughout the production of many bivalves, from larvae and juveniles through to adults, primarily because its cells are of an appropriate size, but also because it is nutritionally valuable and robust in culture (Bendif et al., 2013). While T. lutea is suitable for all growth phases of bivalves, some other species such as the diatom Chaetoceros muelleri are slightly larger (5–8 µm) and therefore unsuitable for the early juvenile phase (Pacheco-Vega and Sánchez-Saavedra, 2009). As a result, the efficiency of feed profiles for bivalves has largely been optimized for the aquaculture industry by mixing a range of microalgae species that are nutritionally diverse and cover a range of sizes (Heasman et al., 2000). The biological differences between microalgae species mean that their photosynthetic demands (light and CO2) and nutritional requirements are likely to differ significantly (Ihnken et al., 2011). However, a similar set of standard growth conditions is generally applied to all cultures (Heasman et al., 2000). This "standardization" by the aquaculture industry implies that a largely un-optimized "one size fits all" culture system is employed, regardless of efficiency (Guedes and Malcata, 2012). Consequently, future biotechnological research (sections "Algal Cultivation," "Phenomics," and "Synthetic Biology") integrating (i) differences in optimal growth conditions between microalgae species, (ii) strain selection, and (iii) new cultivation technology, especially the next generation of nearly fully automated photobioreactors, is likely to increase microalgal yields to an extent that was previously unimaginable.

### High-Value Products

Most of the high-value products that are currently sourced from higher plants are also naturally produced by algae, or could be produced by algae through genetic engineering and synthetic biology. Given their vast diversity, microalgae naturally produce an extremely wide – and largely uncharacterized – range of natural products, potentially useful for human consumption and use. Some products are already synthesized efficiently, while the yields of others can be maximized to meet industrial requirements by the integration of advanced strain and bioprocess engineering (sections "Algal Cultivation," "Phenomics," and "Synthetic Biology"). Only a minute fraction of all algal species, consisting mostly of model species, has so far been profiled for its biochemical capabilities (Sasso et al., 2012). Therefore, the full potential of algae in this context can only be estimated. Besides the above-mentioned food supplements, pigments and PUFAs (section "Food and Nutraceuticals"), a substantial number of different high-value products are already being sourced from algae.

Plant biostimulants (PBs) are a heterogeneous class of compounds that include phytohormones, small molecules, and polymers, which are used to improve crop performance and protect crops from abiotic stresses (Drobek et al., 2019). Extracts and bioactive compounds derived from wild-type microalgae species are increasingly being explored as a source of PBs (Chiaiese et al., 2018). Although this application is still at an early stage, it could be developed and prove particularly advantageous if included in a multi-purpose algal biorefinery (**Figure 2**). As the biochemical composition of microalgae varies greatly depending on species and culture conditions, the potential of algae-based applications in this context could be leveraged by more detailed knowledge of algal biochemistry and physiology, to allow a better choice of species and growth conditions to promote the production of specific PBs (Drobek et al., 2019). On the other hand, the potential of engineered algal strains to enable the production of higher amounts or specific molecules is completely unexplored. Advanced genetics and synthetic biology techniques could soon enable the development of novel algal strains designed to produce specific metabolites, phytohormones, or peptides with PB activity.

<sup>1</sup>www.indexmundi.com

Microalgae can be genetically engineered to synthesize a myriad of high-value products. Terpenoids are the largest class of natural products and include countless bioactive plant secondary metabolites with applications as cosmetics, biofuels, nutraceuticals, and as life-saving pharmaceuticals in high demand (Vickers et al., 2017). As plant secondary metabolites, these compounds are typically produced in trace amounts; therefore industrial extraction ex planta requires very large quantities of biomass, with high economic and environmental costs. This could be averted with microalgae engineered to produce these chemicals in higher concentrations than is possible in plants (Arendt et al., 2016; Moses et al., 2017; Vavitsas et al., 2018; Lauersen, 2019). Cyanobacteria, for example, have been widely used for heterologous plant-derived terpenoid engineering, extensively reviewed in Chaves and Melis (2018) and Lin and Pakrasi (2019). Proof-of-concept works have unveiled the potential of engineered eukaryotic microalgae such as C. reinhardtii in producing high-value terpenoids, such as the food flavoring and aroma sesquiterpenoids patchoulol and (E)-α-bisabolene (Lauersen et al., 2016; Wichmann et al., 2018) and diterpenoids such as casbene, taxadiene, and 13R(+) manoyl oxide (Lauersen et al., 2018), as well as labdane diterpenoids (Papaefthimiou et al., 2019), which are relevant precursors of plant-derived therapeutic and cosmetic products. Similarly, the diatom P. tricornutum is currently being explored for similar applications and has demonstrated its potential in producing the triterpenoid lupeol and traces of betulin, precursors of the topoisomerase inhibitor betulinic acid (D'Adamo et al., 2019), which is commonly used in anticancer and antiviral pharmaceutical preparations and is naturally produced in trace amounts in the bark of plant species such as the white birch tree (Pisha et al., 1995).

Plant monoterpenoids are particularly challenging compounds to produce in conventional microbial hosts such as S. cerevisiae and E. coli because these organisms do not naturally accumulate pools of the precursor geranyl diphosphate (GPP) (Vickers et al., 2017). It has been recently demonstrated that P. tricornutum naturally accumulates cytosolic pools of GPP and that these can be efficiently converted into the monoterpenoid geraniol (0.309 mg/L) through the episomal expression of a heterologous plant geraniol synthase enzyme (Fabris et al., 2020). Geraniol has several commercial applications as a component of essential oils, a flavouring agent and an insect repellent, and is the key precursor of the monoterpenoid indole alkaloids (MIAs), a diverse group of bioactive plant metabolites that include the anticancer agents vinblastine and vincristine (Miettinen et al., 2014). This is a relevant demonstration that diatoms might harbour an important intrinsic advantage over conventional terpenoid production hosts for the synthesis of this challenging class of compounds, and that extrachromosomal episomes are suitable for metabolic engineering applications in diatoms, which is seminal for more complex engineering approaches.

Exciting progress in algal genetics and synthetic biology, including key technologies for the assembly and expression of multi-gene constructs and tools for targeted gene editing (section "Synthetic Biology"), as well as advances in metabolic systems biology, will rapidly enable the expression of more complex metabolic pathways (Slattery et al., 2018) and increase understanding of the resulting interactions with endogenous algal metabolism, resulting in tailored engineering efforts that will go beyond simple proofs of concept and result in industrially relevant product yields.

Other products that could be sourced from engineered algal strains include industrial recombinant enzymes (Rasala et al., 2012; Lauersen et al., 2013) and protein-based therapeutics (Rasala et al., 2010; Gimpel et al., 2015). Recombinant protein drugs are generally produced in microbes such as E. coli and yeasts, or in mammalian cell lines such as Chinese hamster ovary (CHO) cells. The latter, in particular, are associated with extremely high production costs, mostly due to complex growth media composition (US\$10 – 500/L) (Xu et al., 2017). Therefore, over the last two decades, increasing research effort has been put into developing robust, alternative production hosts. Plant- and algae-based expression systems are envisioned as a valid, low-cost solution for producing therapeutics in countries and areas that lack resources for costly mammalian-based fermentation systems (Taunt et al., 2018), with the advantage of being immune to most pathogens and contaminants that affect animal hosts (Specht and Mayfield, 2014). In this search, C. reinhardtii has been shown to be suitable for producing – predominantly in the chloroplast – functional recombinant therapeutics, including a fully assembled human antibody, immunoglobulin G (IgG) (Tran et al., 2009), vaccine subunits (Gregory et al., 2013), vaccine antigens (Demurtas et al., 2013), immunoconjugated cytotoxins for targeted cancer treatments (Tran et al., 2013), and single domain antibodies (VHH) (Barrera et al., 2015). Diatoms such as P. tricornutum have also been used to successfully and efficiently produce and secrete fully assembled antibodies (Hempel et al., 2011b; Hempel et al., 2017), while the silicified Thalassiosira pseudonana has been engineered for targeted drug delivery in vivo, by displaying a recombinant IgG binding domain on the silica frustules, turning the whole cell into a drug delivery vector effective on tumor models (Delalat et al., 2015).
The production of edible vaccines is another developing field where algae-based expression systems are finding a relevant niche (Specht and Mayfield, 2014), with particular relevance to the poultry and aquaculture industries. However, challenges will need to be addressed to make algae the preferred production hosts for therapeutics. In addition to the cultivation challenges already mentioned, the production of recombinant therapeutics in algae is currently hindered by overall low expression levels, and it is expected that developments in algal genetics and synthetic biology (section "Synthetic Biology") will enable more competitive yields. Strategies involving innovative genetic design – for example the insertion of intronic sequences in the transgene of interest – could be used to significantly improve the expression of recombinant proteins (Baier et al., 2018). Also, to be suitable for therapeutics production, algal production hosts need to exhibit the correct post-translational modifications, such as protein glycosylation, to avoid adverse immune reactions in inoculated animals. Little is known about the N-glycosylation properties of microalgae, but large-scale profiling of the glycosylation properties of diverse non-model species, together with genetic engineering, may allow algae to become a preferred production platform for glycoproteins (Mathieu-Rivet et al., 2014).

Building on these convincing proof-of-concept examples, the (enhanced) production of endogenous and heterologous high-value products in microalgae will benefit enormously from the technological developments reviewed in section "Technology Development." More complex synthetic biology approaches, in combination with detailed knowledge of novel or engineered strains from high-throughput phenomics approaches and advancements in cultivation technology (**Figure 1**), will address the main bottlenecks of low yields and upscaling, and open the doors to the cost-effective production of a much wider diversity of bioproducts. This will alleviate the environmental impact imposed by current practices involving inefficient bioproduct sourcing from plants or from other high-cost and less environmentally friendly production methods.

### Biopolymers, Bioplastics, and Bulk Chemicals

Demand for plastic and plastic-based products has grown significantly in the last few decades, which has placed a major strain on the remaining petrochemical resources of our planet. The increasing production of these petrochemical-based plastics has also generated concern regarding plastic pollution worldwide, mostly in marine ecosystems, due to their persistence in the environment as non-biodegradable materials (Tetu et al., 2019). Therefore, alternatives to petrochemical-based plastic sources are in high demand, as they would make plastic production sustainable while mitigating the issue of plastic pollution. Algae have the potential to be an economically viable feedstock for bioplastics production, as the biomass can be sold at US\$ 970/tonne, which is within the current standard range for other sources of bioplastics (US\$ 800 – 1200/tonne) (Beckstrom et al., 2020).

Microalgal biomass components such as starch, carbohydrates, and lipids can be converted into plastics (Noreen et al., 2016). There are currently three main approaches to producing bioplastics from microalgae: (i) direct use of microalgae as bioplastics, (ii) blending of microalgae with existing petroleum-based plastics or bioplastics, and (iii) genetic engineering of microalgae to produce bioplastic polymer precursors. In the first approach, Zeller et al. (2013) reported the production of bioplastics and thermoplastic blends directly from S. platensis and C. vulgaris, while Wang et al. (2016) described the preparation of thermoplastics by blending a heterogeneous population of planktonic algae. However, the most common approach to making microalgae-based bioplastics is to blend the biomass with existing petrochemical-based plastics, such as polyethylene, polypropylene, and polyvinyl chloride (PVC). Shi et al. (2012) described the processing of microalgae-corn starch-based thermoplastics using Nannochloropsis and Spirulina, and further blending with polyethylene and polypropylene. Chlorella sp. biomass was blended with polyethylene and polypropylene and was found to possess good thermoplastic processability because of the presence of natural cellulosic-type materials (Zhang et al., 2000a). The properties and processing of PVC-Chlorella composites have also been reported (Zhang et al., 2000b).

With the increasing demand for bioplastics in the market, considerable research effort has been directed toward investigating the blending of algal biomass with other bio-derived plastic components. A recent study reported the addition of green, brown, and red algal biomass to polylactic acid plastics (Bulota and Budtova, 2015) with no pre-treatment other than drying and sieving. Polyhydroxyalkanoates (PHAs), among the most widely studied biodegradable polyesters, with high mechanical strength and melting point, are naturally produced in certain bacteria, including some cyanobacteria (Sudesh et al., 2000). Microbial production of PHAs generally occurs under stressful environmental conditions (Bassas et al., 2008; Balaji et al., 2013). PHAs are generally extracted in three successive steps: disruption of the cells (by chemical, physical or biological treatment), recovery of the PHAs, and purification (Fiorese et al., 2009). With the continuous increase in interest in PHA production, metabolic engineering and synthetic biology (section "Synthetic Biology") could enable the heterologous synthesis of PHA precursors in eukaryotic microalgae, as demonstrated in diatoms (Hempel et al., 2011a).

Biomass-derived chemicals, such as 5-hydroxymethylfurfural (5-HMF), levulinic acid, furfurals, sugar alcohols, lactic acid, succinic acid, and phenols, are considered platform chemicals. These platform chemicals are used for the production of a variety of important chemicals on an industrial scale (Kohli et al., 2019). Bio-based bulk chemicals possess a clear substitution potential for fossil oil-based bulk chemicals. However, current biomass feedstocks for industrial use are typically derived from plant material, posing challenges such as destruction of rainforests, competition with food production, and other adverse environmental impacts. Microalgae, with their superior areal productivity compared to traditional agricultural crops and high concentrations of lipids, carbohydrates, and proteins, have emerged as an attractive alternative candidate for the production of bulk chemicals, including bio-based platform chemicals and bio-based solvents (Wijffels et al., 2010). Catalytic valorization is an emerging field that can be applied to the production of value-added chemicals from microalgae. Even though the technology readiness for commercialization is still a challenge, the field is active, with several research groups working on algae and catalytic systems for the conversion of algal biomass to value-added platform chemicals. For example, Chlorococcum sp. was reported to be converted into 1,2-propanediol (1,2-PDO) and ethylene glycol (EG) in water over nickel-based catalysts (Miao et al., 2015), while the hydrolysis of Scenedesmus sp. over the Sn-Beta catalyst was used to produce lactate (Zan et al., 2018). This was achieved via formic acid-induced controlled-release hydrolysis, with a yield of 83%. Another recent study demonstrated the conversion of algal polysaccharides from Porphyridium cruentum and C. vulgaris to monosaccharides, HMF, and furfural in the neat deep eutectic solvent (DES) or in the biphasic system ChCl/oxalic acid/methyl isobutyl ketone (Bodachivskyi et al., 2019).

Microalgal biomass from which the lipids have already been extracted is a good source of carbohydrates. The reported carbohydrate content is up to 80% of the cell mass, so this biomass could be hydrolyzed to generate fermentable sugars. A recent study reported the hydrolysis of lipid-extracted C. vulgaris biomass using solid acid catalysts to obtain monosaccharides such as glucose, galactose, xylose, rhamnose, and mannose, along with 2,3-butanediol (Seon et al., 2019). These monosaccharides can be used for microbial fermentation to produce many useful products, such as lactic acid, hydrogen gas, and ethanol. 2,3-Butanediol is a value-added chemical with great potential for the industrial production of synthetic rubber, plastic, and biosolvent (Soo-Jung et al., 2017; Seon et al., 2019). In another study, microalgal hydrolysate from C. vulgaris was converted into ethanol via continuous immobilized yeast fermentation at a yield of 89% (Kim et al., 2014).

Several challenges will need to be addressed in terms of the low product yield and relatively high costs of such biochemical conversion processes. However, the integration of these applications in a multi-product biorefinery approach (**Figure 2**) could improve the overall economic feasibility of bioplastic and bulk chemical production from microalgal biomass.

### Algal Biodegradation of Emerging Contaminants

Emerging contaminants (ECs) are primarily synthetic organic chemicals, such as pharmaceuticals, herbicides, pesticides, and flame retardants, whose presence in the environment is of concern due to their potential risks to ecosystems and human health at environmentally relevant concentrations (Petrie et al., 2015; Tran et al., 2018; Sutherland and Ralph, 2019). There is increasing concern over the presence of ECs in agricultural land- and water-scapes. With climate change and expanding populations, accumulating ECs due to agricultural intensification and increased water reuse could lead to unpredictable long-term consequences for humans and the environment (Martinez-Piernas et al., 2018). While direct application can be managed through improved on-farm best management practices, indirect application is reliant on improvements in wastewater treatment that would reduce, transform, or eliminate ECs.

Wastewater treatment using microalgae for nutrient removal is a well-established technology that has lower capital and operational costs, and is more efficient than traditional wastewater treatment systems (Benemann, 2008; Craggs et al., 2012). However, there have been few studies to date on the use of microalgae for bioremediation of ECs despite their potential for detoxifying organic and inorganic pollutants. Coupling of nutrient and EC removal by microalgae has the potential to provide more cost-effective and efficient wastewater treatment as well as meeting both environmental and human health protection goals (Sutherland and Ralph, 2019).

While still in its infancy, microalgal biodegradation provides one of the most promising technologies to transform, neutralize, or eliminate ECs from agricultural runoff. Unlike other remediation techniques, such as activated carbon adsorption filters, which simply concentrate the EC and move it from one environment to another, biodegradation involves the transformation of complex compounds into simpler breakdown molecules through catalytic metabolic degradation (Sutherland and Ralph, 2019). Microalgal degradation of ECs can occur via two main mechanisms. The first mechanism involves direct metabolic degradation of the EC by the microalga. In this case, the microalga employs mixotrophic growth strategies and the EC serves as the carbon source or electron donor/acceptor (Tiwari et al., 2017). The second mechanism involves indirect degradation, or co-metabolism, where the EC is degraded by enzymes that are catalyzing reactions of other substrates present (Tiwari et al., 2017). Microalgae possess a large number of enzymes that play a role in cellular protection through the deactivation and/or degradation of a range of organic compounds that induce cellular stress in microalgae (Wang et al., 2019). Microalgal degradation of ECs relies on a complex enzymatic process involving a number of enzymes, including: superoxide dismutase, catalase, glutamyl-tRNA reductase, malate/pyruvate dehydrogenase, mono(di)oxygenase, pyrophosphatase, carboxylase/decarboxylase, dehydratase, alkaline and acid phosphatase, transferase, and hydrolases (Elbaz et al., 2010; Xiong et al., 2018; Wang et al., 2019). Several of these enzymes, including superoxide dismutase and catalase, have shown increased activity in several freshwater microalgal species when the cells were exposed to the veterinary antibiotics florfenicol and ofloxacin (Wang et al., 2019).

In one bioremediation study, the green algae Scenedesmus obliquus and Chlorella pyrenoidosa were found to enzymatically degrade progesterone and norgestrel by reduction (hydrogenation), hydroxylation, oxidation (dehydrogenation) and side-chain breakdown (Peng et al., 2014). In another study, co-metabolic removal of the antibiotic ciprofloxacin by the green alga Chlamydomonas mexicana was observed, but the enzymatic mechanisms involved in its metabolism were not identified (Xiong et al., 2017). Due to the complexity of enzymatic biodegradation processes, simply screening microalgal strains for EC biodegradation activity remains the most viable strategy for developing new bioremediation strains (Sutherland and Ralph, 2019).

One of the challenges with screening microalgae for EC biodegradation is the large number of both ECs and microalgal species. Currently, there are approximately 200 known ECs in the environment, while there are thousands of recognized algal species (Pradhan and Rai, 2001; Guiry, 2012). Therefore, there is a need for the development of cost-effective high-throughput screening methods that allow for rapid screening of a wide range of microalgal species against a wide range of ECs. A microalgal phenomics facility (section "Phenomics," **Figure 1b**) would provide the necessary cost-effective and efficient high-throughput screening to help rapidly develop microalgal biodegradation technology.
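The scale of this screening problem can be illustrated with a short calculation. The Python sketch below counts the strain-by-contaminant pairs in a full factorial screen and the number of microplates such a screen would need. Only the figure of roughly 200 ECs comes from the text; the species count (2,000), the number of replicates (3), and the 96-well plate layout with 8 control wells are illustrative assumptions, not values from the cited studies.

```python
import math


def screening_load(n_contaminants, n_strains, replicates=3,
                   wells_per_plate=96, control_wells=8):
    """Return (pairs, assays, plates) for a full strain x contaminant screen.

    All defaults are hypothetical planning values, not published protocol
    parameters.
    """
    if control_wells >= wells_per_plate:
        raise ValueError("control wells must leave room for assay wells")
    pairs = n_contaminants * n_strains       # every strain against every EC
    assays = pairs * replicates              # replicate wells per pair
    usable = wells_per_plate - control_wells # wells left after controls
    plates = math.ceil(assays / usable)      # round up to whole plates
    return pairs, assays, plates


# ~200 ECs (from the text) against an assumed 2,000 candidate strains
pairs, assays, plates = screening_load(n_contaminants=200, n_strains=2000)
print(f"{pairs:,} pairs, {assays:,} assays, {plates:,} plates")
```

Even under these modest assumptions the screen runs to hundreds of thousands of pairs, which is why manual, strain-by-strain testing is unlikely to scale and automated phenotyping is proposed instead.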

Another challenge with screening microalgae for EC biodegradation is that the enzymes responsible for degrading the EC may not be active at the time of screening (Sutherland and Ralph, 2019). This is because both the production and maintenance of these complex enzymes are metabolically expensive, which comes at the cost of growth and reproduction of the cell (Sutherland and Ralph, 2019). For example, both the cellular energy budget and growth rates were significantly reduced in the microalga Raphidocelis subcapitata following the induction of superoxide dismutase production in cells exposed to four different antibiotics (Aderemi et al., 2018). For some microalgae, pre-acclimation to sub-toxic concentrations of the EC may be required to initiate enzyme production in order to screen for biodegradation potential (Sutherland and Ralph, 2019). For example, microalgal biodegradation of several different antibiotics was enhanced following pre-exposure of the microalgal strain to low levels of the antibiotic, due to increased production of antioxidants, including xanthophylls, by the cells (Chen et al., 2015; Xiong et al., 2017). Biodegradation may also lead to intermediary products that could be as toxic as, or more toxic than, the parent compound. Identification of the breakdown products with specific assays, coupled with toxicological screening, is an important step that needs to be included in microalgal biodegradation assessments.

For microalgal species with demonstrated biodegradation capability, the induction of elevated Phase I and Phase II enzyme production can further enhance the EC degradation process, both improving its efficiency and effectiveness. This can be induced through genetic means, such as synthetic biology, targeted gene editing, or genetic engineering. For example, Zhang et al. (2018) used random mutagenesis and site-directed mutagenesis to increase the production of the degrading enzyme, laccase, by 31 to 37-fold in the white-rot fungus Cerrena unicolor BBP6. Similar approaches could be used to increase the biocatalytic activities of microalgal laccases.

Synthetic biology approaches (section "Synthetic Biology," **Figure 1a**) can be used to engineer microalgae to overexpress entire artificial degradation pathways that include enzymes such as fungal laccases, peroxidases, cellulases, and ligninases, to further increase the potential of algal bioremediation. These pathways can be expressed in the host either in the same configuration as in the source organism or in new-to-nature combinations, picking enzymes from multiple organisms and assembling new degradation pathways, both by rational design and by random/combinatorial assembly and screening (Tay et al., 2017). While there are currently limited studies on genetic engineering of microalgae for bioremediation purposes, Chiaiese et al. (2011) successfully demonstrated expression of the fungal laccase POX A1b in the green alga Chlorella emersonii, which enhanced microalgal biodegradation of phenols by up to about 40%. However, while genetically engineering microalgae for enhanced biodegradation appears promising, the potential environmental risks intrinsic to the use of genetically modified organisms (GMOs), which would limit their application in outdoor settings, need to be evaluated (Szyjka et al., 2017). In addition, for many countries, legislation restricting or totally banning the use of GMOs means that transgenic microalgae for EC biodegradation would not be a viable option at present.

While microalgae have a demonstrated ability to biodegrade ECs associated with agricultural practises, further research is needed to exploit microalgal biodegradation through enhanced enzyme expression and optimized growth conditions. When coupled with nutrient removal systems, such as HRAPs, microalgal treatment of ECs can be a cost-effective, viable option for the reduction of contaminant pollution in waterways (Sutherland and Ralph, 2019).

# CONCLUSION

Agriculture is one of the most ancient human practices and it has always been essential to our civilization. Agriculture and human society have co-evolved, reciprocally influencing each other. Over millennia humans isolated, bred, and generated new species to satisfy needs that have been steadily increasing in size and diversity. In modern times, agricultural technology has seen impressive improvements in yield, efficiency, and product differentiation thanks to developments in cultivation technology, genetics, and phenomics. Although algae-derived applications have been present in human history, the push to develop these organisms as industrial resources is a very recent objective. Compared to conventional agricultural crops, algae-based practices are an extremely young application field, and all current industrial algal strains are relatively uncharacterized. However, decades of foundational research on algal biochemistry and physiology (not reviewed here) may be leveraged to expedite the use of algae in biotechnology (Hildebrand et al., 2013). Efforts to progress the understanding of diverse algal traits have recently been bolstered by the advent of genome sequencing projects and functional genetic tools, revealing novel aspects of algal metabolism relevant to industrial applications (Moellering and Benning, 2010; Allen et al., 2011; Fabris et al., 2012; Kirst et al., 2012; Radakovits et al., 2012; Fabris et al., 2014; Abbriano et al., 2018; Luo et al., 2018; Pollier et al., 2019; Smith et al., 2019). Moreover, it is expected that knowledge of algal traits will be increasingly generated by the implementation of advanced synthetic and molecular biology approaches combined with phenomics. Presently, however, the relatively few algal species employed in commercial applications largely consist of natural isolates with minimal selection, breeding or genetic engineering (if any) to better perform in industrial settings or for improved yields.
Despite this, as illustrated by the achievements highlighted in this review, algae already find applications in many industrial fields and sectors, often with the clear potential of replacing more energy-, cost-, and environmentally intensive solutions. Evaluated from this perspective, the current progress and achievements of algal biotechnology and industry are both impressive and encouraging, and this needs to be taken into account when drawing the trajectory of future developments in this field. The emerging technologies that we described will drastically accelerate the process of industrialization of algae, providing knowledge and tools to deliver highly productive, algae-based solutions to a diversity of societal needs. This includes a deeper understanding of algal biology, genetics, and biochemical capabilities, which will drive the optimization of both the organisms and the environment in which they are cultivated. This will allow, in the near future, the move toward ad hoc, highly productive strains, either as novel natural isolates or genetically engineered strains, and efficient cultivation systems with minimal environmental impact. We envision that high-tech algae-based solutions will find applications in almost every industrial sector, including ones essential to meeting the increasing needs of human society, such as food, pharmaceutical and bulk chemicals manufacture, while ensuring minimal environmental impact and lower production costs. The development of highly efficient algal biorefineries ('T Lam et al., 2018; **Figure 2**) will allow co-sourcing of different products, minimizing waste, maximizing productivity, and improving the economics of otherwise inefficient processes. As such, we anticipate that the progress of algae biotechnology will have a disruptive effect on the current industrial landscape, and will prompt the emergence of a scalable, sustainable, and efficient algae-based bio-economy, which will be key in overcoming challenges and limitations that conventional agriculture will face in the years ahead.

### AUTHOR CONTRIBUTIONS

MF, RA, AC, DS, MP, LL, JM, UK, PaR, CH, and TK wrote the manuscript. All authors read and edited the manuscript.

### FUNDING

This work was supported by the Climate Change Cluster (C3) of the University of Technology Sydney (UTS, Australia) and the CSIRO Synthetic Biology Future Science Platform. MF is supported by a CSIRO Synthetic Biology Future Science Platform Fellowship, co-funded by UTS and CSIRO. LL is supported by an Australian Research Council Linkage grant, co-sponsored by GE Healthcare.






**Conflict of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2020 Fabris, Abbriano, Pernice, Sutherland, Commault, Hall, Labeeuw, McCauley, Kuzhiuparambil, Ray, Kahlke and Ralph. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Application of Transposon Insertion Sequencing to Agricultural Science

### Belinda K. Fabian<sup>1,2</sup>, Sasha G. Tetu<sup>1,2</sup>\* and Ian T. Paulsen<sup>1,2</sup>\*

<sup>1</sup> ARC Centre of Excellence in Synthetic Biology, Macquarie University, Sydney, NSW, Australia, <sup>2</sup> Department of Molecular Sciences, Macquarie University, Sydney, NSW, Australia

Many plant-associated bacteria have the ability to positively affect plant growth and there is growing interest in utilizing such bacteria in agricultural settings to reduce reliance on pesticides and fertilizers. However, our capacity to utilize microbes in this way is currently limited due to patchy understanding of bacterial–plant interactions at a molecular level. Traditional methods of studying molecular interactions have sought to characterize the function of one gene at a time, but the slow pace of this work means the functions of the vast majority of bacterial genes remain unknown or poorly understood. New approaches to improve and speed up investigations into the functions of bacterial genes in agricultural systems will facilitate efforts to optimize microbial communities and develop microbe-based products. Techniques enabling high-throughput gene functional analysis, such as transposon insertion sequencing analyses, have great potential to be widely applied to determine key aspects of plant-bacterial interactions. Transposon insertion sequencing combines saturation transposon mutagenesis and high-throughput sequencing to simultaneously investigate the function of all the nonessential genes in a bacterial genome. This technique can be used for both in vitro and in vivo studies to identify genes involved in microbe-plant interactions, stress tolerance and pathogen virulence. The information provided by such investigations will rapidly accelerate the rate of bacterial gene functional determination and provide insights into the genes and pathways that underlie biotic interactions, metabolism, and survival of agriculturally relevant bacteria. This knowledge could be used to select the most appropriate plant growth promoting bacteria for a specific set of conditions, to formulate crop inoculants, or to develop crop protection products.
This review provides an overview of transposon insertion sequencing, outlines how this approach has been applied to study plant-associated bacteria, and proposes new applications of these techniques for the benefit of agriculture.

Keywords: biocontrol, plant growth promoting bacteria, fertilizer, microbiome, pesticide, transposon insertion sequencing, transposon mutagenesis

### Edited by:

Nicola Colonna, Italian National Agency for New Technologies, Energy and Sustainable Economic Development (ENEA), Italy

### Reviewed by:

Wusirika Ramakrishna, Central University of Punjab, India Pedro Carrasco, University of Valencia, Spain

### \*Correspondence:

Sasha G. Tetu sasha.tetu@mq.edu.au Ian T. Paulsen ian.paulsen@mq.edu.au

### Specialty section:

This article was submitted to Crop and Product Physiology, a section of the journal Frontiers in Plant Science

Received: 19 October 2019 Accepted: 26 February 2020 Published: 18 March 2020

### Citation:

Fabian BK, Tetu SG and Paulsen IT (2020) Application of Transposon Insertion Sequencing to Agricultural Science. Front. Plant Sci. 11:291. doi: 10.3389/fpls.2020.00291

### INTRODUCTION

A major component of agricultural ecosystems is the microbiota (bacteria, archaea, protists, fungi, and viruses) present on plants and in soils. Interactions between microbes and plants can be detrimental, commensal, or favorable (Buée et al., 2009). The benefits provided by microbes in agricultural systems include breaking down organic matter, fixing nutrients, making nutrients available for plant use, nutrient recycling, defense against plant pathogens and improving abiotic stress tolerance (Naik et al., 2019). To survive and thrive in their niche, plant-associated microbes need to be able to evade host defenses, use the nutrients available from the host and successfully compete with other microbes (Gonzalez-Mula et al., 2019).

To maximize crop production and minimize losses due to biotic and abiotic stresses it is increasingly clear that we need to take advantage of the benefits that beneficial microbes can provide. However, this is currently difficult to achieve with only a limited understanding of microbial gene functions and their interactions with plants on a molecular level. An increasingly common approach when studying agricultural microbiota is to ascertain microbial community composition using metagenomics (Shelake et al., 2019). This allows for the collection of genetic information about more members of the community than just the microbes that are able to be cultivated (Levy et al., 2018). Decreasing sequencing costs have facilitated the rapid generation of vast amounts of genomic data. However, collecting this sequence information does not readily translate into understanding what functions are being performed within a microbial community and what genes are involved (van Opijnen and Camilli, 2013; Vorholt et al., 2017).

The traditional methods for experimental demonstration of gene function, such as gene knockouts and over-expression, happen at a much slower pace than the rate at which genomic information is accumulating (Chang et al., 2016). The laborious and often problematic nature of this gene function characterization (see section "Saturation Transposon Insertion Mutagenesis and High Throughput DNA Sequencing") means only a small proportion of genes in a small number of microbes from agricultural systems have been experimentally investigated (Price et al., 2018a). For microbes that have not been experimentally characterized, sequence homology, genome location or similarity of domains are often used to infer gene function, but predictions of gene or organism function generated by these computational approaches are often erroneous, especially when the closest characterized relative is separated by a large phylogenetic distance (Mutalik et al., 2019). There are many circumstances where even these methods geared at prediction of gene function do not yield any functional information and genes remain annotated as hypothetical proteins. The combination of these issues results in the functions of many microbial genes remaining unknown or poorly understood (Van Opijnen and Camilli, 2012; Jarboe, 2018). This currently limits our understanding of the role many microbes play in agricultural and other systems and hampers both the interpretation of microbiome survey data and efforts to formulate optimal plant-associated microbial communities (Shelake et al., 2019).

Most of what is currently known about genes involved in beneficial plant interactions has been determined by comparing genomes and individually testing genes that correlate with a phenotype of interest (Levy et al., 2018). A high throughput method of gathering gene functional information and relating genotype to phenotype is required to rapidly advance our understanding of the complex molecular interactions that occur between microbes and plants (van Opijnen and Camilli, 2013; Perry et al., 2016) and move beyond cataloging microbial community members toward a mechanistic understanding of agricultural microbiomes (Poole et al., 2018).

# THE NEED FOR ALTERNATIVES TO PESTICIDES AND FERTILIZERS

To provide food and fiber for the rapidly growing global population, crop yields need to increase dramatically (Godfray et al., 2012; Pardey et al., 2014). In the past, agricultural intensification has been enabled by the use of pesticides, fertilizers, machinery, and increased plowing depth and sowing density. While this has previously been adequate to increase crop yields and meet global demand (Emmerson et al., 2016), these methods are no longer delivering sufficient increases in yields and are contributing to biodiversity declines, which can have flow-on effects for agricultural yields, for example through the loss of pollination services that are critical for food production (Gallai et al., 2009; Tscharntke et al., 2012). Declines in the rates at which crop yields are increasing, the changing climate and abiotic stresses (especially water availability and salinity) mean that new approaches are necessary to meet demand for food and fiber (Ray et al., 2012; Backer et al., 2018). The major ways to meet this demand while holding agricultural land area stable are to increase crop yields and/or dramatically reduce the quantity of crops lost to disease (Phalan et al., 2011; Godfray et al., 2012).

One path to increasing biomass accumulation and crop yields is the addition of fertilizers to crop lands. When fertilizers with elevated levels of nitrogen and phosphorus are added to fields they can create an imbalance in the inorganic nutrients present in the soil (Gosal et al., 2018). When phosphorus is added in this way, a large proportion binds to soil particles (immobilization) and is not biologically available for plant use (Khan et al., 2007). Excessive use of fertilizers can lead to contamination of surrounding lands through run-off, changes to the physico-chemical properties of soil and detrimental impacts on the native soil microbiota, which can have flow-on effects for the resilience of agricultural soils and crop yields (Jangid et al., 2008; Santhanam et al., 2015; Naik et al., 2019). For these reasons, there is increasing recognition that alternatives to inorganic fertilizers are needed (Cordell et al., 2009).

Every year 20–40% of global food production is lost to plant pests and diseases despite efforts to control crop diseases using pesticides and other crop management techniques (FAO, 2016). Crop diseases are mainly controlled through the use of pesticides and farming practices, such as crop rotation, integrated pest management and the development of disease resistant crop varieties (Syed Ab Rahman et al., 2018). Continuously using pesticides as the major weapon against crop pathogens ultimately results in the pathogens developing resistance and pesticides losing their effectiveness. This has been observed in Zymoseptoria tritici, the fungal causative agent of Septoria tritici blotch, which has progressively evolved resistance to fungicides in Europe since the early 2000s, and since 2011 in Tasmania, Australia (McDonald et al., 2019). The Fungicide Resistance Action Committee (FRAC), a pesticide industry expert group, lists over 429 instances of plant pathogens becoming resistant to fungicides (Fungicide Resistance Action Committee [FRAC], 2018).

In many instances, farmers have to apply higher doses and multiple pesticides to combat this rising resistance; eventually plant pathogens will no longer be controllable using existing pest control methods (Lucas et al., 2015). In addition to rising pathogen resistance, the application of pesticides to crops may cause detrimental off-target effects for many other members of these ecosystems. Commensal and beneficial microbes are often killed and insects that are protecting crops from herbivory or providing essential services such as pollination are damaged or removed from these habitats (Bünemann et al., 2006; Geiger et al., 2010). With the combination of the slowing rate of discovery and development of new pesticides (Borel, 2017), resistance developed by pathogens and the harm pesticides cause to the environment it is widely agreed that new methods of disease control are needed (Syed Ab Rahman et al., 2018).

# BOOSTING CROP YIELDS VIA APPLICATION OF BENEFICIAL BACTERIA

Many plant-associated bacteria are now understood to have the capacity to positively affect plant growth and there is growing interest in utilizing such bacteria in agricultural settings to reduce reliance on pesticides and fertilizers (Naik et al., 2019). These plant growth promoting bacteria colonize the surface of plants and the thin layer of soil surrounding the plant roots (rhizosphere; Zhang et al., 2017). These bacteria use the nutrients exuded from plant roots to fuel their growth and in turn can stimulate plant growth (Buée et al., 2009).

Plant growth is promoted by such bacteria in a number of direct and indirect ways (**Figure 1**). Firstly, beneficial bacteria can assist with biofertilization, increasing the bioavailability of nutrients for plant use. Examples include atmospheric nitrogen fixation (Ke et al., 2019), solubilizing phosphate (Arkhipova et al., 2019; Wu et al., 2019) and the production of iron binding siderophores which can be taken up by plants (Flores-Félix et al., 2015). The second way rhizobacteria can stimulate plant growth is through the production of phytohormones (also referred to as plant growth regulators). These hormones can stimulate plant growth in the same way as endogenous plant hormones. For example, the endophytic bacterium Sphingomonas sp. LK11 releases gibberellins, which increase tomato plant shoot length and root weight (Khan et al., 2014). Thirdly, beneficial bacteria can ameliorate the effects of abiotic stresses and increase plant tolerance of them. The presence of Pseudomonas putida AKMP7 has been shown to promote wheat growth under heat stress (Ali et al., 2011) and under waterlogged conditions Achromobacter xylosoxidans Fd2 induces waterlogging tolerance which results in 46.5% higher yield from Holy basil plants (Ocimum sanctum; Barnawal et al., 2012). The final direct method of promoting plant growth is rhizoremediation. This is when soil bacteria degrade pollutants and attenuate the effects of toxins in the soil. The removal of these contaminants from the environment reduces the stress on the plant and results in increased plant growth (Correa-Garcia et al., 2018). For example, rice growth in heavy metal contaminated soils is boosted by cadmium-resistant Ochrobactrum sp. and lead- and arsenic-resistant Bacillus sp. (Pandey et al., 2013).

The indirect method for promoting plant growth is when beneficial microbes are used to control plant pathogens, an eco-friendly and sustainable process termed biocontrol (Handelsman and Stabb, 1996). Biocontrol bacteria reduce the impacts that pathogens have on plants through three interacting mechanisms (**Figure 1**). Firstly, the beneficial microbes grow in the same spaces as the pathogens and compete with them for nutrients and niches. For example, the biocontrol bacterium Chryseobacterium sp. WR21 controls bacterial wilt disease by competing more effectively for tomato root exudates than the pathogenic bacterium Ralstonia solanacearum (Huang et al., 2017). The second method for reducing the impact of plant pathogens is through the induction of plant systemic resistance. For example, Bacillus amyloliquefaciens SQR9 primes the immune system of Arabidopsis against the pathogens Pseudomonas syringae pv. tomato DC3000 and Botrytis cinerea (Wu et al., 2018). The final technique used by plant growth promoting microbes is antibiosis through the production of antibiotics or metabolites. For example, the biocontrol bacterium Pseudomonas protegens Pf-5 produces the antibiotic pyoluteorin to protect cotton plants against infection by the oomycete pathogen Pythium ultimum (Howell and Stipanovic, 1980).

Consistent and reliable performance in protecting crops against pathogens is an essential characteristic of any plant growth promoting bacterium before it can be considered for development into a commercial crop protection or crop enhancement product. As a commercial product a bacterial inoculum will be applied in a broad range of settings with diverse physico-chemical parameters. Beneficial bacteria need to not only cope with a range of environments and climatic conditions, but actively enhance plant growth under such conditions. In addition to colonizing the crop host, the bacteria need to compete with existing soil microbiota for resources and be safe for the environment (Rosconi et al., 2016). The bacterial inoculum may be applied in conjunction with chemical pesticides and fertilizers, so it is advantageous if it is able to survive and maintain plant growth promoting activity in the presence of these additions. It must also be able to survive packaging, transport and storage before it is applied to a crop (Glick, 2015; Backer et al., 2018). Ideally, the biocontrol agent should also be able to tolerate a range of environmental stresses, for example, drought, salinity, temperature or heavy metal pollution (Compant et al., 2019).

Over 30 bacterial strains have been successfully commercialized into crop inoculants (not including the many Rhizobia spp. used for legume crop inoculation; Glick, 2015), but there are many more beneficial strains of bacteria that have potential for development. Even for the beneficial bacteria that have been developed into commercial biocontrol agents, there is often no strong functional understanding of their mechanisms of action (Bardin et al., 2015). There is a clear need for high throughput functional analyses to investigate beneficial bacteria with potential for development into commercial agents (Backer et al., 2018). This would permit much faster identification of the metabolic pathways they use to gather resources, their modes of action in a variety of environments and hosts, and the roadblocks that are limiting their current application.

# SATURATION TRANSPOSON INSERTION MUTAGENESIS AND HIGH THROUGHPUT DNA SEQUENCING

To further our understanding of plant-microbe and microbe-microbe interactions, a faster method of determining microbial gene function is needed. The traditional methods of ascribing functions to genes (for example, making single gene knockouts or heterologous expression) are laborious and very slow as mutants are screened individually (Royet et al., 2019). In addition, these methods can be problematic, for example, gene expression in a heterologous bacterial host may exhibit an altered phenotype due to the different genomic context of the original organism (Levy et al., 2018). These methods are also reductionist; mutants are studied in isolation, meaning that interactions between genes cannot be identified and genes may not appear important in pure culture even though they may be crucial in vivo (Poole et al., 2018).

In the last 10 years, the combination of saturation transposon mutagenesis with high throughput sequencing has enabled identification of the suite of genes and pathways that are important for plant growth promotion in a single experiment (Cole et al., 2017). The fundamental element that underlies this methodology is the transposon; a class of genetic elements that can move to different locations within a genome (Hoffman and Jendrisak, 2002). When a transposon inserts into a genome it may interrupt or modify the function of a gene or regulatory element resulting in a change in the organism's phenotype, allowing phenotypic changes to be linked to specific gene disruptions (Hoffman et al., 2000; Reznikoff, 2008).

The first step in transposon insertion sequencing is generating a saturated mutant library by using a transposable element to generate a large population of cells with mutations at many locations throughout the genome (Barquist et al., 2013). The choice of transposable element depends on the organism; the most commonly used are the Tn5 and mariner transposons (reviewed in Chao et al., 2016). Tn5 transposons insert randomly into the genome, while mariner transposons insert at TA sites. TA site occurrence is relatively regular across the genome but can vary at local scales. Knowing how many possible mariner transposon insertion sites there are in a genome allows for statistical calculations of transposon insertion saturation, which is not possible in Tn5 based mutant libraries. On the other hand, as Tn5 transposons do not require a specific site for insertion into the genome, these libraries can potentially have higher transposon insertion density (Chao et al., 2016).
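The saturation calculation described above for mariner libraries can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a published pipeline: the function names `ta_site_positions` and `mariner_saturation`, the toy sequence and the insertion coordinates are all hypothetical, and a real analysis would read the genome from a sequence file and take insertion coordinates from mapped reads. Because TA is its own reverse complement, scanning one strand is sufficient to find every site.

```python
def ta_site_positions(genome: str) -> list:
    """Return the 0-based start position of every TA dinucleotide."""
    seq = genome.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "TA"]


def mariner_saturation(genome: str, insertion_positions: set) -> float:
    """Fraction of possible TA sites that carry at least one insertion."""
    sites = set(ta_site_positions(genome))
    if not sites:
        return 0.0
    # Only insertions that coincide with a TA site are counted.
    return len(sites & insertion_positions) / len(sites)


# Hypothetical toy data for illustration only.
toy_genome = "ATTAGCTAGGTACCTATA"
observed = {2, 6, 14}

print(ta_site_positions(toy_genome))
print(mariner_saturation(toy_genome, observed))
```

For a Tn5 library there is no equivalent fixed denominator, which is why the same calculation cannot be made.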

A transposon insertion into an essential gene is most often fatal and mutants of these genes will not be included in the resulting mutant library (essentiality is relative to growth conditions; Barquist et al., 2013). Insertions within non-essential genes result in recoverable cells which may show phenotypic or fitness differences under some growth conditions (Hoffman et al., 2000). The goal of this approach is to generate a saturated mutant library of cells which collectively have transposon insertions throughout the genome, enabling assessment of the importance of each gene to fitness under a range of testable growth conditions (Barquist et al., 2016). For example, simultaneous fitness assessment of the hundreds of thousands of mutants in a library is possible by subjecting the mutant library to a defined condition such as exposure to a pesticide or colonization of a host (Chao et al., 2016; Calero et al., 2017; **Figure 2**).
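The idea that essential genes are absent from the library can be illustrated by counting insertions per gene and flagging genes with none. The sketch below is deliberately naive and rests on assumptions: gene names, coordinates, the `max_density` cut-off and the function names are hypothetical. Published analyses use statistical models that account for gene length and local insertion-site density, since a short gene can lack insertions by chance alone.

```python
from bisect import bisect_left, bisect_right


def insertions_per_gene(genes: dict, insertion_positions: list) -> dict:
    """Count insertions inside each gene; genes maps name -> (start, end), inclusive."""
    ordered = sorted(insertion_positions)
    return {
        name: bisect_right(ordered, end) - bisect_left(ordered, start)
        for name, (start, end) in genes.items()
    }


def candidate_essential_genes(genes: dict, insertion_positions: list,
                              max_density: float = 0.0) -> list:
    """Flag genes whose insertions per base pair do not exceed max_density."""
    counts = insertions_per_gene(genes, insertion_positions)
    flagged = []
    for name, (start, end) in genes.items():
        density = counts[name] / (end - start + 1)
        if density <= max_density:
            flagged.append(name)
    return flagged


# Hypothetical toy annotation and insertion coordinates.
genes = {"geneA": (1, 100), "geneB": (101, 200), "geneC": (201, 300)}
insertions = [5, 40, 77, 210, 250, 251]

print(insertions_per_gene(genes, insertions))
print(candidate_essential_genes(genes, insertions))
```

Any gene flagged this way is only a candidate; as noted above, essentiality is relative to the growth conditions under which the library was constructed.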

Multiple methods for transposon insertion sequencing have arisen in the past decade, including TraDIS (Langridge et al., 2009), Tn-Seq (van Opijnen et al., 2009), HITS (Gawronski et al., 2009), and INSeq (Goodman et al., 2009). These methods have been further refined, for example Tn-Seq Circle (Gallagher et al., 2011), RB-TnSeq (Wetmore et al., 2015) and improvements to INSeq (Goodman et al., 2011), to reduce the time and resources required to carry out these studies or improve the effectiveness of the procedure. Other developments, such as TraDISort, enable physical enrichment of desired mutants via cell sorting (Hassan et al., 2016). All of these methods use similar approaches (**Figure 2**): (1) construct a saturated mutant library by transposon mutagenesis of a bacterium with a transposable element containing an antibiotic resistance cassette; (2) subject the library to challenge assays that affect the fitness of the cells containing mutated conditionally essential genes; (3) extract DNA from recovered mutants and enrich for transposon-chromosome junctions; (4) perform massively parallel sequencing of the mutant library input (pre-challenge) and output (post-challenge) pools; (5) carry out bioinformatic analysis to map the location of each transposon insertion site in the genome; and (6) compare the number of reads at each transposon insertion site to determine the conditionally essential genes for the condition of interest.
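The final comparison step (6) can be sketched as a log2 fold change between the output and input pools. This is a simplified illustration and not the scoring scheme of any specific method listed above: reads are assumed to have already been mapped and summed per gene, the pseudocount is an arbitrary choice to avoid division by zero, and the gene names and read counts are hypothetical.

```python
import math


def gene_log2_fold_change(input_counts: dict, output_counts: dict,
                          pseudocount: float = 1.0) -> dict:
    """Log2 ratio of each gene's read frequency after vs. before the challenge."""
    genes = sorted(set(input_counts) | set(output_counts))
    # Normalize by pool size so that sequencing depth does not drive the ratio.
    total_in = sum(input_counts.get(g, 0) + pseudocount for g in genes)
    total_out = sum(output_counts.get(g, 0) + pseudocount for g in genes)
    result = {}
    for g in genes:
        freq_in = (input_counts.get(g, 0) + pseudocount) / total_in
        freq_out = (output_counts.get(g, 0) + pseudocount) / total_out
        result[g] = math.log2(freq_out / freq_in)
    return result


# Hypothetical per-gene read counts for the pre- and post-challenge pools.
input_pool = {"geneA": 1000, "geneB": 1000, "geneC": 1000}
output_pool = {"geneA": 1000, "geneB": 50, "geneC": 1950}

for gene, lfc in gene_log2_fold_change(input_pool, output_pool).items():
    print(gene, round(lfc, 2))
```

In this toy data set, mutants in `geneB` are strongly depleted after the challenge, which is the signature expected of a conditionally essential gene, whereas a positive value such as that for `geneC` indicates that disruption was advantageous under the condition.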

The differences between the transposon insertion sequencing methods lie in the construction of the mutant library and the enrichment for transposon-chromosome junctions before sequencing. Tn-Seq and INSeq mutant libraries are constructed using mariner transposons. The inclusion of MmeI restriction sites at each end of the transposon allows for the chromosomal DNA either side of the transposon to be cut at a fixed length of 20 base pairs during preparation of Tn-Seq and INSeq libraries before sequencing. Adapter ligation follows and PCR and purification steps are carried out. The original INSeq protocol includes a PAGE gel purification step, while Tn-Seq uses agarose gel purification (van Opijnen and Camilli, 2013). Improvements to INSeq purification steps include the use of a biotinylated primer for PCR and magnetic bead purification to decrease the time and cost involved as well as increasing the technique's sensitivity (Goodman et al., 2011). RB-TnSeq uses random barcodes with the transposon inserted into each cell. This allows for the chromosomal location of the transposon insertion to be mapped from the initial sequencing run and each subsequent sequencing run takes less time and funding (Wetmore et al., 2015). In HITS and TraDIS there is no restriction site included in the transposon, instead random shearing of the genomic DNA is employed, resulting in variable length DNA fragments. In TraDIS, adapters are ligated to all DNA fragments and enrichment of fragments of interest (those containing a transposon-chromosome junction) is achieved through PCR. The HITS technique includes these steps as well as an affinity purification step to purify the PCR amplicons of interest containing the transposon (Gawronski et al., 2009). The details of each method and the advantages and disadvantages have been comprehensively reviewed by Barquist et al. (2013) and van Opijnen and Camilli (2013).
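The practical benefit of random barcodes can be illustrated as follows: once each barcode has been linked to an insertion location in the initial sequencing run, later experiments only need to count barcodes. The sketch below assumes that barcodes have already been extracted from the reads and that each barcode maps to a single gene; the lookup table, barcode sequences and function name are hypothetical and much shorter than real barcodes.

```python
from collections import Counter


def count_reads_per_gene(barcode_reads: list, barcode_to_gene: dict) -> Counter:
    """Tally reads per gene using a barcode lookup built in the initial mapping run."""
    counts = Counter()
    for barcode in barcode_reads:
        gene = barcode_to_gene.get(barcode)
        if gene is not None:  # skip barcodes that were never mapped
            counts[gene] += 1
    return counts


# Hypothetical lookup table from the one-off mapping experiment.
barcode_to_gene = {"ACGT": "geneA", "GGCC": "geneA", "TTGA": "geneB"}

# Hypothetical barcodes read from a later challenge experiment.
reads = ["ACGT", "GGCC", "TTGA", "ACGT", "NNNN"]

print(count_reads_per_gene(reads, barcode_to_gene))
```

Because no genome alignment is required in the later runs, this lookup is one way to picture why each subsequent experiment is cheaper than the first.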

A complementary approach to loss-of-function assays is dual-barcoded shotgun expression library sequencing (Dub-Seq), where gain-of-function overexpression libraries are used in competition assays to gather insights into gene function and the fitness of mutants that have increased gene dosage (Mutalik et al., 2019). Combining data from both gain- and loss-of-function approaches can lead to increased precision of biological insights and help to overcome the limitations when either approach is used in isolation (Mutalik et al., 2019).

The advantage of transposon insertion sequencing approaches is that gene functions can be identified without prior knowledge of their likely roles (Hassan et al., 2016). When using traditional methods of determining gene function, some information about a gene is necessary to establish a starting point for characterization; this can be problematic in situations when sequence databases and genome context do not give many (or any) clues about the function of hypothetical proteins. In contrast, transposon insertion sequencing approaches allow for the simultaneous collection of gene functional information for a large number of genes, even when no other information is available. By comparing the number of transposon-insertion reads in the mutant library before and after a selection pressure is applied, the genes involved in coping with that pressure can be identified (diCenzo et al., 2018). Changes in the relative abundance of each mutant show which genes have been positively or negatively affected by the challenge conditions and therefore their possible functions (Paulsen et al., 2017). Combining information from multiple challenge assays produces multi-dimensional data and leads to a greater functional understanding of both annotated and hypothetical proteins. Such information is useful in selecting candidate genes which are indicated to be required for improved fitness under the conditions of interest for further, targeted functional characterization.
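Combining several challenge assays can be pictured as a gene-by-condition table of fitness scores. The following sketch lists, for each gene, the conditions under which its disruption causes a fitness defect. It is illustrative only: the condition names, gene names, scores and the threshold of -1.0 are hypothetical, and the scores are assumed to be log2 fold changes already calculated for each assay.

```python
def conditions_with_defect(fitness: dict, threshold: float = -1.0) -> dict:
    """Map each gene to the set of conditions where its score is at or below threshold.

    fitness maps condition -> {gene: log2 fold change}.
    """
    defects = {}
    for condition, scores in fitness.items():
        for gene, score in scores.items():
            if score <= threshold:
                defects.setdefault(gene, set()).add(condition)
    return defects


# Hypothetical scores from two challenge assays.
fitness = {
    "heat": {"geneA": -2.5, "geneB": 0.1, "hypothetical_1": -1.8},
    "salt": {"geneA": -0.2, "geneB": 0.0, "hypothetical_1": -2.2},
}

print(conditions_with_defect(fitness))
```

Here `geneA` appears to matter only under heat, while the unannotated `hypothetical_1` is needed under both conditions, a pattern that could help prioritize it for targeted characterization.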

Using a transposon insertion sequencing approach allows for the identification of the genetic factors likely to affect bacterial fitness in a complex condition or environment (Barquist et al., 2013). Transposon insertion sequencing techniques can be used to investigate the suite of genes required for in vivo growth in a single experiment, such as assays on plants or plant tissues (Cole et al., 2017). Employing the saturated transposon library in these types of complex, multi-faceted scenarios provides comprehensive information about the genes required to cope with in vivo selection pressures, how genes interact and create a complex phenotype, and can reveal the range of modes of action used by bacteria (van Opijnen and Camilli, 2013; Kohl et al., 2019).

To ensure transposon insertion sequencing is used appropriately there are some considerations that need to be taken into account in the design of these high throughput studies. Transposon insertion sequencing allows a mutant library of bacteria to be studied in conjunction with the existing microbiome of an organism to identify genes related to interacting with the host and the host's microbiome (including beneficial, commensal and pathogenic microbes; Barquist et al., 2013). This technique has been successfully applied in animal models. For example, a Salmonella enterica serovar Typhimurium saturated mutant library was screened using chickens, pigs and cattle to identify the genes required for infection in each host (Chaudhuri et al., 2013). When using a dense mutant library in agricultural studies, experiments need to be designed to avoid bottlenecking the population due to insufficient plant surface area or resources to support bacterial growth (van Opijnen and Camilli, 2013). A related consideration when working with bacterial inoculants as part of an existing microbiome is the challenge of recovering enough mutant cells and extracting enough DNA to be able to conduct the high-throughput sequencing. Failure to recover sufficient library after a challenge could lead to biasing of the samples and insufficient sequence depth for reliable results. When analyzing data from transposon insertion sequencing experiments it is important to consider that fitness effects may not be solely due to the gene interrupted by the transposon insertion. If a transposon inserts into an operon or promoter region there could be downstream effects on gene expression.
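The bottleneck consideration raised above can be made concrete with a back-of-the-envelope calculation. The sketch below assumes, as a simplification, that every mutant is equally abundant and that the cells establishing on the plant are an independent random sample of the library; under those assumptions the expected fraction of distinct mutants that survive the bottleneck is 1 - (1 - 1/M)^N for M mutants and N founding cells. The function name and the library sizes are hypothetical.

```python
def expected_fraction_retained(n_mutants: int, n_founding_cells: int) -> float:
    """Expected share of distinct mutants present after random sampling.

    Assumes equal mutant abundance and independent sampling (a simplification).
    """
    return 1.0 - (1.0 - 1.0 / n_mutants) ** n_founding_cells


library_size = 100_000  # hypothetical number of distinct mutants
for founders in (10_000, 100_000, 1_000_000):
    kept = expected_fraction_retained(library_size, founders)
    print(f"{founders} founding cells: {kept:.3f} of mutants retained")
```

Under these assumptions, a founding population equal in size to the library would be expected to lose roughly a third of the mutants by chance alone, before any selection has acted, which illustrates why the colonizing population needs to be substantially larger than the library.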

There are also some limitations to where and how transposon insertion sequencing approaches can be applied. As with other molecular techniques, this approach is only feasible for genetically tractable microbes which can be cultured in the laboratory (Levy et al., 2018). Additional limitations include the inability to identify genes for which there is functional redundancy in the genome (Mutalik et al., 2019) or those encoding "public goods" (products that are secreted and shared with the community; Royet et al., 2019) as these functional deficiencies are unlikely to reduce the fitness of the individuals affected whilst they are in a mixed population (Cole et al., 2017).

The majority of initial studies using transposon insertion sequencing approaches focused on human pathogens, but these methods are starting to be used to examine other microbial lifestyles. Recent studies of human and animal pathogens include susceptibility and resistance of Escherichia coli to phage infection (Cowley et al., 2018), the mechanism of antibiotic binding in Staphylococcus aureus (Santiago et al., 2018), and resistance of Borrelia burgdorferi to reactive oxygen and nitrogen species (Ramsey et al., 2017). Non-pathogenic microbes are also being explored using transposon insertion sequencing methods, for example, regulation and biosynthesis of holdfast assembly has been investigated in Caulobacter crescentus (Hershey et al., 2019), circadian clock proteins have been identified in cyanobacteria (Welkie et al., 2018) and previously unknown bacterial amino acid biosynthesis genes have been identified in heterotrophic bacteria (Price et al., 2018b).

# TRANSPOSON INSERTION SEQUENCING AND THE FUTURE OF AGRICULTURE

### Current Uses of Transposon Insertion Sequencing for Plant-Associated Microbes

Transposon insertion sequencing techniques have been applied to study a relatively small number of plant-associated bacteria (**Table 1**). One application has been the discovery of genes critical for leaf, root and rhizosphere colonization (Cole et al., 2017; Liu et al., 2018; Helmann et al., 2019a; Sivakumar et al., 2019). An RB-TnSeq study of P. syringae found 31 genes important for colonization of common bean leaf surfaces, including amino acid and polysaccharide biosynthesis genes. Sixty-five genes were important for fitness in apoplastic environments, including those that make up the type III secretion system and syringomycin synthesis genes (Helmann et al., 2019a). Further in planta investigation of single gene mutants confirmed that ΔeftA and ΔPsyr\_0920 mutants have decreased fitness in the apoplast, suggesting that these glycosyltransferase domain-containing genes are important for the evasion of plant surveillance systems (Helmann et al., 2019a). Investigations of Pseudomonas simiae found a large number of amino acid transporter genes are important for survival in the rhizosphere, suggesting that utilizing the amino acids exuded from plant roots confers a selective advantage over microbes that devote resources to the synthesis of these compounds (Cole et al., 2017). Inoculation of wild type and immunocompromised Arabidopsis plants with Pseudomonas sp. WC365 identified 231 genes involved in rhizosphere competence (Liu et al., 2018). Follow-up experiments with targeted single gene knockout mutants confirmed the role of the ΔspuC and ΔmorA mutants in inducing pattern-triggered immunity of the Arabidopsis host and inhibiting plant growth. Point mutations in the conserved GGDEF motif of MorA showed that this protein acts as a phosphodiesterase which inhibits biofilm formation (Liu et al., 2018).

### TABLE 1 | Summary of transposon insertion sequencing studies of plant growth promoting and plant pathogenic bacteria.

Another application of transposon insertion sequencing examined interactions between plants and bacteria using legumes and their nitrogen-fixing symbionts, Rhizobium leguminosarum and Sinorhizobium meliloti. These studies identified genes important for surviving the environmental conditions inside root nodules (Wheatley et al., 2017) and genes that confer resistance to plant antimicrobial peptides (Arnold et al., 2017). Investigations of a previously unknown, but highly conserved, gene smc03872 showed that it protects S. meliloti against the antimicrobial activities of the NCR247 peptide of the host plant Medicago truncatula. Site specific mutations in this gene revealed that SMc03872 is a lipoprotein with peptidase activity that can provide its protective effect as long as it is anchored in a membrane (Arnold et al., 2017).

Transposon insertion sequencing in pathogenic bacteria can be used to rapidly identify virulence factors, essential genes and modes of action of bacterial pathogens (**Table 1**). An in planta Tn-Seq study of S. enterica showed that the colonization of tomatoes uses a distinct set of plant-associated genes which only partly overlap with the genes used for virulence in animals and by other phytopathogens during plant infection (de Moraes et al., 2017). Determining which genes are important for plant host colonization by this bacterium is important as populations of plant-associated bacteria can act as reservoirs for infection of animal hosts, including humans. Another pathogenic lifestyle is the necrotrophic growth of rot-causing pathogens such as Dickeya dadantii. A Tn-Seq study of this pathogen showed that the biosynthetic pathways for uridine monophosphate, purines, and the amino acids leucine, cysteine and lysine are essential for survival on chicory plants (Royet et al., 2019). The study also determined that the RsmC and GcpA regulators are important in the infection process and that glycosylation of flagellin confers fitness during plant infection. Transposon insertion sequencing studies can also identify potential antimicrobial targets, such as genes that are essential for growth of pathogenic bacteria but are not essential in other bacterial strains.

Transposon insertion sequencing studies have also provided insights into the fundamental physiology of plant pathogenic and plant growth promoting bacteria (**Table 1**). As part of a larger study, Price et al. (2018a) created saturated mutant libraries of eight plant-associated bacteria and conducted fitness testing using multiple carbon and nitrogen sources as well as up to 55 different stress conditions. This study recorded phenotypes for thousands of genes and proposed specific functions for transporter proteins, catabolic enzymes, and domains of unknown function. Transporter substrates (Helmann et al., 2019b) and abiotic and biotic stress tolerance genes (Calero et al., 2017) can also be investigated using these techniques. A Tn-Seq study in P. putida KT2440 revealed that genes related to membrane stability are important for p-coumaric acid tolerance (Calero et al., 2017). Follow up studies with single gene knockout mutations showed strong involvement of the cytochrome c maturation system (ccm operon) and the ttg2 operon (encodes an ABC transporter) in this tolerance (Calero et al., 2017). Applying these techniques to a broad range of agriculturally relevant bacteria has the potential to inform efforts to cultivate beneficial plant-microbe interactions and reduce the burden associated with agricultural pathogens.

### Future Avenues to Explore With Transposon Insertion Sequencing

There are a wide variety of ways that the use of transposon insertion sequencing techniques in bacteria could be expanded beyond these current applications to rapidly increase our understanding of bacterial functions relevant for plant growth promotion and agriculture. Bacterial growth in the rhizosphere is influenced by the soil type and structure, soil organic matter, macronutrient levels and moisture levels (Shelake et al., 2019). These physico-chemical properties of the soil can vary at the centimeter scale so a bacterial inoculum applied to a crop may encounter a wide range of these conditions (Fierer, 2017). Bacteria residing in the phyllosphere (above ground plant surfaces) have to be able to cope with widely fluctuating environmental conditions, including light levels, water availability, temperature and UV radiation (Shelake et al., 2019). Transposon insertion sequencing techniques could be applied to plant-associated bacteria to determine which genes are involved in fundamental processes required for living in these environments and what genetic factors make some bacteria highly successful at colonizing these environments.

Bacterial chemotaxis and motility-related genes are essential for detecting signals produced by plants and moving toward chemical attractants or away from deterrents (Scharf et al., 2016). Flagella play a critical role in attachment to plant surfaces and the initial formation of biofilms; both essential processes for biocontrol bacteria to protect plants against pathogens (Nian et al., 2007; Rudrappa et al., 2008). In an agricultural context, bacterial cells with enhanced motility and the ability to form biofilms are better competitors for resources and have greater colonization efficiency, leading to greater enhancement of plant growth (Barahona et al., 2010; Backer et al., 2018; Pinski et al., 2019). Transposon insertion sequencing has successfully been used to identify 14 non-flagellar genes and intergenic regions involved in enhanced motility of E. coli EC958, showing that genes outside the flagellar and chemotaxis regulons are important for motility (Kakkanat et al., 2017). This approach could be applied to identify genes important for motility in agriculturally relevant bacteria.

Transposon insertion sequencing can even provide new insights into processes that are extremely well studied. For example, despite decades of sporulation research, a recent Tn-Seq study of Bacillus subtilis led to the identification of 24 additional genes involved in sporulation and mutants in which sporulation is delayed or accelerated (Meeske et al., 2016). The genes rhizobial bacteria use to adapt to rhizospheric and root endophytic conditions have been extensively studied, but recent transposon insertion sequencing studies have revealed new insights into the genes important for utilizing carbon compounds exuded by roots and surviving in low oxygen environments inside root nodules (Wheatley et al., 2017).

Most genetic studies of plant-associated bacteria are a snapshot at a particular moment in time and cannot shed light on changing interactions, such as across plant life stages. Executing a succession of transposon insertion sequencing experiments could give insights into the different stages of a plant growth promoting bacterium's interaction with its host (diCenzo et al., 2019). An in vivo transposon insertion sequencing experiment of host infection over a 2-week time series was conducted using Edwardsiella piscicida, a fish pathogen. This study examined the fitness of mutants over the course of host infection and identified genes that contribute to colonization of the fish host (Yang et al., 2017). When compared with traditional endpoint transposon insertion sequencing studies, the time series experiment identified more genes affecting in vivo fitness and yielded new insights into possible targets for vaccine development (Yang et al., 2017). The selective pressures bacteria experience during plant colonization or infection are not constant over time, so time-series transposon insertion sequencing studies of plant-associated bacteria could provide insights into the relative importance of particular genes during these dynamic processes.
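The idea of following mutant fitness over a time series can be illustrated with a minimal sketch. It assumes that read counts per gene have already been tallied for the inoculum and for each later sampling point, and it uses a simple log2 fold change in relative abundance as a stand-in for fitness. The function names, the pseudocount and the example counts are hypothetical and are not taken from any of the cited studies; published analyses typically add replicate handling, normalization and statistical testing.

```python
import math


def log2_fold_change(count_before, count_after, total_before, total_after,
                     pseudocount=1.0):
    """Log2 change in a mutant's relative abundance between two samples.

    The pseudocount (an assumption of this sketch) avoids division by zero
    when a mutant disappears from a sample.
    """
    freq_before = (count_before + pseudocount) / total_before
    freq_after = (count_after + pseudocount) / total_after
    return math.log2(freq_after / freq_before)


def fitness_time_series(counts_by_time):
    """Return gene -> list of log2 fold changes relative to the inoculum.

    counts_by_time is a list of dicts (gene -> read count); the first entry
    is the inoculum and each later entry is one sampling point.
    """
    baseline = counts_by_time[0]
    baseline_total = sum(baseline.values())
    series = {gene: [] for gene in baseline}
    for sample in counts_by_time[1:]:
        sample_total = sum(sample.values())
        for gene, start_count in baseline.items():
            series[gene].append(
                log2_fold_change(start_count, sample.get(gene, 0),
                                 baseline_total, sample_total)
            )
    return series
```

In such a table, a gene whose values fall only at late sampling points would be a candidate for a role in later stages of colonization, which is the kind of signal an endpoint-only design can miss.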

When living on plant surfaces plant growth promoting bacteria need to be able to cope with biotic and abiotic stresses, such as plant toxins, anthropogenic pollutants, salinity, acidic or alkaline conditions and temperature. Some of these stresses are commonly experienced in combination, for example, drought and heat stress are often experienced together due to the changing climate. Stress physiologically affects both plants and microbes and can have flow-on effects for plant-microbe interactions (Naik et al., 2019). Transposon insertion sequencing has been successfully used in non-plant-associated organisms to study bacterial responses to stress conditions. A Tn-seq study using high osmolarity, reactive oxygen species and temperature successfully identified individual E. coli genes that confer stress resistance as well as gene combinations that work synergistically to impart improved stress resistance (Lennen and Herrgard, 2014). Transposon insertion sequencing analysis of the pathogen B. burgdorferi uncovered 66 genes not previously known to be involved in resistance to reactive oxygen and nitrogen species (Ramsey et al., 2017). Conducting in vitro or in planta transposon insertion sequencing studies that incorporate rhizospheric or phyllospheric stresses could rapidly advance our understanding of the genes employed by plant growth promoting bacteria in stressful conditions.

To extend our knowledge of how genes interact to create a phenotype of interest, transposon mutant libraries could be created using plant-associated bacteria with a genetic background where a specific gene of interest is inactivated. This would allow interactions with the inactivated gene, redundant genes and the components of regulatory networks to be identified (Barquist et al., 2013; van Opijnen and Camilli, 2013). To determine the interaction profiles of five genes of interest in the human pathogen Streptococcus pneumoniae, independent transposon mutant libraries were constructed using cells with a background knockout of one of the five genes (van Opijnen et al., 2009). This study revealed interactions that exacerbated or improved fitness defects compared to the single mutants and showed that one of the genes of interest is a master regulator of complex carbohydrate metabolism. Using this technique in plant-associated bacteria could reveal previously unknown gene interactions and shed light on gene networks crucial for growth on plant surfaces.
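One common way to score such interactions is to compare the observed fitness of a double mutant with the fitness expected if the two disruptions acted independently. The sketch below assumes a multiplicative null model and an arbitrary cutoff; both are illustrative assumptions, as are the function names, and the text above does not specify the scoring scheme used in the cited study.

```python
def interaction_score(w_knockout, w_insertion, w_double):
    """Deviation of double-mutant fitness from a multiplicative expectation.

    w_knockout:  relative fitness of the background knockout alone
    w_insertion: relative fitness of the transposon insertion alone
    w_double:    relative fitness of the insertion in the knockout background
    """
    expected = w_knockout * w_insertion
    return w_double - expected


def classify_interaction(score, threshold=0.1):
    """Label an interaction; the threshold is a hypothetical cutoff."""
    if score <= -threshold:
        return "aggravating"  # double mutant is less fit than expected
    if score >= threshold:
        return "alleviating"  # double mutant is fitter than expected
    return "no interaction"
```

For example, with single-mutant fitness values of 0.8 and 0.5 the expected double-mutant fitness is 0.4, so an observed value of 0.1 would be labeled aggravating and an observed value of 0.7 alleviating.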

# POTENTIAL AGRICULTURAL APPLICATIONS

Using transposon insertion sequencing allows us to move beyond traditional methods of molecular genetic analyses where bacterial phenotypes and single mutations are linked and perform en masse identification of genes and pathways involved in plant growth promotion (diCenzo et al., 2019). These high-throughput techniques can rapidly accelerate the rate of gene functional determination and provide rapid insights into the genes and pathways that underlie biotic interactions, metabolism and survival of agriculturally relevant bacteria. This will dramatically increase our knowledge of mechanisms that beneficial bacteria use to promote plant growth and the genes and pathways that plant pathogens rely on when causing disease.

Understanding the growth requirements of plant growth promoting bacteria could assist with developing formulations for maximum efficiency when applied to crops (Kohl et al., 2019). For example, if a beneficial bacterium is known to be able to metabolize a particular compound, the bacterial inoculant could be formulated to include that compound to create optimal growth conditions for the beneficial bacterium (Syed Ab Rahman et al., 2018; Rocha et al., 2019). This would increase the likelihood of it being able to effectively compete with the native microbiota, colonize the plant surface and persist over time (Naik et al., 2019). A similar effect could be achieved if plant breeding selected crop genotypes that have the ability to support beneficial bacteria, for example by exuding a specific chemical compound from the roots (Syed Ab Rahman et al., 2018; Compant et al., 2019).

By conducting a series of high-throughput transposon insertion sequencing experiments a large amount of information could be leveraged in the design of a bacterial inoculum that is able to promote plant growth across diverse crop species, a broad range of soils, climatic conditions, and abiotic and biotic stresses (Compant et al., 2019). The development of inoculums consisting of multiple bacterial species that use a range of modes of action for biocontrol would reduce the risk of pathogens being able to evolve resistance to a single biocontrol strain (Backer et al., 2018). By incorporating bacteria with a suite of activities a bacterial consortium could be effective across more stages of the plant lifecycle, resulting in increased plant growth. Studies of bacterial consortia have shown that diverse bacterial inoculants show greater survival, reduced levels of plant disease and increased plant biomass (Hu et al., 2016, 2017). Harnessing rapidly generated information about bacterial gene function from transposon insertion sequencing studies has great potential for creating optimized bacterial consortia for crop inoculation across a range of conditions.

Alternative strategies to leverage the plant growth promoting effects of beneficial bacteria are using secreted bacterial metabolites or selecting desirable plant traits. The compounds that beneficial bacteria secrete can act as biostimulants and agricultural soils could be supplemented with these microbial metabolites to trigger plant responses and stimulate plant growth (Backer et al., 2018). The insights from transposon insertion sequencing could uncover ways to boost production of these bioactive secondary metabolites by plant-associated bacteria (Shelake et al., 2019). The findings from transposon insertion sequencing studies could also assist in the identification of desirable plant traits that support the attraction of beneficial bacteria or promote resistance to pathogen colonization. These traits could be selected in plant breeding programs (Syed Ab Rahman et al., 2018; Compant et al., 2019).

If there is a shift in public opinion about genetic engineering, then information gleaned from transposon insertion sequencing studies could be used to direct alterations of plant-associated microbes for the development of crop protection and crop enhancement products. Utilizing information from transposon insertion sequencing to identify targets for gene editing could lead to faster design and development of control methods for crop diseases (Miesel et al., 2003; Royet et al., 2019). Once high throughput analysis rapidly identifies a target gene or pathway, a pathogen could be genetically engineered to create an attenuated or avirulent isolate for use in biological control (Muñoz et al., 2019). These engineered strains can still colonize plants, compete with their pathogenic kin for space and nutrients, and may even induce plant defenses, but they themselves are no longer infectious (Couteaudier, 1992). For example, in glasshouse and field trials hrpG mutants of the tomato bacterial spot pathogen Xanthomonas campestris pv. vesicatoria 75-3 provided up to 76% reduction in disease severity (Moss et al., 2007). Use of genetic engineering in this way could dramatically shorten the timeframe for the development of crop protection products.

Traditional methods of bacterial gene editing and the alteration of protein production, such as gene knockouts and heterologous expression, are now complemented by new gene editing techniques, such as CRISPR-Cas9. Researchers repurposed this bacterial and archaeal cell defense mechanism as a generic gene editing system when they realized that it could be used to recognize and cut specific DNA sequences (Barrangou and Doudna, 2016). As a gene editing mechanism CRISPR-Cas9 is faster and easier to carry out and less expensive than traditional methods, such as homologous recombination (Shelake et al., 2019). Further into the future, synthetic biology and synthetic genomics hold the promise of designing and building bespoke microorganisms capable of specific plant growth promoting functions. Transposon insertion sequencing will play an important role in identifying the genes and pathways that underpin plant growth promotion and will facilitate the design of such synthetic microbes. The performance of these synthetic beneficial microorganisms can be iteratively improved through the classic synthetic biology "design, build, test, learn" cycle (McArthur et al., 2015; Pouvreau et al., 2018).

# CONCLUSION

We are currently in the midst of a paradigm shift about the role microbes play in our lives. We are transitioning from a simplistic view that microbes are largely disease-causing agents to a more sophisticated understanding of the ubiquitous nature of commensal microbes and the benefits microbes can provide for their hosts. Increasingly society is looking to microbes as a solution for complex problems. Just as research into the human microbiome is revolutionizing medicine and personalized medicine is on the horizon, the future of agriculture may come from rapid advances in our understanding of plant-associated bacterial gene functions and creating tailor-made microbial solutions for particular environments, diseases and cropping regimes.

The end goal of these studies is not to have a one-size-fits-all plant growth promoting solution. Conducting transposon insertion studies on plant-associated bacteria will rapidly provide vast amounts of functional information about substantial numbers of agriculturally relevant bacterial genes. As the pressure to use agricultural land most effectively with the least chemical inputs continues to increase, this knowledge will inform agricultural practice so we can increase crop yields and allow us to tackle the challenges of overpopulation and the changing climate.

# AUTHOR CONTRIBUTIONS

BF wrote the manuscript. ST and IP conceptualized the idea for this work and critically revised the manuscript. BF, ST, and IP approved the final version of the manuscript.

### FUNDING

This work was supported by an Australian Research Council Discovery Grant (#DP160103746). ST is the recipient of an Australian Research Council Discovery Early Career Researcher Award (#DE150100009) and BF is the recipient of an Australian Government Research Training Pathway Scholarship.




**Conflict of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2020 Fabian, Tetu and Paulsen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Tweaking Photosynthesis: FNR-TROL Interaction as Potential Target for Crop Fortification

Hrvoje Fulgosi\* and Lea Vojta

Laboratory for Molecular Plant Biology and Biotechnology, Division of Molecular Biology, Institute Ruđer Bošković, Zagreb, Croatia

Keywords: photosynthesis, linear electron transfer, ROS, electron partitioning, stress-tolerance

Photosynthesis not only supplies plants with the energy needed for growth and nutrient storage, but also links light-to-chemical energy conversion with the redox regulatory networks of the entire cell. By modulating photosynthetic electron flow, plants can adapt to constantly changing light and environmental conditions. Different mechanisms direct electrons formed during the light reactions to either energy-conserving or energy-dissipating pathways. The recently described dynamic interactions of the flavoenzyme ferredoxin:NADP<sup>+</sup> oxidoreductase (FNR) with the protein TROL represent an elegant molecular switch that can partition electrons downstream of photosystem I. FNR-TROL bifurcation can direct energy transfer either to linear flow, which results in NADPH production, or to a rapid electron sink that efficiently prevents reactive oxygen species propagation. Plant genome editing of TROL represents an unexploited avenue for the improvement of plant stress and defense responses, productivity, and eventually agricultural yield.

### Edited by:

Briardo Llorente, Macquarie University, Australia

### Reviewed by:

Iftach Yacoby, Tel Aviv University, Israel Guy Hanke, Queen Mary University of London, United Kingdom Eduardo A. Ceccarelli, CONICET Instituto De Biología Molecular Y Celular De Rosario (IBR), Argentina

\*Correspondence:

Hrvoje Fulgosi fulgosi@irb.hr

### Specialty section:

This article was submitted to Crop and Product Physiology, a section of the journal Frontiers in Plant Science

Received: 20 October 2019 Accepted: 04 March 2020 Published: 24 March 2020

### Citation:

Fulgosi H and Vojta L (2020) Tweaking Photosynthesis: FNR-TROL Interaction as Potential Target for Crop Fortification. Front. Plant Sci. 11:318. doi: 10.3389/fpls.2020.00318

The best-known and most established function of the plant flavoenzyme ferredoxin:NADP(H) oxidoreductase (FNR) in photosynthetic energy conversion is the synthesis of NADPH (Forti and Bracale, 1984; Carillo and Ceccarelli, 2003; Shin, 2004; Mulo, 2011). This enzymatic reaction utilizes two molecules of reduced ferredoxin (Fd) to produce one molecule of NADPH (Arakaki et al., 1997; Medina and Gómez-Moreno, 2004). This conversion is known as the last step in the linear electron transfer (LET) chain (Allen, 2003; Rochaix, 2011). The NADPH generated is utilized in numerous downstream biochemical reactions, both in the chloroplast stroma and in the rest of the cell (Rochaix, 2011; Scheibe and Dietz, 2012). Electron cycling from ferredoxin to NADPH only occurs in the light. In the dark, as well as in non-photosynthetic organisms, FNR primarily works in reverse, utilizing NADPH to provide reduced ferredoxin for various metabolic pathways, such as oxidative stress response, nitrogen fixation, terpenoid biosynthesis, steroid metabolism, and iron–sulfur protein biogenesis (Aliverti et al., 2008). Thus, FNR links the fundamental process of light-to-chemical energy conversion with general plant metabolism (Foyer and Noctor, 2005). The electron donor Fd is a small iron-sulfur protein that acts simultaneously as a hub and as a bottleneck for the distribution of electrons supplied by photosystem I (PSI) (Paul and Foyer, 2001; Hanke et al., 2004; Balmer et al., 2006; Joliot and Johnson, 2011). FNR binding to thylakoid membranes of vascular plants has been proposed to take place via the cytochrome b6/f complex (Clark et al., 1984; Zhang et al., 2001), the PSI-E subunit (Andersen et al., 1992), and in a complex with NADPH dehydrogenase (Quiles and Cuello, 1998). However, in all these studies the exact protein association domain responsible for FNR binding has not been identified.
Most recently, the FNR interaction with photosynthetic membranes has also been demonstrated in organisms other than vascular plants (Mosebach et al., 2017; Pini et al., 2019). A number of essential publications have demonstrated that the distribution of FNR between soluble and thylakoid-bound states can have a profound influence on FAD cofactor assembly (Miyake et al., 1998; Onda and Hase, 2004), Fd-FNR interactions, FNR catalytic properties, oxidative and photo-oxidative stress tolerance, and the regulation of photosynthesis (Palatnik et al., 1997; Rodriguez et al., 2007; Hanke et al., 2008; Peng et al., 2008; Benz et al., 2009, 2010; Jurić et al., 2009; Alte et al., 2010; Yacoby et al., 2011; Twachtmann et al., 2012; Vojta and Fulgosi, 2012; Vojta et al., 2012; Goss and Hanke, 2014; Lintala et al., 2014). Of particular interest, transgenic plants overexpressing chloroplastic FNR acquire increased tolerance to oxidative stress while displaying normal rates of photosynthesis (Rodriguez et al., 2007). Further, FNR knock-down plants, which have stunted growth and restricted photosynthetic activity, are particularly susceptible to photo-oxidative damage (Hajirezaei et al., 2002; Lintala et al., 2012). Perhaps the most intriguing finding was that the release of FNR from thylakoid membranes could be regulated by oxidative stress, in a methyl viologen-dependent fashion (Palatnik et al., 1997). Lastly, in tic62 trol double mutants, no binding of FNR to thylakoid proteins could be detected whatsoever (Lintala et al., 2014).
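The stoichiometry described above, two reduced ferredoxins per NADPH, can be written as a balanced reaction. The form below is the standard textbook scheme rather than an equation taken from the cited papers: ferredoxin is a one-electron carrier, so two Fd molecules are needed to supply the two electrons that, together with one proton, reduce NADP<sup>+</sup>. The reversible arrow reflects that the reaction runs toward NADPH in the light and in reverse in the dark and in non-photosynthetic organisms.

```latex
\[
2\,\mathrm{Fd_{red}} + \mathrm{NADP^{+}} + \mathrm{H^{+}}
\;\rightleftharpoons\;
2\,\mathrm{Fd_{ox}} + \mathrm{NADPH}
\]
```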

A hypothesis describing additional functions and physiological roles of chloroplastic FNR in vascular plants has been proposed (Benz et al., 2010; Goss and Hanke, 2014). This concept rests on a series of publications mostly describing binding and release of FNR from the soluble chloroplastic protein Tic62 (Benz et al., 2009; Lintala et al., 2014). Tic62 (62 kDa component of the translocon at the inner envelope of chloroplasts) was initially discovered as a component of the chloroplast inner envelope protein translocation machinery (Küchler et al., 2002), but has subsequently been associated with at least two other compartments, the chloroplast stroma and the chloroplast photosynthetic membranes (Balsera et al., 2007). The essential protein region recognized in Tic62 as being responsible for the binding of FNR is present in multiple repeats in many, but not all, Tic62 family members studied (Balsera et al., 2007). The same motif was further identified at the C-terminus of the thylakoid rhodanese-like protein TROL (Jurić et al., 2009). This domain, dubbed the membrane recruitment motif (MRM), binds FNR with high efficiency. In Tic62, the interaction was postulated to be light- and pH-dependent (Benz et al., 2009). In 2010, Benz et al. proposed that the majority of thylakoid-localized FNR is bound to the membrane via two interaction partners, Tic62 and TROL (Benz et al., 2009). They further postulated that the soluble form of FNR is sufficient to sustain photosynthetic energy conversion, while the thylakoid pool largely performs other functions. Finally, they suggested that light and redox states regulate the distribution of FNR. In this view, FNR is stored at the membranes in the dark and during morning hours, thus allowing advantageous flow of electrons to various Fd-dependent pathways (Benz et al., 2009; Vojta et al., 2012). Such a scenario could assure proper metabolic adjustments depending on environmental cues.

Soluble Tic62, however, does not contain the MRM in all plant species (Balsera et al., 2007), and its attachment points at the thylakoids have yet to be determined. TROL, by contrast, is a bona fide integral membrane protein with dual localization (Jurić et al., 2009; Vojta et al., 2018). It is present in the mature form in thylakoid membranes, and in the precursor form at the inner chloroplast envelope. TROL is located near PSI, mostly in non-appressed thylakoid regions (Jurić et al., 2009). All known vascular plant TROL sequences contain the FNR MRM (Jurić et al., 2009). Additionally, a single MRM of TROL binds FNR with several-fold higher affinity than the Tic62 MRM (Jurić et al., 2009). In Arabidopsis, TROL associated with FNR can be isolated in a dynamic supramolecular complex of ∼190 kDa (Jurić et al., 2009).

The unique hallmark of TROL is the lumen-located rhodanese-like domain (RHO), which is most likely inactive in the sense of sulfur detoxification, as it contains an aspartate residue instead of the conserved cysteine in the putative active site (Jurić et al., 2009). RHO domains are intriguing due to their ancient origin and structural identity with the catalytic domains of cell-division-cycle (CDC25) dual-specificity phosphatases. Inactive RHO domains are implicated in redox-sensing and were shown to interact with quinolinediones (Brisson et al., 2005). Plant sulfur-transferases and rhodaneses have recently been reviewed in two comprehensive papers (Moseler et al., 2019; Selles et al., 2019).

Recently, an alternative mechanism of FNR binding to and release from TROL, involving redox sensing by the RHO domain, has been put forward (Vojta and Fulgosi, 2012, 2019). According to this model, certain (redox) signal(s) of lumenal origin can be sensed by the RHO and further transduced across the thylakoid membrane, resulting in differential FNR binding on the stromal side (Vojta and Fulgosi, 2012). A role for the proline-rich region of TROL, dubbed PEPE, which precedes the MRM, has also been postulated (Jurić et al., 2009). Due to a repeating sequence of turn-inducing proline residues, PEPE might serve as a molecular swivel, allowing free movement of bound FNR, or even its alternative associations with various supramolecular complexes or membrane domains. Such dynamic FNR recruitment might be responsible for alternative partitioning of photosynthetic electrons and/or prioritization of Fd-dependent pathways. Further, it has been demonstrated that chloroplasts of Arabidopsis trol mutants produce substantially fewer superoxide anion radicals than the WT (Vojta et al., 2015). This reduction can be recorded in trol chloroplasts pre-acclimated to dark and growth-light conditions. Even more remarkably, trol chloroplasts produce almost 40 percent fewer superoxide anion radicals even in the presence of ROS-generating methyl viologen (the herbicide paraquat) (Vojta et al., 2015). These findings are in line with previously published observations and suggest either that FNR permanently detached from TROL can efficiently scavenge superoxide anion, or that electrons are very rapidly partitioned into certain other pathway(s), different from the LET. In fact, results suggest that LET is preferential only when the TROL-FNR association is established (Jurić et al., 2009; Vojta et al., 2015). Alternatively, in the absence of TROL, electrons rapidly flow to other sinks downstream of the PSI donor site (Vojta et al., 2015).
Finally, it has been demonstrated that in the absence of TROL, light- and/or pH-dependent dynamic recruitment of FNR to thylakoids is entirely abolished (Vojta and Fulgosi, 2016). Apparently, the TROL-FNR interaction might be the most prominent, if not exclusive, dynamic-type interaction of FNR with photosynthetic membranes of vascular plants.

The TROL-FNR interaction could be an entry point to crop improvement via various modifications to the interaction itself, or to upstream and/or downstream reactions and pathways. For example, TROL itself could be down- or up-regulated, or its FNR-binding or regulatory domains could be modified by genome engineering. FNR could also be a target of engineering, for example by exchange of FNR enzymes from C4 to C3 plants or vice versa. C4 FNR interacts with the TROL ITEP domain with many-fold higher affinity than the C3 FNR (Rac and Fulgosi, 2019). An increase in TROL levels by itself would not necessarily imply an increase in photosynthetic efficiency, since the improvement would depend on the availability of reduced ferredoxin and the ability of the system to provide the necessary reduction equivalents. However, alterations in the production of reduction equivalents could affect the redox homeostasis of chloroplasts and ultimately produce aberrations instead of plant benefit.

To conclude, we reiterate the concept of photosynthetic membrane recruitment of FNR and emphasize the importance of the TROL-FNR dynamic interaction. We posit that the TROL-FNR interaction is an important and overlooked mechanism for regulation and prioritization of energy-conserving and energy-dissipating pathways in vascular plant photosynthesis. We propose that the TROL interaction with FNR is a useful target for genome editing of agriculturally important species, potentially providing more stress-tolerant crops.

# AUTHOR CONTRIBUTIONS

HF and LV wrote the manuscript.

# FUNDING

This work has been funded by the Croatian Science Foundation Grant IP-2014-09-1173 to HF.




**Conflict of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2020 Fulgosi and Vojta. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Cynara cardunculus L. as a Multipurpose Crop for Plant Secondary Metabolites Production in Marginal Stressed Lands

Helena Domenica Pappalardo, Valeria Toscano, Giuseppe Diego Puglia, Claudia Genovese and Salvatore Antonino Raccuia\*

Consiglio Nazionale delle Ricerche-Istituto per i Sistemi Agricoli e Forestali del Mediterraneo, Catania, Italy

### Edited by:

Edward Rybicki, University of Cape Town, South Africa

### Reviewed by:

Klára Kosová, Crop Research Institute (CRI), Czechia Luisa C. Carvalho, University of Lisbon, Portugal

### \*Correspondence:

Salvatore Antonino Raccuia salvatoreantonino.raccuia@cnr.it; salvatore.raccuia@cnr.it

### Specialty section:

This article was submitted to Crop and Product Physiology, a section of the journal Frontiers in Plant Science

Received: 20 October 2019 Accepted: 17 February 2020 Published: 31 March 2020

### Citation:

Pappalardo HD, Toscano V, Puglia GD, Genovese C and Raccuia SA (2020) Cynara cardunculus L. as a Multipurpose Crop for Plant Secondary Metabolites Production in Marginal Stressed Lands. Front. Plant Sci. 11:240. doi: 10.3389/fpls.2020.00240

Cardoon (Cynara cardunculus L.) is a Mediterranean crop and member of the Asteraceae family, characterized by high production of biomass and secondary metabolites and by good adaptation to climate change, and usable in the green chemistry, nutraceutical, and pharmaceutical sectors. Recent studies demonstrated the ability of cardoon to grow in a stressful environment, which is associated with enhanced biosynthesis of biologically active compounds in these plants, and this effect is increased by the abiotic stresses (salt, heat, pollution, and drought stress) that characterize many of the world's marginal areas affected by climate change. The plant response to these stresses consists of implementing different processes that modify some plant biological functions, such as alleviating both cellular hyperosmolarity and ion disequilibrium or synthesizing antioxidant molecules. The aim of this work was to investigate different cardoon response mechanisms to abiotic stresses and to evaluate their influence on the biosynthesis of biologically active compounds. To this end, we analyzed the ability of cardoon seeds to germinate under different salt stress conditions, and on the sprouts obtained we measured the total phenol content and the antioxidant activity. Moreover, the growth of cardoon seedlings under heavy metal stress conditions was monitored, and the expression levels of heavy metal transport–associated genes were analyzed. The results showed the ability of cardoon plants to tolerate abiotic stress, thanks to different defense mechanisms, and the possibility of obtaining biomass with a high content of biologically active molecules by exploiting the natural tolerance of this species to abiotic stresses. Moreover, we identified some important genes encoding metal transporters that may be involved in arsenic and cadmium uptake and translocation in C. cardunculus. Thus, this species can be considered a promising crop for green chemistry and energy in marginal lands.

Keywords: heavy metals, salt, sprout, gene expression, antioxidant activity

# INTRODUCTION

The climate, over the centuries, has always changed because of natural processes, but in the last 100 years, these changes have been much more severe and much faster than the changes that occurred in the past.

Climate change has increased the prevalence of unfavorable or stressful environments. Abiotic stresses, such as drought, heat, cold, salt, or heavy metals like arsenic (As) and cadmium (Cd) in the soil, are exacerbated by climate change (Fedoroff et al., 2010). Drought and salinity are prime environmental stresses that influence the geographical distribution of plants in nature, limit their agricultural productivity, and threaten food security (Zhu, 2016). More than 6% of the world's total land and approximately 20% of irrigated land are affected by salt stress (Munns and Tester, 2008).

Human activities, industrialization, and modern agricultural practices are mainly responsible for the increase in environmental contamination by heavy metals (Kavamura and Esposito, 2010; Miransari, 2011; Singh et al., 2016). The use of pesticides, fertilizers, and municipal and compost wastes, together with heavy metal release from smelting industries and metalliferous mines, has contributed decisively to the contamination of large areas of land with heavy metals (Yang et al., 2005; Singh et al., 2016). The presence of either salt or heavy metals in soil affects the biological cycle of plants. Although salt stress influences all growth stages of a plant, the seed germination and seedling growth stages are known to be more sensitive for most plant species (Begum et al., 1992; Cuartero et al., 2006), and germination has been reported to decline with increasing salinity levels (Houle et al., 2001). The presence of heavy metals during the seedling growth and plant establishment stages, in turn, causes morphological abnormalities leading to yield reduction (Amari et al., 2017). Both these stresses give rise to the production of reactive oxygen species (ROS), such as O2<sup>•−</sup>, H2O2, and OH<sup>•</sup> (Mittler, 2002), which damage membranes and macromolecules. Plants have developed several strategies in response to these stresses. One of these is the accumulation of compatible solutes in their organs in response to osmotic stress; the primary function of these solutes is to maintain cell turgor and thus the driving gradient for water uptake (Gupta and Goyal, 2017). Another strategy is the production of antioxidant compounds (ROS scavengers), such as polyphenols, which improve the antioxidant defense and can thus increase tolerance to different stress factors (Cushman and Bohnert, 2000).

The most commonly found heavy metals at contaminated sites are As and Cd (Mulligan et al., 2001). These elements are non-essential, with no known biological function in plants, so plants have no specific transporter system for them; instead, they can enter through the transporters used for essential nutrient uptake (Verbruggen et al., 2009; Mendoza-Cózatl et al., 2011). Different gene families have been proposed to be involved in the uptake of As and Cd in plants (Fan et al., 2018). The ZRT, IRT-like Protein (ZIP) family has mainly been associated with metal transport from the soil (Guerinot, 2000). Cloning of ZIP5 and ZIP6 from Thlaspi caerulescens in Arabidopsis thaliana indicated that both genes act in metal homeostasis (Wu et al., 2009). Natural Resistance-Associated Macrophage Proteins (NRAMPs) are metal transporters; in A. thaliana, a non-hyperaccumulator plant, NRAMPs function as high-affinity Fe and Mn transporters (Thomine et al., 2003; Lanquar et al., 2010) and also retain the ability to transport heavy metals (Ni and Cd) (Thomine et al., 2003; Oomen et al., 2009). In rice, OsNRAMP1 expression is induced during As stress together with other stress-responsive genes, transporters, heat shock proteins, metallothioneins, and sulfate-metabolizing proteins (Tiwari et al., 2014). NRAMP3 and NRAMP4 are responsible for Cd<sup>2+</sup> efflux from the vacuole (Verbruggen et al., 2009). The heavy metal ATPases (HMAs) operate in heavy metal transport and play a role in metal homeostasis and tolerance (Gupta et al., 2013). HMA3 has been proposed to be involved in the vacuolar storage of Cd in A. thaliana (Verbruggen et al., 2009). Within the phosphate transporter (PHT) family, overexpression of PHT1 or PHT7 in Arabidopsis causes hypersensitivity to arsenate, due to increased As uptake, whereas As resistance is enhanced through YCF1-mediated vacuolar sequestration (LeBlanc et al., 2013). The large family of ATP-binding cassette (ABC) transporters is involved in the transfer of different substances, including carbohydrates, lipids, xenobiotics, ions, and heavy metals (Kim et al., 2007). AtABCC1 and AtABCC2 are the major vacuolar transporters of peptide-chelated heavy metals (Song et al., 2010), mediating AsIII–PC complex transport to the vacuole in Arabidopsis.

The possibility of using some species for phytoremediation of soils has already been widely studied, but the identified hyperaccumulators are mainly herbaceous plants, which have some limits: metal selectivity, low biomass, shallow root systems, and slow growth rates (Krämer, 2010; Cun et al., 2014; Fan et al., 2018). Therefore, some high-biomass perennial plants have recently been studied as potential candidates for phytoremediation (Fan et al., 2018). Among these crops, Cynara cardunculus L. plays an increasingly important role: previous research (Llugany et al., 2012) demonstrated both the ability of this species to survive in polluted and stressed soils and its nutritional and nutraceutical properties, which help protect the body from oxidative stress.

Cardoon (C. cardunculus L.) is a perennial species of the Asteraceae family with an annual growth cycle. The plant is well adapted to Mediterranean climates characterized by hot and dry summers (Raccuia and Melilli, 2007; Toscano et al., 2016). It comprises three taxa: C. cardunculus L. subsp. scolymus (L.) Hegi = C. cardunculus L. var. scolymus (L.) Hayek (globe artichoke), C. cardunculus L. var. altilis DC. (leafy or domestic cardoon), and C. cardunculus L. var. sylvestris Lam. (wild cardoon), considered to be the wild ancestor of globe artichoke (Rottenberg and Zohary, 1996; Raccuia et al., 2004b). The wild cardoon is a robust thistle well adapted to the Mediterranean semiarid climate. The domestic cardoon is well known for its high biomass and for its use as a raw material in green chemistry. From this plant, it is possible to obtain an innovative range of bioproducts (bioplastics, biolubricants, home and personal care items, food fragrances, plant protection additives, etc.), with a positive impact on the environment and farmers' income (Toscano et al., 2016). Moreover, the cardoon biomass, which contains cellulose, hemicellulose, and lignin, can be used to produce energy (Raccuia and Melilli, 2007; Ierna et al., 2012; Ciancolini et al., 2013; Toscano et al., 2016; Ottaiano et al., 2017; Gominho et al., 2018; Petropoulos et al., 2019; Turco et al., 2019). Recent studies on morphological and physiological characteristics and on the seed germination process showed intraspecific variability among different cardoon populations under salt and moisture stresses (Mauromicale and Ierna, 2004; Raccuia et al., 2004a; Benlloch-González et al., 2005; Pagnotta and Noorani, 2014; Toscano et al., 2016). The resistance developed by this plant to both salt and heavy metal stresses was shown to be associated with greater synthesis and accumulation of secondary metabolites (Mauromicale and Licandro, 2002; Raccuia et al., 2004b; Argento et al., 2016; Leonardi et al., 2016; Pappalardo et al., 2016), consisting of high amounts of polyphenolic compounds and inulin, which confer health-promoting properties on this plant (Raccuia et al., 2004a; Pandino et al., 2011; Genovese et al., 2016a,b).

The aim of this work was to investigate the response mechanisms of cardoon to different abiotic stresses. To this end, we focused on the ability of the seed to germinate under saline stress and on the effect of this stress on the synthesis of antioxidant compounds. Moreover, we monitored seedling growth in the presence of heavy metals (As and Cd) and the expression levels of ion transporter genes associated with their translocation.

### MATERIALS AND METHODS

### Plant Materials

For the different trials of this study, two genotypes were used: a wild cardoon genotype, "A14SR" (C. cardunculus L. var. sylvestris), and one domestic cardoon variety (C. cardunculus L. var. altilis). Both genotypes belong to the cardoon germplasm collection of the Catania section of the National Research Council–Institute for Agricultural and Forest System in the Mediterranean (CNR-ISAFOM of Catania, Italy). During the summer, the seeds were collected by hand from dried flower heads by shaking and lightly pounding them, taking care not to damage the seeds. The seeds of the domestic variety were used to carry out germination tests under salt stress conditions, whereas seeds of both genotypes were used to measure the expression levels of the genes associated with salt tolerance and heavy metal transport.

### Germination Test With NaCl

To verify the effect of salt stress on the biosynthesis and the concentration of secondary metabolites, stress conditions were induced during the germination of domestic cardoon seeds by three different salt concentrations: 0, 60, and 120 mM of NaCl.

For this experiment, 10-cm diameter Petri dishes were used with a filter paper placed at the bottom. Three experimental replicates were prepared, each consisting of eight Petri dishes for every salt concentration (0, 60, and 120 mM). Each dish received 30 seeds and 10 mL of saline solution; then, each plate was sealed with parafilm and placed in the growth chamber with a 12-h/12-h light/dark photoperiod and a 15°C/25°C temperature cycle (**Figure 1**) (Raccuia et al., 2004a). The young sprouts obtained were grown for 20 days.

### Total Phenol Content and Antioxidant Activity

After 20 days of growth, the fresh and dry weight of 1,000 sprouts was recorded, and all the sprouts were stored at −80°C to be used for the TPC (total phenol content) and AA (antioxidant activity) measurements.

Phenolic compounds were extracted from the plant material with an 80% methanol solution at a ratio of 1:10 (wt/vol); the mixture thus obtained was sonicated for 44 min in an ultrasonic tank. Afterward, the sample was centrifuged at 3000 g for 3 min, and the supernatant was filtered through a 0.20-µm filter. The filtered sample was used for the determination of TPC and AA.

The TPC was determined by the Folin–Ciocalteu method, as described by Dewanto et al. (2002). The results were expressed as milligrams of gallic acid equivalent (GAE)/g of sample. The calibration curve of gallic acid ranged from 25 to 200 µg/mL.

FIGURE 1 | Fresh weight (A) and dry weight (B) percentage of altilis sprouts grown at different salt concentrations (0, 60, and 120 mM). The error bars represent the error mean of three biological replicates. Different letters indicate differences at P ≤ 0.05 (ANOVA).

The AA was determined on cardoon sprouts using the 2,2-diphenyl-1-picrylhydrazyl (DPPH) radical according to Brand-Williams et al. (1995). The antioxidant capacity was calculated by using a calibration curve obtained from known concentrations of Trolox in MeOH (10–200 µmol/L). The results were expressed as micromoles of Trolox equivalent (TE)/g of sample. All samples were analyzed in triplicate.
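The calibration-curve quantification described above for TPC (and, analogously, for AA) can be illustrated with a minimal sketch. The absorbance readings below are hypothetical, and the sketch assumes that the extract was assayed undiluted, so that the 1:10 (wt/vol) extraction ratio corresponds to 10 mL of extract per gram of sample; neither assumption is stated in the text.

```python
# Minimal sketch: least-squares calibration line and conversion to mg GAE/g.
# All absorbance values are hypothetical and serve only as an illustration.

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def concentration_from_absorbance(absorbance, slope, intercept):
    """Invert the calibration line to recover the concentration."""
    return (absorbance - intercept) / slope

# Gallic acid standards spanning the 25-200 ug/mL range reported in the text.
standard_conc_ug_ml = [25.0, 50.0, 100.0, 150.0, 200.0]
standard_absorbance = [0.10, 0.20, 0.40, 0.60, 0.80]  # hypothetical readings

slope, intercept = fit_line(standard_conc_ug_ml, standard_absorbance)

sample_absorbance = 0.45  # hypothetical reading of a sprout extract
extract_ug_gae_per_ml = concentration_from_absorbance(
    sample_absorbance, slope, intercept
)

ML_EXTRACT_PER_G_SAMPLE = 10.0  # assumed from the 1:10 (wt/vol) ratio
tpc_mg_gae_per_g = extract_ug_gae_per_ml * ML_EXTRACT_PER_G_SAMPLE / 1000.0
print(f"TPC = {tpc_mg_gae_per_g:.3f} mg GAE/g")
```

The same two functions would apply to the Trolox curve (10–200 µmol/L), with the result converted to µmol TE/g instead.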

### Seedlings Growth Analysis in Heavy Metal Stress Conditions

For this experiment, altilis and sylvestris seeds were sown in one-half MS medium with the addition of 0, 25, or 50 µM of cadmium sulfate hydrate and sodium arsenate dibasic heptahydrate, at 20°C/25°C and a 12-h photoperiod. Twenty-one days after sowing, the seedlings were collected and dissected into shoot and root portions, and the length of each organ was measured. Shoot height was measured from the root collar to the tip of the longest leaf, and root length was measured from the root collar to the apex of the longest root.

### Identification of Cardoon Genes Likely Associated With Plant Stress Response

To identify the cardoon orthologous genes involved in plant stress response, up to 100 mg of tissue was used to extract total RNA using protocols based on Chang et al. (1993). One microgram of total RNA, for each biological replicate, was reverse transcribed using the ImProm-II™ Reverse Transcription System (Promega, Madison, WI, United States) according to the manufacturer's instructions.

To isolate cardoon genes involved in heavy metal uptake, we selected the A. thaliana NRAMP3, ZIP11, ABCC, HMA, and PHT gene sequences and searched for orthologous sequences within the C. cardunculus var. scolymus genome v.1<sup>1</sup> and the cardoon transcriptome (Puglia et al., 2019) using local BLASTX and BLASTN analyses, respectively, with an E-value cutoff of 10<sup>−5</sup> for both algorithms. The same procedure was carried out for the isolation of the cardoon housekeeping genes EF1 alpha and GAPDH, which were used in the reverse transcriptase–quantitative polymerase chain reaction (RT-qPCR) analyses. The cloning primers were designed within the conserved domain regions using Primer3 software (Rozen and Skaletsky, 1999) (**Table 1**). PCR was performed with PerfectTaq DNA polymerase (5 PRIME, Hilden, Germany) according to the manufacturer's instructions, and the products were cloned into the pJET vector (CloneJET PCR Cloning Kit; Thermo Scientific, Waltham, MA, USA). The DNA sequences were deposited in the GenBank database (GenBank accession numbers from MN889990 to MN889996). To confirm the isolation of the partial coding sequence of the genes, the obtained sequences were searched against the nucleotide collection (nt) database with the BLASTN algorithm, and the first match was considered to confirm the correctness of the isolated sequence. Moreover, to identify the protein region domain within the obtained contig sequences, we carried out a BLASTP analysis of the translated sequences against the nr (non-redundant protein sequences) database, either excluding C. cardunculus species (taxid 4265) or limited to C. cardunculus species (taxid 4265).

<sup>1</sup>www.artichokegenome.unito.it

TABLE 2 | Analysis of variance of plant length, separated in shoot and root, measured after 21 days of treatment with As and Cd.

Partition of the treatment sum of squares into main effect and interaction. \*Significant at 0.05 probability level. \*\*Significant at 0.01 probability level. \*\*\*Significant at 0.001 probability level. ns, non-significant.

FIGURE 3 | Root and shoot lengths of sylvestris (A) and altilis (B) harvested after 21 days of treatment under different concentrations of As and Cd. The error bars indicate the standard deviation of three biological replicates.
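The E-value filtering used in the homology searches of this section can be sketched as follows. This is an illustration only: it assumes that the BLAST results were exported in the standard 12-column tabular format (BLAST+ `-outfmt 6`), which the text does not specify, and the query names, contig identifiers, and scores in the example are hypothetical.

```python
import csv
import io

# Column order of the default BLAST+ tabular output (-outfmt 6).
FIELDS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]

def best_hits(tabular_text, max_evalue=1e-5):
    """Keep, for each query, the highest-bitscore hit with E-value <= cutoff."""
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(FIELDS, row))
        evalue = float(hit["evalue"])
        bitscore = float(hit["bitscore"])
        if evalue > max_evalue:
            continue  # weaker than the 1e-5 cutoff used in the text
        current = best.get(hit["qseqid"])
        if current is None or bitscore > current[2]:
            best[hit["qseqid"]] = (hit["sseqid"], evalue, bitscore)
    return best

# Hypothetical hits: the second NRAMP3 hit fails the E-value cutoff.
example = "\n".join([
    "AtNRAMP3\tcontig_001\t78.5\t420\t90\t2\t1\t420\t15\t434\t3e-120\t410",
    "AtNRAMP3\tcontig_087\t41.0\t95\t56\t1\t10\t104\t5\t99\t2e-04\t42.1",
    "AtZIP11\tcontig_022\t69.2\t310\t95\t3\t1\t310\t8\t317\t1e-85\t298",
])

hits = best_hits(example)
for query, (subject, evalue, bitscore) in hits.items():
    print(query, subject, evalue, bitscore)
```

Keeping only the top-scoring hit per query mirrors the "first match" criterion mentioned above, although the exact ranking rule applied by the authors is not stated.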

### Transcriptional Levels of the Genes Likely Associated With Heavy Metal Stress Response

RNA was extracted from shoots and roots of plants grown for 2 and 3 weeks in the presence of different concentrations of Cd or As. Reverse transcription reactions were performed using the ImProm-II™ Reverse Transcription System (Promega) according to the manufacturer's instructions. One microgram of total RNA was used for cDNA synthesis. The RT-qPCR primers for all the heavy metal–associated cardoon genes were designed using the Primer3 website (Rozen and Skaletsky, 1999), setting the annealing temperature at 60°C, and the gene expression levels were normalized using the isolated housekeeping genes. The real-time PCR reactions were performed on a Rotor-Gene 6000 cycler (Qiagen, Hilden, Germany) with the QuantiNova SYBR Green Kit (Qiagen). At least three biological replicates, each with three technical replicates, were analyzed.
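The normalization step above can be illustrated with a short sketch. The text does not state which quantification model was applied, so the widely used 2^(−ΔΔCt) calculation is assumed here, together with 100% amplification efficiency and averaging of the two reference-gene Ct values; all Ct values are hypothetical.

```python
# Minimal sketch of relative quantification by the 2^(-ddCt) calculation.
# Assumptions (not stated in the text): 100% PCR efficiency and the mean Ct
# of the two housekeeping genes (EF1 alpha, GAPDH) as the reference.

def relative_expression(ct_target, ct_refs, ct_target_ctrl, ct_refs_ctrl):
    """Fold change of a target gene in a treated sample versus the control."""
    delta_treated = ct_target - sum(ct_refs) / len(ct_refs)
    delta_control = ct_target_ctrl - sum(ct_refs_ctrl) / len(ct_refs_ctrl)
    return 2.0 ** -(delta_treated - delta_control)

# Hypothetical Ct values: target gene in treated roots and in control roots.
fold_change = relative_expression(
    ct_target=24.0, ct_refs=[18.0, 20.0],
    ct_target_ctrl=26.0, ct_refs_ctrl=[18.0, 20.0],
)
print(f"fold change = {fold_change:.1f}")  # control is 1 by construction
```

With this convention the untreated control has a fold change of 1, and a target that crosses the threshold two cycles earlier than in the control corresponds to a fourfold increase.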

mean of three biological replicates. Letters indicate only significantly different values (P ≤ 0.05) between concentrations in each moment of sampling.

### Data Analysis

All data were subjected to the Bartlett test for homogeneity of variance and then analyzed using analysis of variance (ANOVA) with the CoStat program (CoHort Software, Monterey, CA, United States). Means were separated on the basis of the least significant difference when the F test was significant at least at the 0.05 probability level. A three-way completely randomized ANOVA was used to analyze the factors that most influenced the transcriptional levels of the considered genes, whereas one-way ANOVA was used to highlight the factors' influence on the variable analyzed. To provide a more comprehensive analysis of the effect of the individual metal stress treatments, we carried out a principal component analysis (PCA) using the RT-qPCR expression fold-change variation of the five metal-response–associated genes along with the shoot and root lengths measured under 25 and 50 µM of As and Cd.
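As an illustration of the one-way ANOVA step described in this section, the sketch below computes the F statistic by hand for three treatment groups. It does not reproduce the CoStat workflow; the root lengths are hypothetical, and the critical value quoted in the comment is the standard tabulated F(2, 6) at the 0.05 level.

```python
# Minimal sketch of a one-way ANOVA F statistic (pure Python).
# The data are hypothetical root lengths (cm) at 0, 25, and 50 uM of metal.

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of sample groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within

root_length_cm = [
    [10.0, 11.0, 12.0],  # 0 uM (control)
    [8.0, 9.0, 10.0],    # 25 uM
    [5.0, 6.0, 7.0],     # 50 uM
]

f_value, df_between, df_within = one_way_anova_f(root_length_cm)

# Tabulated critical value of F(2, 6) at the 0.05 probability level.
F_CRITICAL_2_6 = 5.14
significant = f_value > F_CRITICAL_2_6
print(f"F({df_between}, {df_within}) = {f_value:.2f}, significant: {significant}")
```

Only when the F test is significant would the means then be separated, here by the least significant difference as stated above.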

### RESULTS

### NaCl Stress

The domestic cardoon germination percentage under NaCl stress was always greater than 87% at all concentrations considered (data not reported).

Fresh weight of 21-day-old sprouts, averaged across treatments, was 138.1 g. The three concentrations did not differ greatly, although the highest weight (147.5 g), recorded for sprouts not subjected to saline treatment, declined linearly as the salt concentration increased to 60 mM NaCl (138.5 g) and 120 mM (133.0 g) (**Figure 1**). Dry weight of the shoots showed no significant differences among the tests with different salt concentrations (data not shown). These data suggest that the dry matter found in the sprouts derives essentially from that contained in the seed and is therefore not affected by salt treatment. This is consistent with the percentage of dry matter, which was affected by salt treatments: the lowest value was recorded for the control (12.88%), and the highest value was found at the maximum NaCl concentration (15.41%) (**Figure 1**).

Averaged across treatments, the TPC in sprouts was 2.26 mg GAE/g, and we observed an increase of up to 25% in sprouts subjected to the highest saline treatment (120 mM) (**Figure 2**). The AA of sprouts, averaged across treatments, was 24.96 µmol TE/g, varying linearly from 21.92 µmol TE/g in the extracts of control shoots to 26.87 µmol TE/g in the shoots subjected to the highest salt treatment (120 mM) (**Figure 2**).
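The size of the salt effects reported above can be made explicit with a short worked calculation. The input values are those given in the text; the helper name `percent_change` is introduced here only for illustration.

```python
# Worked example: relative changes between the control (0 mM NaCl) and the
# highest salt treatment (120 mM), using the values reported in the text.

def percent_change(control, treated):
    """Change of the treated value relative to the control, in percent."""
    return (treated - control) / control * 100.0

fresh_weight_change = percent_change(147.5, 133.0)  # g, fresh weight
aa_change = percent_change(21.92, 26.87)            # umol TE/g, antioxidant activity
dry_matter_gain_points = 15.41 - 12.88              # percentage points

print(f"fresh weight: {fresh_weight_change:.1f}%")            # about -9.8%
print(f"antioxidant activity: {aa_change:+.1f}%")             # about +22.6%
print(f"dry matter: +{dry_matter_gain_points:.2f} percentage points")
```

Under these figures, a fresh weight loss of roughly 10% is accompanied by an increase in antioxidant activity of roughly 23%.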

### Seedlings Growth Analysis in Heavy Metal Stress Conditions

Length of seedlings grown under As and Cd stress conditions (0, 25, and 50 µM) and measured after 3 weeks of treatment was differentially affected in the two genotypes analyzed (**Figure 3**), as the analysis of variance confirmed (**Table 2**). In the sylvestris genotype, the organ–concentration interaction (16.97%) was the main cause of variation, unlike in altilis, for which this interaction was not significant. By contrast, in the altilis genotype, the metal–concentration interaction was the main and only significant cause of variation (P < 0.001), whereas for sylvestris this interaction was not significant.

Under As treatment, the altilis genotype showed growth similar to the control (0 µM), whereas under Cd treatment a significant reduction in root and shoot length was observed, indicating a lower tolerance to the metal than that observed in the previous germination phase (**Figure 3**). However, shoots remained viable, with no evidence of chlorosis. In sylvestris subjected to both metal stresses, root and shoot lengths decreased as the heavy metal concentration increased, but at 25 µM this was more evident with Cd than with As. Root length was more than twofold lower than in untreated controls.

concentrations. The fold change is the Ct value with respect to the housekeeping genes (EF1α and GAPDH) that is considered 1. The error bars represent the error mean of three biological replicates. Letters indicate only significantly different values (P ≤ 0.05) between concentrations in each moment of sampling.

The root/seedling length ratio was also influenced by the metal and the concentration used. In particular, for sylvestris in the presence of Cd, the ratios were 80, 42, and 47% at 0, 25, and 50 µM, respectively, whereas they were 43, 39, and 33% in altilis at the same metal concentrations. As for As treatments, the ratios were 80 and 35% at 0 and 25 µM in sylvestris; at 50 µM, seeds germinated, but their growth was eventually arrested. For altilis at 0, 25, and 50 µM of As, the root/seedling length ratios were 43, 49, and 52%, respectively, showing minor changes compared to the control. Across the treatments, the shoot/root length ratio varied very strongly in sylvestris compared to both altilis and the control.
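The relation between the two ratios used in this paragraph can be sketched as follows. The sketch assumes that seedling length is the sum of root and shoot lengths, which the text does not define explicitly; under that assumption, a root/seedling ratio converts directly into a shoot/root ratio.

```python
# Minimal sketch, assuming seedling length = root length + shoot length.

def root_to_seedling_percent(root_cm, shoot_cm):
    """Root length as a percentage of total seedling length."""
    return root_cm / (root_cm + shoot_cm) * 100.0

def shoot_to_root_ratio(root_to_seedling_pct):
    """Shoot/root length ratio implied by a root/seedling percentage."""
    fraction = root_to_seedling_pct / 100.0
    return (1.0 - fraction) / fraction

# Hypothetical seedling: 4 cm root and 6 cm shoot.
example_pct = root_to_seedling_percent(4.0, 6.0)

# Ratios reported in the text for sylvestris under As (0 and 25 uM).
control_shoot_root = shoot_to_root_ratio(80.0)
treated_shoot_root = shoot_to_root_ratio(35.0)
print(example_pct, round(control_shoot_root, 2), round(treated_shoot_root, 2))
```

Under this assumption, the drop from 80 to 35% in sylvestris corresponds to a shoot/root ratio rising from about 0.25 to about 1.86, whereas the altilis values (43 to 52%) imply a much narrower range.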

### Identification of Cardoon Genes Likely Associated With Plant Stress Response

The BLASTN and BLASTP analyses (**Supplementary Tables S1, S2**) of obtained cDNA contigs confirmed the identity and reliability of our sequences, not previously identified in cardoon. In fact, the nucleotide sequences present in cardoon database are just predicted from amino acid sequences.

In both genotypes, altilis and sylvestris, transcriptional levels can be influenced by the concentration of metals in the medium, the type of metal, and the time of growth. The expression level of the NRAMP3 gene was strongly influenced by the genotype and the time of growth (P < 0.05) (**Figure 4**). One-way ANOVA showed a strong relation between metal concentration and gene expression, more so in sylvestris than in altilis (P < 0.01). In both shoots and roots, NRAMP3 was more highly expressed in sylvestris and down-regulated in altilis. In particular, after 3 weeks, at least a twofold and a fourfold increase in expression levels were observed in sylvestris with 25 and 50 µM, respectively, regardless of the type of metal used.

Similarly to NRAMP3, the transcriptional levels of ZIP11 were influenced by the genotype and time of growth (P < 0.05). In both genotypes, the type of metal did not influence the transcriptional levels. In roots of sylvestris after 3 weeks, the level of ZIP11 mRNAs significantly increased in the presence of Cd and As (P < 0.05), whereas in altilis, the transcriptional levels decreased with increasing levels of both metals (**Figure 5**).

mean of three biological replicates. Letters indicate only significantly different values (P ≤ 0.05) between concentrations in each moment of sampling.

The expression of HMA was influenced by concentration and genotype. In particular, highly significant values, contributed by two of the three sources of variation (genotype, time, and concentration of metal), were observed in roots treated with Cd (P < 0.05). Shoots and roots showed a similar response to stresses, averaging the contribution of all three sources of variation (P < 0.05), but the transcriptional level at 3 weeks in shoots of altilis with As was threefold higher than in sylvestris (**Figure 6**).

The transcriptional level of PHT was strongly influenced by the metal and the genotype. In particular, highly significant values, contributed by the three sources of variation (genotype, time, and concentration of metal), were observed in shoots and roots treated with As (P < 0.01). In roots, but not in shoots, the expression of the PHT transcript, averaging the contribution of all three sources of variation, increased compared to the control, and this was more evident in sylvestris than in altilis (P < 0.01). The type of metal affected the expression of PHT: in the presence of As, the PHT transcriptional level showed a twofold increase with respect to Cd. In particular, after 2 weeks in sylvestris roots, the expression increased linearly with metal concentration, from zerofold to threefold and fivefold at 0, 25, and 50 µM of Cd, respectively (**Figure 7**). Under As treatment, the genotype (sylvestris) was the factor that contributed most markedly to the transcriptional increase in both roots and shoots after 3 weeks of treatment (P < 0.001).

Moreover, the ABCC transcriptional levels were mainly influenced by the treatments used (type of metal and its concentration). In particular, highly significant values, contributed by the three sources of variation (genotype, time, and concentration of metal), were observed in roots treated with As (P < 0.001).

mean of three biological replicates. Letters indicate only significantly different values (P ≤ 0.05) between concentrations in each moment of sampling.

On average, ABCC gene expression was not affected by the organ type (shoots or roots), but it was mainly influenced by the type of metal. In fact, with As, and not with Cd, the ABCC transcriptional levels increased by twofold in sylvestris compared to altilis (**Figure 8**). In particular, in wild cardoon, in shoots at 3 weeks, and in roots with As at 2 and 3 weeks, the ABCC mRNA was up-regulated, showing a clear involvement of this gene in As response (P < 0.01).

### Principal Component Analysis

The PCA obtained using gene expression data and morphological characteristics in response to the presence of metals shows that, for both varieties, the values range from the first to the fourth quadrant (**Figure 9**). However, altilis forms more compact clusters, whereas those of sylvestris are more dispersed. Moreover, cluster localization is influenced more by metal type for altilis, whereas it depends more on metal concentration for sylvestris.

### DISCUSSION

In this study, we observed that salt stress did not reveal any significant effects on germination of domestic cardoon, while it linearly limited the development of fresh sprout biomass with the increase in NaCl concentrations. However, dry weight of sprouts did not show any significant variation between the salt and the control conditions, suggesting that the NaCl presence influences the water uptake capability of the plant decreasing water content of the seedlings. This salinity tolerance trait, in the early developmental stages, can be considered a plant adaptive strategy that allows us to consider C. cardunculus as a facultative halophyte species as reported by Benlloch-González et al. (2005).

As expected, salt stress induced the synthesis of phenolic substances in proportion to the increase in NaCl concentrations, confirming the important role of these molecules for the tolerance to stress conditions in plants. As well as for the total phenols, the trend of AA was directly proportional to salt concentration, demonstrating that the increase in the polyphenol content corresponds to a greater AA. These results reveal the possible use of salinity as an efficient technique for increasing the secondary metabolite content in plants grown for nutraceutical use, as documented for other elicitors or species (Bistgani et al., 2019; Hassini et al., 2019). In particular, in C. cardunculus, which is well-known for its high amount of polyphenolic compounds (Pandino et al., 2011), technical interventions based on the variation of NaCl concentrations in the germination solution could effectively modulate the bioactive molecules content in the sprouts.

In this study, for the first time, the influence of As and Cd on the growth of different genotypes of C. cardunculus sprouts was investigated. Their presence reduced the sprout growth in a significantly different way depending on the C. cardunculus genotypes. In particular, the sylvestris root and shoot lengths were mainly influenced by the concentration of metals; instead, the altilis growth was more affected by both the type of metal and its concentration. This finding is in accordance with growth rate documented for wheat and halophytes species in presence of heavy metals (Öncel et al., 2000; Leonardi et al., 2016; Amari et al., 2017).

Moreover, in the present study, we identified five C. cardunculus genes, NRAMP3, ZIP11, HMA, PHT, and ABCC, which seem to be involved in the heavy metal stress response. Our results showed that in sylvestris, NRAMP3 expression is up-regulated in roots in the presence of As or Cd, with expression levels increasing with longer treatments. This result agrees with Fallen et al. (2005), in which NRAMP3 and NRAMP4 were associated with Cd<sup>2+</sup> efflux from the vacuole. Their overexpression increased Cd sensitivity and determined the release of vacuolar Fe<sup>2+</sup> in Arabidopsis. By contrast, no data are available in the literature on NRAMP3 expression with respect to As.

ZIP proteins are generally responsible for metal-ion homeostasis through the uptake of cations into the cytosol (Colangelo and Guerinot, 2006; Hou et al., 2017). ZIP transporters are usually involved in the uptake and accumulation of Fe and Zn, but may also be responsible for Cd or other heavy metal transport (Guerinot, 2000). In Solanum torvum roots, IRT2 and ZIP11 are associated with Zn transport (Xu et al., 2012). In the present study, expression of the ZIP11 transporter in wild cardoon was higher in shoots and roots subjected to Cd treatment. Similarly, ZIP11 mRNA levels increased after 3 weeks of exposure of the seedlings to As, and this is the first study to document the expression variation of this gene in association with As presence.

The uptake of As(V) in plants occurs via the inorganic phosphate (Pi) system, because Pi transporters cannot distinguish between the similar electrochemical profiles of Pi and As(V) (Sánchez-Pardo et al., 2015). In our experiments, in the sylvestris genotype, the phosphate transporter was upregulated in roots under As treatment, and its expression was strongly influenced by the metal concentration. In altilis, the expression levels of PHT were elevated in the control as well; probably for this reason, we did not observe any significant variation in the gene expression levels across the treatments.

These results can be related to the observations of Di Tusa et al. (2016), who showed in Pteris vittata an increase in As accumulation when the plants expressed PvPht1;3. In Arabidopsis natural variants, the expression of PHT1;1 in the presence of As(V) decreased significantly compared to the limiting Pi condition, whereas the expression of PHT1;4 was higher in the presence of As(V) than under the limiting Pi condition (Shukla et al., 2015).

Despite the documented role of the HMAs in heavy metal transport in other species (Verbruggen et al., 2009; Fan et al., 2018), the identified cardoon HMA did not show a significant difference in expression levels in the presence of Cd or As, unlike the other genes measured. Our results indicate that its expression level was mainly influenced by concentration and genotype, with altilis proving more sensitive than sylvestris.

In cardoon, ABCC transcriptional levels, measured under As treatment in roots of sylvestris, remained up-regulated compared to the untreated sample. The increase in expression was influenced by the time of exposure, with the highest level at 50 µM after 3 weeks. A similar response was observed in shoots after 3 weeks of As treatment. These results are in accordance with Song et al. (2010), who showed that the Arabidopsis isoforms AtABCC1 and AtABCC2 mediate AsIII–PC complex transport to the vacuole, and that overexpression of AtABCC1 increases As tolerance only when coexpressed with phytochelatin synthase (PCS). In rice, a similar ABC transporter, OsABCC1, is critical for vacuolar AsIII–PC sequestration and As detoxification, thus reducing As accumulation in rice grains. For this reason, knockout of OsABCC1 leads to an increase in As sensitivity (Song et al., 2014).

The results obtained in the present study lead us to conclude that cardoon seed germination and seedling establishment can take place under salt and heavy metal stress conditions. In addition, we documented the possibility of using abiotic stresses to improve the bioactive molecule content in cardoon sprouts. Moreover, we identified some important genes encoding metal transporters that may be involved in the uptake and translocation of As and Cd in C. cardunculus.

These findings, in a context of climate change and environmental pollution, can be useful for exploiting marginal lands for the cultivation of species, such as cardoon, that are able to develop in stressed environments and are suitable for green chemistry and energy purposes. This opportunity could represent an economically valid alternative for farmers and for the agriculture of the future.

### DATA AVAILABILITY STATEMENT

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation, to any qualified researcher.

### AUTHOR CONTRIBUTIONS

All authors had full access to all of the data in the study, take responsibility for the integrity of the data and the accuracy of the data analysis, and approved the final version of the manuscript. SR and HP conceived and designed the study. HP, CG, VT, and GP contributed to production and assembly of data. HP, CG, VT, GP, and SR analyzed and interpreted the data. HP, CG, VT, and GP drafted the manuscript. SR critically revised the manuscript for important intellectual content. HP and GP contributed to the statistical analysis. SR supervised the study.

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2020.00240/full#supplementary-material

### REFERENCES

accumulation in germinating seeds of Triticum aestivum L. cv. Akbar. Plant Cell Physiol. 33, 1009–1014. doi: 10.1093/oxfordjournals.pcp.a078324

growth in different cardoon genotypes. Acta Hortic. 1147, 281–288. doi: 10.17660/actahortic.2016.1147.39


**Conflict of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2020 Pappalardo, Toscano, Puglia, Genovese and Raccuia. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Enhancing Biomass and Lutein Production From Scenedesmus almeriensis: Effect of Carbon Dioxide Concentration and Culture Medium Reuse

Antonio Molino<sup>1</sup>, Sanjeet Mehariya<sup>1,2</sup>, Angela Iovine<sup>1,2</sup>, Patrizia Casella<sup>1</sup>, Tiziana Marino<sup>2</sup>, Despina Karatza<sup>2</sup>, Simeone Chianese<sup>2</sup>\* and Dino Musmarra<sup>2</sup>

<sup>1</sup> Department of Sustainability-CR Portici, ENEA Italian National Agency for New Technologies, Energy and Sustainable Economic Development, Portici, Italy, <sup>2</sup> Department of Engineering, University of Campania "Luigi Vanvitelli", Aversa, Italy

### Edited by:

Briardo Llorente, Macquarie University, Australia

### Reviewed by:

Norbert Mehlmer, Technical University of Munich, Germany

Jianhua Fan, East China University of Science and Technology, China

### \*Correspondence:

Simeone Chianese simeone.chianese@unicampania.it

### Specialty section:

This article was submitted to Crop and Product Physiology, a section of the journal Frontiers in Plant Science

Received: 16 June 2019 Accepted: 23 March 2020 Published: 21 April 2020

### Citation:

Molino A, Mehariya S, Iovine A, Casella P, Marino T, Karatza D, Chianese S and Musmarra D (2020) Enhancing Biomass and Lutein Production From Scenedesmus almeriensis: Effect of Carbon Dioxide Concentration and Culture Medium Reuse. Front. Plant Sci. 11:415. doi: 10.3389/fpls.2020.00415

The main purpose of this study is to investigate the effects of operative parameters and bioprocess strategies on the photo-autotrophic cultivation of the microalgae Scenedesmus almeriensis for lutein production. S. almeriensis was cultivated in a vertical bubble column photobioreactor (VBC-PBR) in batch mode and the bioactive compounds were extracted by accelerated solvent extraction with ethanol at 67 °C and 10 MPa. The cultivation with a volume fraction of CO<sub>2</sub> in the range 0–3.0%v/v showed that the highest biomass and lutein concentrations (3.7 g/L and 5.71 mg/g, respectively) were measured at the highest CO<sub>2</sub> concentration and using fresh growth medium. Recycling the cultivation medium from harvested microalgae resulted in decreased biomass and lutein content. The nutrient chemical composition analysis showed the highest consumption rates for nitrogen and phosphorus, with values higher than 80%, while sulfate and chloride were less consumed.

Keywords: microalgae, photo-autotrophic cultivation, Scenedesmus almeriensis, lutein, ASE

# INTRODUCTION

Lutein is a carotenoid found in plants and phototrophic microorganisms. It is classified as a primary xanthophyll because of the presence of two hydroxyl functional groups in its chemical structure (Chen et al., 2017; Xie et al., 2017; Bowen et al., 2018). Lutein acts as a light-harvesting pigment that improves photosynthesis efficiency and prevents photo-damage in plant cells (Ruban et al., 2007). For its outstanding antioxidant, anti-inflammatory and colorant properties, lutein is widely used as a nutraceutical for human health (Fernández-Sevilla et al., 2010). Lutein is used in the treatment of age-related macular degeneration, and for the prevention of cardiovascular diseases and some types of cancers (Coleman and Chew, 2007; Cerón et al., 2008). The existing commercial source of natural lutein is marigold (Tagetes erecta L.) petals (Lin et al., 2015b). However, the lutein content of marigold petals is around 0.03% (based on dry biomass weight), and their harvesting and separation carry a high cost (Lin et al., 2015a). Therefore, there is a need for an alternative source for cost-effective production.

Microalgae have been considered a potential source of natural lutein, as their pigment content (0.5–1.2% based on dry biomass weight) is far higher than that of conventional sources (Cerón et al., 2008; Sánchez et al., 2008b; Lin et al., 2015a,b; Chen et al., 2017). Additionally, producing lutein from microalgae offers several environmental benefits, such as carbon dioxide mitigation, as their photosynthetic efficiency is 10–15 times higher than that of terrestrial plants (Lam et al., 2012; Yen et al., 2013; Xie et al., 2014). The microalgae growth rate is 5–10 times higher than that of higher plants. They can be grown in seawater and brackish water, as well as on non-arable land; therefore, they do not compete with conventional agricultural crops for resources (Del Campo et al., 2001; Utomo et al., 2013; Ho et al., 2014; Lin et al., 2015a). Scenedesmus almeriensis is recognized as a rich source of lutein, containing up to 4.5 mg/g (dry weight) when grown in outdoor culture conditions (Del Campo et al., 2007); moreover, its lutein content can be increased up to 5.4 mg/g (dry weight) by manipulating growth conditions such as light intensity and temperature (Sánchez et al., 2008b).

Christian et al. (2018) report that the astaxanthin content of Haematococcus pluvialis grown with a high concentration of CO<sub>2</sub> (15%v/v) as the carbon source can reach about 36 mg/g (dry weight). Cheng et al. (2016) observed the highest biomass (0.65 g/L) and astaxanthin (45 mg/L) concentrations in Haematococcus pluvialis grown in 6% CO<sub>2</sub>.

Microalgae cultivation requires large amounts of water and nutrients, which reduces the cost-effectiveness of the entire biocompound extraction process (Hadj-Romdhane et al., 2013). The re-use of culture media could be a solution for the development of large-scale cultures to minimize water use and nutrient consumption (Fret et al., 2017). For the cultivation of Scenedesmus obliquus, Livansky et al. (1996) found that water use and nutrient consumption could be reduced by about 64 and 16%, respectively, by recycling the growth medium. For Chlorella vulgaris, Hadj-Romdhane et al. (2012) reported that the use of optimized culture conditions during medium recycling could decrease water use and nutrient consumption by 75 and 62%, respectively.

As CO<sub>2</sub> is used as the carbon source, microalgae cultivation also contributes to CO<sub>2</sub> sequestration, which makes it one of the most promising approaches to deal with global warming (Salih, 2011; Chen and Liu, 2018). Not only does the higher CO<sub>2</sub> concentration increase the carbon source available for the growth of microalgae, but it also improves the assimilation of nutrients in their biomass (de Assis et al., 2019). The microalgae can efficiently convert atmospheric CO<sub>2</sub> into organic biomass via carbon fixation. Several inexpensive sources of CO<sub>2</sub> can be explored, such as CO<sub>2</sub>-rich industrial exhaust gases and fermentation effluent gases that could make the microalgae cultivation process cost-effective and eco-friendly (Lam et al., 2012; Xie et al., 2014; Chen and Liu, 2018). Patel et al. (2019) cultivated Chlorella protothecoides in 5% of CO<sub>2</sub> and obtained 4.12 g/L of biomass. Xie et al. (2017) cultivated Desmodesmus sp. F51 and evaluated the effect of CO<sub>2</sub> concentration on microalgae biomass and lutein production. Six different CO<sub>2</sub> concentrations (0.03, 2.5, 5.0, 7.5, 10.0, and 12.5%) were used. The biomass productivity and the specific growth rate were higher when CO<sub>2</sub> concentration increased from 0.03 to 2.5% and decreased when CO<sub>2</sub> concentration was further increased to 12.5% (Xie et al., 2017). On the basis of the above-mentioned studies, 3.0% was chosen as the maximum value to evaluate the effect of CO<sub>2</sub> concentration in this study.

Moheimani (2016) showed that Tetraselmis suecica can be grown in CO<sub>2</sub> from a coal-fired power plant flue gas and by reusing the growth medium. CO<sub>2</sub> biofixation with nutrient recycling and the addition of monoethanolamine were tested on Spirulina sp. cultivation by da Rosa et al. (2015). They highlighted that Spirulina can be produced using recycled medium, in spite of a reduced protein and lipid content. Cui et al. (2019) showed that, by using flue gas from a biomass plant and recycling the growth medium to cultivate Spirulina sp., a 42% lower nutrient consumption was achieved, with no significant differences between fresh medium and recycled medium in terms of protein and phycocyanin contents.

To the best of the authors' knowledge, the combined effect of CO<sub>2</sub> concentration and medium recycling on S. almeriensis growth and lutein production has not been investigated before. The present study has a dual purpose: developing a lab-scale methodology to recycle the supernatant/filtrate growth medium obtained from the harvesting of the microalgae biomass, and assessing how different CO<sub>2</sub> concentrations (0–3%v/v) and fresh and recycled growth media influence biomass and lutein production. The biomass was harvested by filtration, then lutein was obtained by accelerated solvent extraction (ASE) at 67 °C and 10 MPa. Ethanol, a green solvent belonging to the class of the Generally Recognized as Safe (GRAS) solvents, was used during the extraction step. The lutein content was measured by u-HPLC.

# MATERIALS AND METHODS

## Microalgae and Growth Medium

Seed culture of S. almeriensis was provided by AlgaRes Srl (Rome, Italy), and used for the cultivation under laboratory conditions. Microalgae cells were cultivated in a modified Mann & Myers medium (Mann and Myers, 1968; Barceló-Villalobos et al., 2019), consisting of NaNO<sub>3</sub> (1.0 g/L), K<sub>2</sub>HPO<sub>4</sub> (0.1 g/L), MgSO<sub>4</sub>·7H<sub>2</sub>O (1.2 g/L), and CaCl<sub>2</sub> (0.3 g/L).

Moreover, 10 mL of a solution of micronutrients, containing Na<sub>2</sub>EDTA (0.001 mg/L), MnCl<sub>2</sub> (1.4 mg/L), ZnSO<sub>4</sub>·7H<sub>2</sub>O (0.33 mg/L), FeSO<sub>4</sub>·2H<sub>2</sub>O (2 mg/L), CuSO<sub>4</sub>·5H<sub>2</sub>O (0.002 mg/L), and Co(NO<sub>3</sub>)<sub>2</sub>·6H<sub>2</sub>O (0.007 mg/L), was added to 990 mL of the growth medium.

## Photo-Bioreactor

Scenedesmus almeriensis was cultivated in a vertical bubble column photo-bioreactor (VBC-PBR), made of plexiglass, with a working volume of 1.25 L (effective height = 680 mm; external diameter = 60 mm; thickness = 10 mm) and a volume to surface ratio (V/S) of 11.5 L/m<sup>2</sup>. The VBC-PBR was fed with a gaseous mixture (N<sub>2</sub>/O<sub>2</sub>/CO<sub>2</sub>) from tanks and was equipped with a monitoring and control system that allowed fine tuning of the gaseous mixture flow rate, the temperature, the pH and the light intensity. The bottom of the reactor was equipped with three sintered steel spargers, installed through three threaded holes (1/8″), to feed the gaseous mixture into the reactor. The gaseous mixture flow rate was regulated by Bronkhorst™ gas flow controllers, with a flow control accuracy of 0.5%. The top of the reactor was equipped with a temperature sensor (thermocouple) and with a pH sensor. The temperature control system consisted of an AISI 316L coaxial pipe (diameter = 60.3 mm; thickness = 1 mm) in which water was used as the cooling fluid. Thanks to a heat pump, the temperature control system allowed the temperature inside the reactor to be regulated with a precision of ±1 °C in the range 15–35 °C.

TABLE 1 | Scheme of the inoculum and the medium reuse during the cultivation of Scenedesmus almeriensis cultures under batch mode at different cultivation conditions.

\*Inoculum derived from sample A; \*\*inoculum derived from sample B; \*\*\*inoculum derived from sample G; \*\*\*\*inoculum derived from sample D; \*\*\*\*\*inoculum derived from sample C; \*\*\*\*\*\*inoculum derived from sample F.

The lighting system consisted of a semi-cylindrical structure, located at a distance of 100 mm from the VBC-PBR, with blue, white and red lights from a selective LED system (only blue, only white, only red, or a mix of them). The lighting system was controlled and regulated by a SCADA (Supervisory Control and Data Acquisition) system, equipped with a touchscreen, custom software and a PC to collect and record experimental data of temperature, gas flow rate, pH, and light intensity. A schematic of the experimental set-up is shown in **Figure 1**.

### Growth Conditions

The microalgae were grown under white light with an illuminance of 4000 lux on the surface of the VBC-PBR, and with a gaseous mixture flow rate (N<sub>2</sub>/O<sub>2</sub>/CO<sub>2</sub>) of 50 mL/min, in which the CO<sub>2</sub> content was varied in the range 0–3.0%v/v (O<sub>2</sub> = 21%v/v). The temperature was kept constant at 28 °C. The pH of the culture medium changed in the range 7.5–8.5 due to the addition of CO<sub>2</sub>.
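The gas feed described above can be turned into per-component flow setpoints with simple arithmetic. The sketch below is illustrative only: it assumes that N<sub>2</sub> is the balance gas making up the remainder of the 50 mL/min mixture, which the text implies but does not state, and the function name `gas_flows` is hypothetical.

```python
def gas_flows(total_ml_min, co2_pct, o2_pct=21.0):
    """Split a total gas flow (mL/min) into CO2, O2 and N2 flows.

    Assumption: N2 is the balance gas (not stated explicitly in the text).
    Percentages are volume fractions in %v/v.
    """
    co2 = total_ml_min * co2_pct / 100.0
    o2 = total_ml_min * o2_pct / 100.0
    n2 = total_ml_min - co2 - o2
    if n2 < 0:
        raise ValueError("CO2 and O2 fractions exceed 100% of the mixture")
    return {"CO2": co2, "O2": o2, "N2": n2}


if __name__ == "__main__":
    # CO2 contents named in the study, at the reported 50 mL/min total flow
    for pct in (0.0, 0.5, 1.5, 3.0):
        print(pct, gas_flows(50.0, pct))
```

Under this assumption, the 3.0%v/v condition corresponds to 1.5 mL/min of CO<sub>2</sub> in the 50 mL/min feed.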

Two cultivation conditions were investigated. The microalgae were first cultured in the fresh Mann and Myers medium with the composition described herein (see section "Microalgae and growth medium"). Then around 300 mL of the culture medium was stored to be used as an inoculum in further experiments and the remaining culture was filtered to measure the biomass and lutein content. In the second step, the growth medium recovered by filtration was mixed with the fresh medium and with a given amount of inoculum to achieve an optical density of 0.6–0.7 at 420 nm (see section "Microalgae and growth medium"). In both cases, CO<sub>2</sub> varied in the range 0–3.0%v/v. Each experimental step was carried out with a working volume of 1.2 L and the compositions of the reused medium are reported in **Table 1**. Each experimental condition for microalgae growth (expressed as chlorophyll content and biomass concentration), including nutrient chemical analysis, extraction yield and lutein content measurements, was investigated in replicates, and for each condition the standard deviation (SD) value was calculated.

### Microalgae Growth Assessment

The S. almeriensis cell growth was monitored by determining the absorbance of the samples at 420 nm (Chlorophyll-a), 480 nm (Chlorophyll-b), 690 nm (Chlorophyll-a), and 620 nm (Chlorophyll-b) with a UV/Visible spectrophotometer (Multiskan, Thermo Fisher Scientific, United States), as reported in the literature (Lu et al., 2017; Almomani and Örmeci, 2018; Chirivella-Martorell et al., 2018). The biomass dry weight (BDW) was calculated using the absorbance values at different biomass concentrations measured during the growth phase in the recycled medium. The following calibration curve between absorbance and concentration was obtained:

$$\mathrm{DBC} = 0.0867\,A - 0.1868$$

where DBC is the concentration of biomass on dry weight (g/L) and A is the total absorbance obtained from the sum of the absorbance values at the four chlorophyll wavelengths (**Supplementary Figure 1**).
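As a minimal sketch of how this calibration curve might be applied, the function below sums the four absorbance readings and converts the total to a dry biomass concentration. The function name and the example readings are hypothetical; only the slope and intercept come from the text.

```python
def dry_biomass_concentration(a420, a480, a620, a690):
    """Dry biomass concentration (g/L) from the calibration curve in the text.

    A is the sum of the absorbances at the four chlorophyll wavelengths
    (420, 480, 620 and 690 nm); DBC = 0.0867 * A - 0.1868.
    """
    total_absorbance = a420 + a480 + a620 + a690
    return 0.0867 * total_absorbance - 0.1868


if __name__ == "__main__":
    # Hypothetical absorbance readings, for illustration only
    dbc = dry_biomass_concentration(a420=5.0, a480=4.0, a620=3.0, a690=3.5)
    print(f"DBC = {dbc:.2f} g/L")
```

Because of the negative intercept, the linear fit returns negative values for very low total absorbance (below about 2.15), so it should only be read within the range over which the curve was calibrated.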

For the final dry weight determination, cell cultures were dewatered by vacuum filtration using a vacuum filter with a pore size of 0.45 µm (Sigma-Aldrich, United States) and the pellets were lyophilized for 24 h. Three biological replicates were carried out.

### Accelerated Solvent Extraction

Lutein was extracted from mechanically pre-treated S. almeriensis cells by the accelerated solvent extraction method, using the Dionex-ASE 200 extractor (Salt Lake City, UT, United States). The pre-treatment was performed via ball milling according to the procedure described elsewhere (Mehariya et al., 2019). Four consecutive extraction cycles were performed using ethanol, a green solvent belonging to the class of the Generally Recognized as Safe (GRAS) solvents, at 67 °C and 10 MPa until the biomass was completely discolored. The extraction conditions were optimized elsewhere (Molino et al., 2018c). At the end of each extraction run (20 min), the extracts were collected in 40 mL amber glass vials by flushing the system with 6.6 mL of fresh solvent, and the system was purged for 1 min with nitrogen (purity ≥ 99.999%). Five technical replicates were carried out.

### Growth Medium Characterization

The chemical analysis of the nutrient concentrations (initial and final) was carried out using an ion chromatograph (Dionex ICS-1100, Thermo Fisher Scientific, Massachusetts, United States). The Dionex ICS-1100 is an integrated ion chromatography system equipped with a pump, an injection valve, and a conductivity detector. Several nutrients, such as Mg<sup>2+</sup>, SO<sub>4</sub><sup>2−</sup>, Na<sup>+</sup>, NO<sub>3</sub><sup>−</sup>, NO<sub>2</sub><sup>−</sup>, Ca<sup>2+</sup>, Cl<sup>−</sup>, K<sup>+</sup>, and PO<sub>4</sub><sup>3−</sup>, were analyzed.

The extract obtained after each extraction cycle was divided in equal parts and placed in two different vials, adding BHT at 0.1 wt% as an antioxidant for saponification and gravimetric analysis.

TABLE 2 | Scenedesmus almeriensis productivity (based on dry biomass weight) at different CO<sub>2</sub> contents with fresh and with recycled growth medium: ANOVA (one-way; α = 0.05) results.

## Lutein Measurement

The total lutein content was gravimetrically quantified after the complete removal of the solvent using a Zymark TurboVap evaporator (Zymark, Hopkinton, MA, United States). Before measuring lutein, the samples were saponified to remove lipids and chlorophyll, in order to avoid spectral overlap with the species of the carotenoid family (Vechpanich and Shotipruk, 2011; Molino et al., 2018a,b; Sanzo et al., 2018). In particular, the saponification was carried out by adding 1 mL of a NaOH solution in methanol (0.05 M) to 5 mL of the extract. This solution was left in the dark in an inert atmosphere for 7 h. Once this step was completed, the sample was neutralized using 3 mL of a NH<sub>4</sub>Cl solution in methanol (0.05 M). After saponification, lutein was measured using a u-HPLC Agilent 1290 Infinity II with a Zorbax reverse phase C18 column and a methanol-water (95:5, v/v) mixture as the mobile phase. Flow rate and column temperature were kept constant at 0.4 mL/min and 28 °C, respectively. Five technical replicates were carried out.

## Statistical Analysis

One-way analysis of variance (ANOVA; α = 0.05) was carried out to compare the effect of the growth medium (fresh and recycled), at different CO<sub>2</sub> contents, on the biomass productivity, on the nutrient consumption, and on the extraction yield and lutein content.
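For readers unfamiliar with the test, the following is a minimal, dependency-free sketch of the one-way ANOVA F statistic for a fresh versus recycled comparison at a single CO<sub>2</sub> content. The replicate values are hypothetical and are not taken from the study, and the software actually used by the authors is not stated in the text.

```python
def one_way_anova_f(groups):
    """Return (F statistic, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of lists, one list of replicate values per treatment.
    A zero within-group variance would raise ZeroDivisionError.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the replicates around their own group mean
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


if __name__ == "__main__":
    # Hypothetical productivity replicates (mg/L/day), for illustration only
    fresh = [355.0, 362.0, 358.0]
    recycled = [338.0, 344.0, 341.0]
    f_stat, df_b, df_w = one_way_anova_f([fresh, recycled])
    print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

The difference is judged significant at α = 0.05 when the F statistic exceeds the critical value of the F distribution for the same degrees of freedom, or equivalently when the associated p-value is below 0.05.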

# RESULTS AND DISCUSSION

## CO<sub>2</sub> Content and Recycled Medium Effects on Chlorophyll Content

During their autotrophic growth, the microalgae perform photosynthesis using CO<sub>2</sub> as the inorganic source of carbon. The effects of CO<sub>2</sub> concentration and medium reuse on the chlorophyll content during the autotrophic cultivation of S. almeriensis were evaluated (**Figure 2**). The results show that the accumulation of chlorophyll increases when the CO<sub>2</sub> content rises and decreases when the medium is recycled. The results also demonstrate that the higher the CO<sub>2</sub> concentration, the higher the chlorophyll content and the shorter the time required to reach the peak. The maximum values for chlorophyll-a at 420 nm and chlorophyll-b at 480 nm, 13.6 and 11.6 respectively, were achieved in 10 days with a CO<sub>2</sub> content of 3.0%v/v and the fresh medium. In addition, the recycled growth medium exhibited a lower photosynthetic efficiency, in terms of chlorophyll content reduction (**Figure 2**), which could be due to a lower amount of nutrients available. The amount of chlorophyll needed for efficient light absorption in autotrophic cultivation, where light is the only source of energy, may explain these results. Moreover, the toxic compounds accumulating in the medium during the first culture step may inhibit photosynthesis, which may lead to a decrease in the chlorophyll content.

## CO<sub>2</sub> Content and Recycled Medium Effects on Biomass Concentration

**Figure 3** reports S. almeriensis concentration (based on dry biomass weight) as a function of growth time for different CO<sub>2</sub> contents with fresh and recycled growth medium. **Figure 4** reports S. almeriensis productivity (based on dry biomass weight) for different CO<sub>2</sub> contents with fresh and with recycled growth medium. Results in **Figure 3** show that with the fresh growth medium, the higher the CO<sub>2</sub> content, the higher the microalgae concentration and the shorter the cultivation time. The highest biomass concentration was equal to 3.7 g/L and was achieved by aeration with 3.0%v/v of CO<sub>2</sub> in 10 days. With the CO<sub>2</sub> contents of 1.5%v/v and of 0.5%v/v, the biomass concentration was about 2.3 g/L (cultivation time = 14 days) and about 1.9 g/L (cultivation time = 18 days), respectively. The biomass concentration decreased when the recycled growth medium was used. After 6 days of cultivation, the reused medium caused saturation in cell growth for all the investigated CO<sub>2</sub> concentrations. Results in **Figure 4** show that S. almeriensis productivity markedly increases as the CO<sub>2</sub> concentration is increased, and that the productivity is higher with fresh medium than with recycled medium. These results agree with those found for continuous cultures of C. vulgaris grown in a recycled medium (Hadj-Romdhane et al., 2012). The recovered medium might lead to an osmotic stress with adverse effects on biomass production and quality (Hadj-Romdhane et al., 2012). Rodolfi et al. (2003) reported that the cultivation of Nannochloropsis sp. in fed-batch mode with the recycled medium came to a halt when cell concentration reached about 3 g/L (growth time of 16 days). When the fresh medium was used, the biomass concentration increased above 5 g/L (growth time of 28 days). According to Rodolfi et al. (2003), Nannochloropsis sp. released soluble inhibitors and particulate organic matter in the supernatant that seemed to inhibit the cell growth, which could explain the findings of this study. A statistical analysis (one-way ANOVA with α = 0.05) of the outcomes shown in **Figure 4** was performed and the results are reported in **Table 2**. Significant differences in biomass productivity were observed between fresh and recycled growth media at 0.5% and 1.5% v/v CO<sub>2</sub> contents. No statistically significant differences were observed at 3% v/v CO<sub>2</sub> content.

Both in the presence of the fresh medium and of the recycled medium, the biomass productivity increases concomitantly with the CO<sub>2</sub> content, passing from about 100 to about 360 mg/L/day with the fresh medium, and from about 75 mg/L/day to about 340 mg/L/day with the recycled medium.
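The text does not state how productivity was computed. A common convention, assumed here purely for illustration, is the average volumetric productivity over the batch: the biomass gained divided by the cultivation time. The function name and the initial concentration of 0.1 g/L in the example are hypothetical; the final concentration and the cultivation time are the values reported above for 3.0%v/v CO<sub>2</sub> with fresh medium.

```python
def biomass_productivity(x_final_g_l, x_initial_g_l, days):
    """Average volumetric biomass productivity in mg/L/day.

    Assumed definition (not given in the text):
    (final - initial dry biomass concentration) / cultivation time.
    """
    if days <= 0:
        raise ValueError("cultivation time must be positive")
    return (x_final_g_l - x_initial_g_l) / days * 1000.0


if __name__ == "__main__":
    # 3.7 g/L after 10 days is reported in the text;
    # the initial concentration of 0.1 g/L is a hypothetical value.
    p = biomass_productivity(x_final_g_l=3.7, x_initial_g_l=0.1, days=10)
    print(f"{p:.0f} mg/L/day")
```

With that hypothetical starting concentration, the result is of the same order as the roughly 360 mg/L/day reported for the fresh medium at the highest CO<sub>2</sub> content.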

Dineshkumar et al. (2015a) reported that, by increasing CO<sub>2</sub> concentration from 0.8 to 2.5%, the photosynthesis efficiency improved, which may improve the biomass productivity of Chlorella minutissima MCC-27, while a further increase in CO<sub>2</sub> (>5%) negatively influenced the productivity of microalgae. An optimal CO<sub>2</sub> content of 3.5%v/v, able to enhance the growth rate and the product accumulation in C. minutissima MCC-27, was found by the particle swarm optimization technique. However, it should be noted that the optimal CO<sub>2</sub> content for microalgae growth usually differs from species to species and depends on the culture medium and the growth conditions. It was found that Chlorella sp. and Nannochloropsis oculata exhibited an optimal growth with a CO<sub>2</sub> concentration of 2%v/v and the growth of microalgae was completely inhibited when the CO<sub>2</sub> content was higher than 5%v/v (Chiu et al., 2008, 2009). Liu et al. (2019) found that the biomass concentration and the biomass productivity were higher in Arthrospira platensis cultures aerated with a CO<sub>2</sub> content of 0.5%v/v, while the growth decreased at CO<sub>2</sub> contents lower or higher than 0.5%v/v. When CO<sub>2</sub> contents were higher than 10%v/v, the algal cells exhibited only a slight increase during the initial hours to days and had bleached by the end of the experiments. In summary, the optimum CO<sub>2</sub> level should be identified for each microalgae species to attain the highest biomass productivity (Santos et al., 2018).

## CO<sub>2</sub> Content and Recycled Medium Effects on Nutrient Consumption

Microalgae growth requires an adequate amount of several nutrients. Among them, nitrogen (N) and phosphorus (P) are essential for the development of cells and their metabolic activity. They are usually used as buffer agents (Choi et al., 2017). In this study, initial and final concentrations of nutrients were measured (**Table 3**). The chemical analysis shows that the highest nutrient concentrations were found in the fresh medium. The initial concentration of nutrients was slightly different due to the varied nutrient concentration in the inoculum and filtrate.

TABLE 3 | Initial and final nutrient concentration during the cultivation of Scenedesmus almeriensis cultures under batch mode at different cultivation conditions.

Standard deviation was less than 5% in all operative conditions. IC, initial concentration; FC, final concentration.

(B). Standard deviation was calculated on five technical replications and it was less than 5% in all operative conditions.

**Figure 5** shows nutrient consumption during the cultivation of S. almeriensis with fresh and reused growth medium at different CO<sub>2</sub> contents. N and P were highly consumed in all experimental conditions, as they are the main contributors to sustain microalgae growth (Lam et al., 2012; Choi et al., 2017). At the end of the experiments PO<sub>4</sub><sup>3−</sup> was totally used up at every CO<sub>2</sub> concentration with both fresh and reused growth media, even though its initial concentration varied in the range 17.81–52.79 mg/L (**Table 3**). This may suggest that PO<sub>4</sub><sup>3−</sup> is necessary for S. almeriensis growth, which confirms the results by Hadj-Romdhane et al. (2012). A similar consideration can be put forward for NO<sub>3</sub><sup>−</sup>: 98–100% consumption rate with the fresh medium and a slightly lower rate (80–97%) with the recycled medium.
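The consumption percentages quoted here follow from the initial (IC) and final (FC) concentrations of the kind listed in **Table 3**. A minimal sketch of that calculation is given below, assuming the usual definition (IC − FC)/IC × 100, which the text does not spell out; the function name is hypothetical.

```python
def consumption_percent(initial_mg_l, final_mg_l):
    """Nutrient consumption (%) from initial (IC) and final (FC) concentrations.

    Assumed definition: (IC - FC) / IC * 100.
    """
    if initial_mg_l <= 0:
        raise ValueError("initial concentration must be positive")
    return (initial_mg_l - final_mg_l) / initial_mg_l * 100.0


if __name__ == "__main__":
    # Phosphate was reported as totally used up, whatever its initial
    # concentration within the 17.81-52.79 mg/L range.
    for ic in (17.81, 52.79):
        print(ic, consumption_percent(ic, 0.0))
```

Expressing consumption relative to IC makes conditions with different starting concentrations, such as fresh and recycled media, directly comparable.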

Results in **Table 3** indicate that the higher the CO<sub>2</sub> content, the higher the consumption of nutrients, except for Cl<sup>−</sup> and SO<sub>4</sub><sup>2−</sup>. The consumption of Cl<sup>−</sup> and of SO<sub>4</sub><sup>2−</sup> was lower than 37 and 18%, respectively, with both the fresh and the recycled growth medium for all the investigated CO<sub>2</sub> contents. This result suggests that, when S. almeriensis is grown in the presence of CO<sub>2</sub>, the biomass produced has a different composition and, in particular, a lower amount of Cl<sup>−</sup> and SO<sub>4</sub><sup>2−</sup>. Similar results were obtained in tests of Chlorella vulgaris cultivation by growth medium recycling (Hadj-Romdhane et al., 2013). The consumption of Ca<sup>2+</sup>, Cl<sup>−</sup>, K<sup>+</sup>, Mg<sup>2+</sup>, Na<sup>+</sup>, NO<sub>3</sub><sup>−</sup>, and SO<sub>4</sub><sup>2−</sup> was reduced by up to 40% using the recycled medium. Remarkably, during the growth with a CO<sub>2</sub> content of 0.5%v/v, the consumption of Cl<sup>−</sup> was around 37 and 12% with the fresh and the reused growth medium, respectively. During the growth with the recycled medium, an increase in osmolality can be found, leading to a decrease in the consumption of nutrients (Hadj-Romdhane et al., 2012). Therefore, the recycled growth medium could limit the growth of S. almeriensis by decreasing the microalgae nutrient utilization.

Results of statistical analysis are reported in **Table 4**. Significant differences in nutrient consumption were observed at all CO<sub>2</sub> contents and for all nutrients, except NO<sub>3</sub><sup>−</sup>, between fresh and recycled growth media. No statistically significant differences were observed in NO<sub>3</sub><sup>−</sup> consumption at 0.5 and 3% v/v CO<sub>2</sub> contents.

## CO<sub>2</sub> Content and Recycled Medium Effects on Extraction Yield and Lutein Production

The effects of CO<sub>2</sub> and of the recycled growth medium on the extraction yield and lutein content are shown in **Figure 6**. The higher the CO<sub>2</sub> concentration, the higher the extraction yield and the lutein content, both with the fresh and with the recycled growth medium. However, with the recycled growth medium, slightly lower extraction yield and lutein content values were found with respect to the growth with the fresh one. The highest extraction yield (307.44 mg/g) and the highest lutein content (5.71 mg/g) were achieved with a CO<sub>2</sub> content of 3.0%v/v with the fresh growth medium.



With the recycled growth medium, a 1.3-fold decrease in the extraction yield and a 2-fold decrease in the lutein content (maximum reductions) were measured, which might be attributed to the lower photosynthetic activity observed during the cultivation with the recovered growth medium (Hadj-Romdhane et al., 2012, 2013).

A slightly lower lutein content (5.56 mg/g) was obtained with 150 mg/L ammonium-N from Desmodesmus sp. F51 (Xie et al., 2017). Chen and Liu (2018) evaluated the effect of medium replacement during a two-stage cultivation process of Chlorella sorokiniana MB-1. The 60% replacement of the effluent from the first stage to the second stage with the fresh medium resulted in a lutein content below 3.5 mg/g, while the 80% replacement of the effluent showed a lutein concentration lower than 2.5 mg/g (Chen and Liu, 2018).

Results of statistical analysis are reported in **Table 5**. Significant differences in extraction yield and lutein extraction were observed between fresh and recycled growth media at all CO<sub>2</sub> contents.

The lutein content obtained in this study, equal to 5.71 mg/g, was lower than the value of 8.54 mg/g reported by Molino et al. (2019), who investigated the performance of the same microalgal species using a vertical bubble column photo-bioreactor with a volume to surface ratio (V/S) of 56.6 L/m<sup>2</sup>. The different lutein content can be explained by the different photo-bioreactor configuration in the two research works: V/S ratio = 11.5 L/m<sup>2</sup> (this work) and 56.6 L/m<sup>2</sup> (Molino et al., 2019). On the other hand, the value reported in this work is slightly higher than those reported in microalgae-based lutein studies using batch phototrophic conditions (Sánchez et al., 2008b; Ho et al., 2014; Xie et al., 2014, 2017), while it is comparable with the values reported by Sánchez et al. (2008a,b), who studied S. almeriensis growth in a continuous mode. Ho et al. (2014) optimized the light intensity, finding that 75 µmol/m<sup>2</sup>/s was the optimum choice to attain higher lutein contents with continuous aeration of 2.5%v/v CO<sub>2</sub>; the maximum lutein content of 5.52 mg/g from Scenedesmus obliquus FSP-3 was produced by supplying 8.0 mM calcium nitrate as the nitrogen source during cultivation (Ho et al., 2014). Dineshkumar et al. (2015a) cultivated Chlorella minutissima MCC-27 in a 2-L airlift photo-bioreactor and achieved a maximum lutein productivity of 3.45 mg/L/d in a batch system. Moreover, the lutein productivity of Chlorella minutissima MCC-27 could be further enhanced to 4.32 mg/L/d by optimization of operative parameters such as light intensity, CO<sub>2</sub> concentration and gaseous flow rate (Dineshkumar et al., 2015a). These observations highlight the importance of the photo-bioreactor configuration choice for the selected microalgae strain.

TABLE 6 | Lutein content: microalgae comparison (adapted from Molino et al., 2019).

A comparison between the lutein content extracted in this study and the literature results is reported in **Table 6**.

# CONCLUSION

The effects of the fresh and the recycled growth medium, at different CO<sub>2</sub> concentrations, on the productivity and the concentration of S. almeriensis and on the lutein production were investigated. The recycled medium was found to limit the growth of S. almeriensis, as a decreased amount of available nutrients was observed, which corresponded to a slight decrease in the biomass productivity and the lutein content. On the other hand, S. almeriensis proved to be potentially effective for CO<sub>2</sub> bio-fixation with promising performance rates; in particular, the higher the CO<sub>2</sub> content, the higher the biomass productivity, the microalgae concentration and the lutein content. S. almeriensis could be an ideal candidate for the commercial production of lutein from microalgae and, at the same time, for the bio-fixation of CO<sub>2</sub>.

### DATA AVAILABILITY STATEMENT

All datasets generated for this study are included in the article/**Supplementary Material**.

# AUTHOR CONTRIBUTIONS

DM and AM: conceptualization. SM, AI, and DM: data curation. AI: formal analysis. SM: investigation. AI, PC, DK, and SC: methodology. SM and TM: writing – original draft. SM: writing – review draft. AM: project administration and resources. DM and AM: supervision.

### FUNDING

This research was funded by a Bio Based Industries Joint Undertaking under the European Union's Horizon 2020 Research and Innovation Program under grant agreement no. 745695 (VALUEMAG).

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2020.00415/full#supplementary-material




**Conflict of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2020 Molino, Mehariya, Iovine, Casella, Marino, Karatza, Chianese and Musmarra. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Genetic Control of Reproductive Traits in Tomatoes Under High Temperature

Maria José Gonzalo<sup>1</sup>, Yi-Cheng Li<sup>2</sup>, Kai-Yi Chen<sup>2</sup>, David Gil<sup>3</sup>, Teresa Montoro<sup>3</sup>, Inmaculada Nájera<sup>4</sup>, Carlos Baixauli<sup>4</sup>, Antonio Granell<sup>1</sup> and Antonio José Monforte<sup>1</sup>\*

<sup>1</sup> Instituto de Biología Molecular y Celular de Plantas, Universitat Politècnica de València-Consejo Superior de Investigaciones Científicas, Valencia, Spain, <sup>2</sup> Department of Agronomy, National Taiwan University, Taipei, Taiwan, <sup>3</sup> Enza Zaden Centro de Investigación S.L., Almería, Spain, <sup>4</sup> Centro de Experiencias de Cajamar en Paiporta, Paiporta, Spain

### Edited by:

Domenico De Martinis, Italian National Agency for New Technologies, Energy and Sustainable Economic Development (ENEA), Italy

### Reviewed by:

Gustavo R. Rodríguez, Research Institute of Agrarian Sciences of Rosario (IICAR - CONICET), Argentina; Anna Maria Mastrangelo, Council for Agricultural and Economics Research, Italy; Amit Gur, Agricultural Research Organization (ARO), Israel

\*Correspondence:

Antonio José Monforte amonforte@ibmcp.upv.es

### Specialty section:

This article was submitted to Plant Biotechnology, a section of the journal Frontiers in Plant Science

Received: 18 October 2019; Accepted: 05 March 2020; Published: 24 April 2020

### Citation:

Gonzalo MJ, Li Y-C, Chen K-Y, Gil D, Montoro T, Nájera I, Baixauli C, Granell A and Monforte AJ (2020) Genetic Control of Reproductive Traits in Tomatoes Under High Temperature. Front. Plant Sci. 11:326. doi: 10.3389/fpls.2020.00326

Global climate change is increasing the range of temperatures that crop plants must face during their life cycle, with negative effects on yields. In this changing scenario, understanding the genetic control of plant responses to a range of increasing temperature conditions is a prerequisite to developing cultivars with increased resilience. The current work reports the identification of Quantitative Trait Loci (QTLs) involved in reproductive traits affected by temperature, such as the flower number (FLN) and fruit number (FRN) per truss, the percentage of fruit set (FRS), stigma exsertion (SE), pollen viability (PV) and the incidence of the physiological disorder tipburn (TB). These traits were investigated in 168 Recombinant Inbred Lines (RILs) and 52 Introgression Lines (ILs) derived from the cross between Solanum lycopersicum var. "MoneyMaker" and S. pimpinellifolium accession TO-937. Mapping populations were cultivated under increasing temperature regimens: T1 (25°C day/20°C night), T2 (30°C day/25°C night) and T3 (35°C day/30°C night). The increase in temperature drastically affected several reproductive traits; for example, FRS in MoneyMaker was reduced by between 75 and 87% at T2 and T3 when compared to T1, while several RILs showed a reduction of less than 50%. QTL analysis allowed the identification of genomic regions affecting these traits at different temperature regimens. A total of 22 QTLs involved in reproductive traits at different temperatures were identified by multi-environment QTL analysis, and eight involved in pollen viability traits. Most QTLs were temperature specific, except QTLs on chromosomes 1, 2, 4, 6, and 12. Moreover, a QTL located on chromosome 7 was identified for low incidence of TB in the RIL population, which was confirmed in ILs with introgressions on chromosome 7. Furthermore, ILs with introgressions on chromosomes 1 and 12 had good FRN and FRS at T3 in replicated trials. These results represent a catalog of QTLs and pre-breeding materials that could be used as the starting point for deciphering the genetic control of the response of reproductive traits to different temperatures, paving the way for developing new cultivars adapted to climate change.

Keywords: pollen viability, fruit set, QTL, introgression line, tipburn, abiotic stress

Under the current scenario of global warming, temperature projections estimate a 2–5°C increase in temperature by the end of the twenty-first century (IPCC, 2014). Agricultural production will be greatly affected by this temperature rise, as high temperatures have a negative impact on crops, causing an array of morpho-anatomical, physiological and biochemical changes that negatively affect plant growth and development and may lead to a drastic reduction in yield (Wahid et al., 2007; Bita and Gerads, 2013). The major losses due to heat stress are expected to occur at low latitude regions (temperate and tropical areas). In fact, yield reduction due to heat stress has been documented in many crops such as wheat, rice, barley, sorghum, maize, chickpea, canola, and more (Hasanuzzaman et al., 2013; Challinor et al., 2014; Alsamir et al., 2019). In the case of tomatoes, a 28% yield reduction due to high temperatures has already been reported (Alsamir et al., 2019).

Tomatoes are one of the most important horticultural crops worldwide and they are currently cultivated in a wide range of agroclimatic regions, either in open fields or under greenhouse conditions. Vegetative growth in tomatoes is well adapted to high temperatures. However, high temperature affects the growth and development of different tomato plant organs or structures. For instance, a decrease in the flower number with increased temperatures has been observed (Charles and Harris, 1972), although the effect on fruit set has more dramatic consequences for the yield. Optimal temperatures for setting fruit in field conditions are between 21 and 24°C (Geisenberg and Stewart, 1986); temperatures exceeding 32°C during the day and/or not decreasing below 21°C during the night (Moore and Thomas, 1952) have dramatic consequences for fruit set and total yield (El Ahmadi and Stevens, 1997). The decrease in tomato fruit set under long-term mildly elevated temperatures has been shown to correlate with a decrease in pollen viability (Dane et al., 1991; Peet et al., 1998; Pressman et al., 2002, 2006; Xu et al., 2017b). An inserted stigma is also an important trait to ensure self-pollination in cultivated tomatoes (Rick and Dempsey, 1969; Chen and Tanksley, 2004). Growth at high temperatures may lead to the protrusion of the style above the anther cone (exsertion; Levy et al., 1978; Sato et al., 2006), negatively affecting flower pollination (Charles and Harris, 1972; Dane et al., 1991; Xu et al., 2017b). High temperatures also increase the incidence of tipburn, a necrosis of the apical vegetative and reproductive tissues that has been related to insufficient water absorption and nutritional imbalance (Starck et al., 1994; Chung et al., 2010).

Screening for heat tolerance in tomato has been carried out in a limited number of cultivars, uncovering just a few "thermotolerant" genotypes (Dane et al., 1991; Opena et al., 1992; Abdul-Baki and Stommel, 1995; Grilli et al., 2007; Kugblenu et al., 2013; Xu et al., 2017b; Ruggieri et al., 2019). Most of the screenings were based on the capacity of plants to set fruit at high temperatures. Given the complexity of this trait, which is also affected by many other factors (Charles and Harris, 1972; Abdul-Baki, 1991; Dane et al., 1991; Adams et al., 2001; Alsamir et al., 2019), some authors have also based their screenings on pollen viability (Dane et al., 1991; Paupière et al., 2017; Driedekons et al., 2018), as this trait appears to be the most sensitive to high temperatures, under the hypothesis that dissecting the processes from flower to fruit set is an effective strategy to determine the genetic basis of thermotolerance in tomatoes.

Accessions from wild species of Solanum spp. such as S. pimpinellifolium L., S. pennellii Correll, S. habrochaites S. Knapp and D. M Spooner, S. chmielewskii C. M. Rick et al. and S. cheesmaniae (L. Riley) Fosberg have been found to be tolerant to high temperatures (Alam et al., 2010; Nahar and Ullah, 2011, 2012; Golam et al., 2012; Paupière et al., 2017; Driedekons et al., 2018). However, our understanding of the genetic control of heat tolerance in tomatoes is still very limited. To date, very few reports have addressed this issue, and very few QTLs involved in heat tolerance have been identified in tomatoes (Grilli et al., 2007; Lin et al., 2010; Xu et al., 2017b; Ruggieri et al., 2019; Wen et al., 2019). In general, these studies have been performed using simple mapping populations (F2), with a limited sample size or with low-density marker coverage. As a consequence, the number of reported QTLs involved in heat tolerance is relatively low and the stability of their effects still needs to be verified.

There are many reasons that may explain the lack of information on the genetic control of heat tolerance in tomatoes. The choice of the proper trait to evaluate this tolerance is a crucial matter. Reproductive traits have been used extensively, including flowering traits, pollen viability and/or fruit set (for example, Xu et al., 2017a). On the other hand, the heat injury index and physiological traits have been investigated less frequently (Wen et al., 2019). The relationship between both types of traits has not been studied, although it would be critical for understanding the global response to abiotic stress. Only a tiny fraction of the tomato germplasm has been screened for heat tolerance; more screening efforts would help to identify tolerance sources and to develop mapping populations suitable for investigating the different factors involved in this tolerance. Lastly, the lack of powerful mapping populations designed for the study of heat tolerance, such as recombinant inbred lines (RILs), introgression lines (ILs) or multiparent advanced generation intercross (MAGIC) populations, hampers the identification of QTLs.

In this manuscript, we take advantage of the availability of two populations derived from the cross between S. lycopersicum cv. 'Moneymaker' and S. pimpinellifolium accession TO-937: a set of 168 RILs (Alba et al., 2009) and a set of 52 ILs (Barrantes et al., 2014). Even though TO-937 has not been cataloged as a heat tolerant genotype, transgressive segregants are often obtained when crossing cultivated and exotic or wild genotypes (de Vicente and Tanksley, 1993; Diaz et al., 2014). In fact, transgressive QTLs have previously been identified in the TO-937-x-MoneyMaker RIL and IL populations for other traits, for example QTLs involved in disease resistance, vegetative growth and fruit quality (Alba et al., 2009; Powell et al., 2012; Salinas et al., 2013; Capel et al., 2015, 2017; Barrantes et al., 2016). In the current work, a first evaluation of both populations for reproductive traits in different temperature regimens was conducted, finding a promising segregation of heat tolerance in both of them. Therefore, we decided to study them in further replicated experiments to obtain insights into the genetic control of the heat tolerance segregation found in these populations. The joint analysis of those populations could also be a powerful approach for such complex traits. For example, Rambla et al. (2017) successfully studied the genetic control of fruit volatile composition with the joint analysis of both populations. On the one hand, RIL genetic architecture is appropriate for the study of traits under complex genetic control and for mapping QTLs with a relatively good resolution; on the other hand, IL genetic architecture allows the accurate estimation of the effects of a single QTL in a desired genetic background, facilitating QTL cloning and the integration of QTLs in applied marker-assisted selection breeding.

This report describes the identification of candidate QTLs involved in the capacity to set fruit under different temperature regimens that can be used to improve heat tolerance in tomatoes.

### MATERIALS AND METHODS

### Plant Material and Growing Conditions

Two mapping populations derived from the cross between S. lycopersicum cv. "MoneyMaker" (MM) and the S. pimpinellifolium accession TO-937 were investigated in the current research: a population of 167 Recombinant Inbred Lines (RILs) (Alba et al., 2009) and 56 Introgression Lines (ILs), each one carrying a different introgression from TO-937 in the MM genetic background (Barrantes et al., 2014).

Mapping populations were cultivated in greenhouses under controlled temperature conditions in two facilities: Centro de Experiencias Cajamar, belonging to Fundación Cajamar Comunidad Valenciana (FCCV, Paiporta, Spain), and National Taiwan University (NTW, Taipei, Taiwan).

The 167 RILs were assayed for two consecutive years, 2016 and 2017, at FCCV, with three plants per RIL. The three plants were planted in the same bag; thus, for statistical analysis, the RIL value was the mean of the three plants. Also, 25 replicates of MM and three of TO-937 were assayed in 2016. In 2017, due to limitations of seed availability for TO-937, only eight replicates of MM were assayed. Plants were grown under a stepwise temperature increase (T1: 25°C day/20°C night; T2: 30°C day/25°C night; T3: 35°C day/30°C night) as follows: each temperature regimen was maintained for 4 weeks. For the first 2 weeks, the plants were allowed to flower without any restriction. In the third week, the number of flowers was recorded, and in the fourth week the fruit set was recorded as the number of observed developing fruits. After recording the fruit set, all the flowers and fruits were pruned from the plant, in order to avoid the physiological effects of previous fruit load on the new inflorescences, and the temperature was increased to the next regimen.

In addition, the RIL population was also analyzed in 2018 in greenhouses under controlled conditions at National Taiwan University (NTW). RILs were grown under one of the following temperature conditions: T2 (30°C day/25°C night) or T3 (35°C day/30°C night). The plants were not subjected to a stepwise temperature increase; rather, they were cultivated in different greenhouses for each temperature treatment.

A preliminary analysis of the 56 ILs was carried out in 2016 at FCCV with the same experimental setup as the RILs, with one replicate of three plants per IL and 25 replicates for MM, evenly distributed among the ILs. The 12 most promising ILs (SP\_1-3, SP\_1-4, SP\_2-2, SP\_5-5, SP\_6-3, SP\_7-4, SP\_11-4, SP\_12-1, SP\_12-2, SP\_12-3, SP\_12-4, SP\_12-5) were re-evaluated in 2017 at FCCV using the same temperature regimens. In 2019, the experiment was replicated again; three ILs (SP\_12-1, SP\_12-2 and SP\_12-4) with low FRS in 2017 were replaced with ILs carrying introgressions on chromosomes 2 and 7 (SP\_2-4, SP\_2-5, SP\_7-3), as QTLs had previously been identified in those genomic regions using the RILs. For all these experiments, ILs were grown following a completely randomized design with five and three replicates (with three plants each) of each genotype in 2017 and 2019, respectively, and six replicates of MM with three plants per replicate. Additionally, ILs SP\_7-1, SP\_7-2, SP\_7-3 and SP\_7-4, selected to verify tolerance to tipburn (TB, see below), were included in the 2017 experiment at FCCV with a completely randomized design with four replicates of three plants.

### Phenotyping

In the third week of each temperature treatment, the number of flowers (FLN) and the degree of stigma exsertion (SE, scored as 0: not exserted, 1: slightly exserted and 2: very exserted) were recorded in the second and third truss. The number of fruits set (FRN), the fruit set proportion (FRS = 100 × FRN/FLN) and the incidence of tipburn (TB) on apical tissues, scored as presence/absence, were recorded in the fourth week (**Table 1**).
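As an illustration of the FRS definition given above, the following minimal Python sketch computes the fruit set percentage from the flower and fruit counts of a truss. The function name and the handling of trusses without flowers (returned as missing) are assumptions made for this example; they are not described in the original protocol.

```python
from typing import Optional


def fruit_set_percentage(frn: int, fln: int) -> Optional[float]:
    """Return FRS = 100 * FRN / FLN for one truss.

    frn: number of fruits set (FRN); fln: number of flowers (FLN).
    """
    if frn < 0 or fln < 0:
        raise ValueError("counts must be non-negative")
    if fln == 0:
        # Assumption: FRS is treated as missing when no flowers were counted.
        return None
    return 100.0 * frn / fln


# Hypothetical counts: 8 flowers in the third week, 2 developing fruits
# in the fourth week.
print(fruit_set_percentage(2, 8))  # 25.0
```

Returning a missing value instead of zero keeps trusses that did not flower from being confused with trusses that flowered but failed to set fruit.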

Several traits related to pollen viability were measured for the 167 RILs in 2016 and 2017. For pollen tube germination (TG), recorded in 2016 and 2017, pollen was collected directly from fresh flowers and cultivated for 16 h at room temperature in an 18% sucrose medium. TG was measured as the proportion of germinated grains (scored as 0: no germination, 1: between 1 and 25% germination, 2: between 26 and 75% germination and 3: more than 75% of the pollen grains germinated), counted with a stereomicroscope with epi-illumination. Pollen viability was assessed in 2017 with two techniques, Aniline Blue staining (AB) and flow cytometry, on three flowers from each plant. The aniline blue stain was used to identify and evaluate pollen grains by visualization with a Nikon Eclipse E600 microscope; the grain number was scored as 0: no pollen, 1: up to 25% of the pollen stained, 2: between 26 and 75% stained and 3: more than 75% of the pollen grains stained. The images from the microscope were analyzed with Image J software<sup>1</sup> to calculate the pollen number (AB) (number of grains in 200 µM). Cytometry analysis was performed in the Enza-Zaden España S.L. facilities with an Ampha™ Z32 flow cytometer (Amphasys AG, Switzerland) to measure the percentage of viable pollen (VP).
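The ordinal TG scale described above can be sketched as a simple binning function. This is only an illustration: the function name is hypothetical, and the treatment of values falling between the stated classes (for example, between 25 and 26%) as continuous cut-offs is an assumption, since the text defines the classes on whole percentages.

```python
def germination_score(percent_germinated: float) -> int:
    """Map a germination percentage to the 0-3 ordinal TG score."""
    if not 0 <= percent_germinated <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if percent_germinated == 0:
        return 0  # no germination
    if percent_germinated <= 25:
        return 1  # between 1 and 25% germination
    if percent_germinated <= 75:
        # Assumption: values above 25% and up to 75% fall in class 2.
        return 2
    return 3  # more than 75% of the pollen grains germinated


# Hypothetical percentages observed under the stereomicroscope.
for value in (0, 10, 50, 90):
    print(value, germination_score(value))
```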

### Statistical Analysis

The basic statistics (mean, standard deviation, maximum and minimum values), trait distributions and Pearson correlations were calculated among traits, years and treatments. IL means were contrasted with the recurrent control MM mean with a Dunnett's test (p < 0.05) in the 2017 and 2019 experiments.

<sup>1</sup>https://imagej.nih.gov/ij/



The analyses were implemented with JMP (2019) software (version 12.1.0).
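For readers who wish to reproduce the correlation step outside JMP, the Pearson correlation between the line means of two experiments can be sketched in plain Python as follows. This is not the authors' implementation; the function name and the example values are hypothetical.

```python
from math import sqrt
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two paired samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / sqrt(sxx * syy)


# Hypothetical flower-number means of the same four lines in two years.
fln_year_a = [6.0, 8.5, 7.0, 10.0]
fln_year_b = [5.5, 9.0, 7.5, 9.5]
print(round(pearson_r(fln_year_a, fln_year_b), 3))
```

Lines with missing values in either experiment would need to be dropped pairwise before calling the function.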

### QTL Analysis

The map used for the QTL analysis was previously generated and contained 4,932 Single Nucleotide Polymorphism (SNP) markers from the 8K SNP SOLCAP Infinium chip (Sim et al., 2012). The map was condensed to 1,279 SNPs with QTL IciMapping (Meng et al., 2015) to facilitate the computational analysis. Multi-environment QTL analysis was performed with IciMapping. The LOD threshold for a significance level of p < 0.05 was obtained by a permutation test with 1,000 resamplings. Additionally, composite interval mapping (CIM; Zeng, 1994) was performed for each independent experiment with Windows QTL Cartographer 2.5<sup>2</sup> (Wang et al., 2007). Multi-environment QTLs were named with an abbreviation of the trait, followed by the chromosome number, the number of the QTL within the chromosome, the temperature regimen (T1, T2 or T3) and the suffix \_2E (indicating two environments). QTLs for single experiments were named accordingly, with a suffix indicating the experiment year (e.g., \_16 and \_17). As TB was scored as presence/absence, a χ<sup>2</sup> test was performed for each marker.
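The QTL naming convention described above can be made explicit with a small parser. The sketch below is illustrative only: the `QtlName` type, the `parse_qtl_name` function and the regular expression are hypothetical helpers, and the expansion of a two-digit suffix into a four-digit year (for example, `16` into `2016`) is an assumption.

```python
import re
from typing import NamedTuple


class QtlName(NamedTuple):
    trait: str        # trait abbreviation, e.g. "FLN"
    chromosome: int   # chromosome number
    index: int        # number of the QTL within the chromosome
    regimen: str      # temperature regimen: "T1", "T2" or "T3"
    source: str       # "multi-environment" or the experiment year


_QTL_PATTERN = re.compile(
    r"^(?P<trait>[a-z]+)(?P<chrom>\d+)\.(?P<index>\d+)"
    r"_(?P<regimen>T[123])_(?P<suffix>2E|\d{2})$"
)


def parse_qtl_name(name: str) -> QtlName:
    """Split a QTL name such as 'fln2.1_T1_2E' into its components."""
    match = _QTL_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized QTL name: {name!r}")
    suffix = match["suffix"]
    # Assumption: a two-digit suffix denotes a year in the 2000s.
    source = "multi-environment" if suffix == "2E" else f"20{suffix}"
    return QtlName(
        trait=match["trait"].upper(),
        chromosome=int(match["chrom"]),
        index=int(match["index"]),
        regimen=match["regimen"],
        source=source,
    )


print(parse_qtl_name("fln2.1_T1_2E"))
print(parse_qtl_name("frn12.1_T3_18"))
```

Names that do not follow the full pattern, such as one lacking the final suffix, are rejected with a `ValueError` rather than being silently completed.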

### RESULTS

### Phenotypic Variation for Reproductive Traits at Different Temperatures in the RIL Population

**Table 2** shows the means of the reproductive traits for the parental genotypes TO-937 and MM in the 2016 and 2017 experiments (TO-937 could not be assayed in 2017). MM maintained flower production in the three temperature regimens, but fruit set decreased drastically (between 75 and 85%) at T2 and T3 in both years as a consequence of the heat stress. TO-937 did not set fruit at T2 and T3, or flower at T3, showing even more sensitivity to heat stress.

The MM-x-TO-937 RIL population was evaluated for reproductive traits at the different temperature regimens in a preliminary experiment in 2016. Transgressive segregation was observed at T2 and T3, with a number of RILs showing a high fruit set at high temperatures, suggesting the presence of genetic variability for heat tolerance in the current RIL population (**Figure 1** and **Supplementary Table S1**). Therefore, RILs were evaluated in two additional experiments (2017, 2018). The distributions of the reproductive traits FLN, FRN and FRS among different experiments and temperature regimens are shown in **Figure 1** and **Supplementary Table S1**. At T1 (optimal temperature), trait distributions generally fitted a normal distribution. The ranges of the distributions were wide, and transgressive segregants were observed for all traits in both directions (very low and very high values). However, at T2 the shape of the distributions changed significantly (**Figure 1**), skewing toward lower values. FRN and FRS dropped more drastically than FLN, although transgressive segregants toward higher values were obtained. At T3, the skew toward low values was even more dramatic, with an important proportion of RILs displaying very low values, although transgressive segregation toward higher values was also observed. In the case of FRN and FRS, the median corresponded to values equal or very close to 0. The effect of the temperature increase was even more dramatic in the 2018 experiment performed at NTW. The different facilities used at NTW (a medium-sized greenhouse compared with a large greenhouse at the FCCV facilities), external environmental effects (higher humidity at NTW) and the different method used to impose the temperature regimen (stepwise at FCCV vs. direct at NTW) may explain the differences in trait distribution. Nevertheless, these results showed that the reduced ability to produce fruit at high temperature was mostly due to the reduction in the plant's ability to set fruits rather than to the reduction in the number of flowers. The broad range of the distributions and the observation of transgressive segregants for heat tolerance in different experiments demonstrated the presence of genetic variability for heat tolerance in this population, highlighting that several RILs were capable of setting fruit even at extremely high temperatures.

The consistency and robustness of the results were tested by correlation analysis for each trait-temperature combination over the different experiments. Correlations among years for FLN were positive and highly significant at every temperature regimen (**Table 3**), although correlations were more robust between the 2016 and 2017 experiments than with the 2018 experiment. FRN was also highly correlated between the 2016 and 2017 experiments at the three temperature treatments. Correlations of FRN and FRS between both 2016 and 2017 and 2018 were not significant, indicating that non-controlled factors affected that experiment, as discussed above. Regarding the correlations between traits within temperature regimens, high correlations were observed between FLN and FRN in the 2016 and 2017 experiments at all temperatures. Correlations between FRN and FRS were also generally significant and high in all experiments and temperatures. On the other hand, FLN and FRS showed variable correlations, but these were mainly non-significant or negative.

<sup>2</sup>http://statgen.ncsu.edu/qtlcart/WQTLCart.htm

TABLE 2 | Means and standard deviation for reproductive traits flower number (FLN), fruit number (FRN) and fruit set proportion (FRS) among the parent genotypes TO-937 and MoneyMaker in the three temperature regimens (T1: 25°C day/20°C night; T2: 30°C day/25°C night; T3: 35°C day/30°C night) from the 2016 and 2017 recombinant inbred line experiments.

nd: no data.

### Identification of QTL Involved in Reproductive Traits in the RIL Population

Due to the low correlation of the 2018 experiment results with the 2016 and 2017 data (**Table 3**), a multi-environment QTL analysis was performed only with the 2016 and 2017 data. A total of 22 QTLs were detected across traits and temperature regimens (**Table 4**). QTL analysis performed independently in each experiment showed a total of 46 QTLs, 10 of them from the 2018 experiment (**Supplementary Table S2**); 34 of them were detected in two experiments or in at least two temperature conditions. QTLs detected in just one experiment or temperature regimen were not considered reliable, so they were not taken into account for further discussion. Most of the QTLs detected by multi-environment analysis were also detected by single environment analysis (**Table 4** and **Supplementary Table S2**).

A total of five QTLs were detected for FLN by multi-environment analysis across temperatures (**Table 4**). Among them, the QTL with the most stable effects among temperatures and years was located on chromosome 2 at the 105 cM position (fln2.1\_T1\_2E, fln2.1\_T2\_2E). The phenotypic variance explained by this QTL ranged between 8 and 10%, with low QTL-x-environment (QTL-x-E) interaction and with the TO-937 allele increasing FLN. Other QTLs detected in two temperature regimens mapped onto chromosomes 4 and 6 (**Table 4**). Additionally, a QTL was detected in the 2018 experiment on chromosome 11 at T3 (fln11.1\_T3\_18, **Supplementary Table S2**), which could correspond to fln11.2\_T2 (**Table 4**), although they were detected at different temperature regimens. Interestingly, a number of QTLs on chromosomes 2, 4, 6, and 11 showed effects at different temperatures, with MM or TO-937 alleles, depending on the QTL, increasing FLN.

Eight multi-environment QTLs were identified for FRN (**Table 4**). QTLs on chromosomes 4, 6, and 7 were detected at T2, with low QTL-x-E interaction and the TO-937 allele increasing FRN for frn7.1\_T2\_2E, whereas the MM allele increased the trait for the other two (frn6.2\_T1\_2E and frn6.2\_T2\_2E). These QTLs were also detected by single environment analysis (**Supplementary Table S2**). QTLs on chromosome 2 were detected in the three temperature conditions (frn2.1\_T1\_16, frn2.1\_T1\_17, frn2.1\_T2\_16, frn2.1\_T2\_17, frn2.2\_T3\_17, **Supplementary Table S2**) and located in the same region as frn2.2\_T1\_2E (100–130 cM, **Table 4**). QTLs on chromosome 12 (frn12.1\_T2\_18 and frn12.1\_T3\_18) were also detected in the 2018 experiment (**Supplementary Table S2**).

Two multi-environment QTLs for FRS on chromosomes 6 and 7 were mapped at T2, overlapping with FRN QTLs in the same chromosome regions (**Table 4**). As for FRN, the MM allele of frs6.1\_T2\_2E increased FRS, whereas the TO-937 allele increased the trait for frs7.1\_T2\_2E. frs6.1\_T2\_2E displayed low QTL-x-E interaction, whereas the interaction was more important for frs7.1\_T2\_2E. Additionally, QTLs on chromosome 12 (frs12.1\_T2\_18 and frs12.1\_T3\_18) were detected in the 2018 experiment in the two high temperature regimens, with TO-937 alleles increasing FRS (**Supplementary Table S2**). Overall, the percentage of variance explained by the QTLs detected by multi-environment analysis, or in two temperatures/experiments by single environment analysis, was relatively modest (5–12%, **Table 4** and **Supplementary Table S2**).

One multi-environment QTL for SE was detected on chromosome 2 at both T1 and T2 (**Table 4**). It explained between 9.5 and 15% of the phenotypic variance and showed low QTL-x-E interaction. QTLs in the same genomic region were also detected in the 2018 experiment, with the TO-937 alleles increasing SE (**Supplementary Table S2**).

# Analysis of Reproductive Traits in Introgression Lines

The complete set of 56 ILs was analyzed in the 2016 experiment. At T1 and T2, the FLN means were similar in MM and the IL population, whereas FLN clearly decreased at T3 (**Supplementary Table S3** and **Figure S1**). In the case of FRN and FRS, a drastic decrease was already observed at T2 (**Supplementary Table S3** and **Figure S1**). Mean differences between ILs and MM were minor for FLN at T1 and T2, and no IL exceeded MM at T3 (**Supplementary Table S3**). Regarding FRN, ILs with higher FRN than MM were observed at T2 and T3 (**Supplementary Table S3**), suggesting that some ILs may carry heat tolerance genes. Lastly, few ILs displayed higher FRS than MM at T1, 19 ILs at T2 and 12 ILs at T3 (**Figure 2**). Interestingly, ILs with introgressions on chromosomes 1, 2, and 12 showed high FRS, overlapping with some of the reproductive trait QTLs described above in the RIL population.

TABLE 3 | Linear correlations among the three temperature treatments (T1: 25°C day/20°C night; T2: 30°C day/25°C night; T3: 35°C day/30°C night) and experiments (2016, 2017, and 2018) for flower number (FLN), fruit number (FRN) and fruit set proportion (FRS) in the MM-x-TO-937 recombinant inbred population. Correlations among years are highlighted in yellow.

Correlations among traits within experiment are highlighted in green (2016), blue (2017) and orange (2018) at \*p < 0.05 and \*\*p < 0.01.

In order to verify the heat tolerance of candidate ILs, an assay with five replicates per IL was carried out in 2017. The twelve ILs with the best performance under high temperatures in the previous year's assay were analyzed at the same three temperature regimens. No significant differences were found between MM and ILs for FLN at any of the three temperatures (**Supplementary Table S3**). The same result was found for FRN and FRS at T2 (**Supplementary Table S3**). At T3, SP\_12-5 displayed higher FRN and FRS than MM, whereas SP\_1-4 showed higher FRS (**Supplementary Table S3**). In 2019, a similar experiment was implemented with 9 of the 12 previously selected ILs and three additional ones selected for carrying an introgression in the same position as a QTL detected in the RIL population. SP\_1-4 and SP\_12-2 showed higher FRN and FRS than MM at T3 (**Supplementary Table S3**). SP\_1-4 also showed a high FRS at T3 in all three experiments. Thus, the QTLs on chromosome 12, frn12.1\_T3\_18 and frs12.1\_T3\_18, were confirmed with SP\_12-2. On the other hand, QTLs for FRN on chromosome 1 were detected in the RILs at T1 (frn1.1\_T1\_2E) and T2 (frn1.1\_T2\_16), but not at T3, nor for FRS, whereas SP\_1-4 showed effects on FRN and FRS at T3.

### Pollen Viability

Pollen viability was determined by three different approaches: pollen tube germination (TG), aniline blue staining (AB) and cytometry (VP) for the 2017 experiment (at all three temperatures), whereas in 2016 only TG was analyzed at T2 and T3. The distribution of the traits showed a decrease in all pollen viability traits as temperature increased, mostly at T3 (**Figure 3**). A low correlation was found for TG between 2016 and 2017 (**Supplementary Table S4**). On the other hand, correlations between AB and VP were positive and highly significant at all three temperature regimens (**Supplementary Table S4**). However, TG did not show a significant correlation with the previous traits. Regarding the correlation between pollen and the other reproductive traits, no significant correlations were found between pollen viability traits and FLN, FRN or FRS (**Supplementary Table S4**).

TABLE 4 | QTLs detected for the reproductive traits flower number (FLN), fruit number (FRN), fruit set proportion (FRS) and stigma exsertion (SE) at different temperature regimens (T1: 25°C day/20°C night; T2: 30°C day/25°C night; T3: 35°C day/30°C night) in two experiments (2016 and 2017) by multi-environment QTL analysis with IciMapping.

QTLs are named with the trait abbreviation, followed by the chromosome number, the number of the QTL within the chromosome, the temperature regimen and the suffix \_2E. The marker closest to the maximum LOD score and its genetic position are also shown. The statistical estimates for each QTL include: maximum LOD score for genetic effects (LOD), LOD score of additive effects [LOD (A)], LOD score for the interaction of additive effects with environment [LOD (AbyE)], percentage of phenotypic variance explained by the QTL (PVE), by the additive effects (A) and by the additive-by-environment interaction (AbyE), and the additive value (Add), which is negative when TO-937 alleles increased the trait and positive when MM alleles increased it.

In 2016, no QTL was identified for TG. In 2017, 4 QTLs for TG, 2 QTLs for AB and 2 QTLs for VP were identified (**Table 5**). A region at the end of chromosome 3 contained QTLs involved in all three pollen viability traits at T1, with the TO-937 allele increasing viability, which could reflect the presence of a QTL involved in pollen viability at normal temperatures. Interestingly, QTLs for TG and AB were detected at the top of chromosome 7 at T2, which could be a candidate for a pollen viability QTL under mild heat stress. Nevertheless, as these QTLs were studied in only one experiment, their effects should be verified with additional experiments.

### Tipburn Incidence

Tipburn is a physiological disorder that usually occurs as a consequence of heat stress (Jenni et al., 2013) and it is characterized by the necrosis of the youngest leaves and inflorescences at the tip of the apex of the plant, bringing production of new vegetative and reproductive structures to a halt. A segregation of tipburn incidence (TB) was observed initially in the RIL population during the 2016 experiment, and it was also recorded in the 2017 experiment. The incidence of TB depended on the temperature regimen: at optimal temperatures TB was present in less than one-third of the plants, while at T3, over half of the RILs showed TB (**Figure 4**).

Several markers on chromosome 7 showed an association with TB incidence at T1 in the 2016 experiment and at T3 in the 2017 experiment. The marker with the strongest association was located at position 58 cM (**Table 6**), although a large region of chromosome 7 also showed association with TB (**Supplementary Figure S2**), which could indicate the presence of a linked QTL. In order to verify this QTL, the incidence of TB was evaluated in ILs with introgressions in chromosome 7. ILs SP\_7-3 and SP\_7-4 did not show incidence of TB (**Figure 5**). The introgressions carried by these ILs only overlapped in the interval between markers solcap\_snp\_sl\_70992 and solcap\_snp\_sl\_70912, so it is reasonable to suggest that the QTL is within that marker interval. Nevertheless, the possibility of two linked genes located in the non-overlapping regions of the two ILs cannot be ruled out, and further fine mapping would be necessary to discern between these two hypotheses. SNP genotype and genetic data for RILs can be found in **Supplementary Table S5**. RIL and IL phenotypic data for all experiments are included in **Supplementary Tables S6, S7**.
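
The marker/TB association summarized in Table 6 is based on a χ² test on genotype-by-phenotype counts. The following is a minimal sketch of such a test for a 2 × 2 table (marker genotype MM or PP against TB present or absent), written without external libraries. The counts are hypothetical and are NOT the values in Table 6; no continuity correction is applied, which may differ from the procedure actually used by the authors.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table of counts.

    Returns (chi2, p_value) with 1 degree of freedom, no Yates correction.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # For 1 degree of freedom the chi-square survival function
    # reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts (rows: genotype MM, PP; columns: TB present, TB absent).
hypothetical_counts = [[30, 20], [10, 40]]
chi2, p_value = chi_square_2x2(hypothetical_counts)
print(f"chi2 = {chi2:.2f}, p = {p_value:.2g}")
```

With real data, cells with very small expected counts would call for an exact test instead.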

### DISCUSSION

Heat stress affects both vegetative growth and reproduction of plants, and in both cases, the plant response is complex and controlled by multiple genes. The choice of the specific vegetative or reproductive trait for studying heat tolerance will determine the genetic mechanism that can be identified. The genetic mechanism may be common to vegetative and reproductive traits or specific to one of them, i.e., depending on the trait, a different component of the tolerance would be studied. The effects of heat stress on vegetative traits are evident at high temperatures (i.e., 40°C, Wen et al., 2019), whereas reproductive traits are already affected by mild heat stress when minimum temperatures are above 25°C (Xu et al., 2017b). Currently, night temperatures over 25°C are not unusual during summer in regions where tomatoes are cultivated, such as the Mediterranean basin, so we decided to focus our research on identifying sources of heat tolerance in reproductive traits when temperatures increase above the minimum of 25°C.

TABLE 5 | QTLs detected for the pollen viability traits pollen tube growth (TG), aniline blue staining (AB) and viable pollen by cytometry (VP) at different temperature regimens (T1: 25°C day/20°C night; T2: 30°C day/25°C night; T3: 35°C day/30°C night) in the 2017 experiment with the Moneymaker-x-TO-937 recombinant inbred lines.

QTLs are named with the trait abbreviation, followed by the chromosome number, the number of QTLs within the chromosome and the temperature regimen. The additive value (A) is negative when TO-937 alleles increase the trait and positive when "Moneymaker" alleles increase it. QTLs detected in at least two temperatures or experiments are highlighted in bold.

TABLE 6 | Incidence of Tipburn (TB) among recombinant inbred lines (RILs) in the 2016 experiment at T1 (25°C day/20°C night) (16T1) and the 2017 experiment at T3 (35°C day/30°C night) (T3) and association with marker solcap\_snp\_sl\_53335 (located in position 58 cM of chromosome 7).

Genotype indicates homozygous "Moneymaker" (MM) and homozygous TO-937 (PP) for the marker, followed by the total number of RILs with the genotypes and number of RILs within each genotypic class displaying or not displaying TB. In the lower part of the table, the significance of the χ² test for marker/TB association is shown.

indicates the statistically significant mean differences at p < 0.05, means with the different letters indicates significant mean differences.

Reproductive traits can be measured in different ways, including the number of flowers, fruits per inflorescence and pollen viability. The last one has been proposed as an adequate indicator of heat tolerance due to its correlation with fruit set and a likely simpler genetic control mechanism (Xu et al., 2017b; Driedekons et al., 2018). In the current report, we have evaluated pollen viability using three different methods: TG, AB and VP. AB and VP showed strong correlations at all temperatures, but correlations with TG were not significant. On the other hand, QTLs involved in TG, AB and VP were detected in the same region of chromosome 3 at T1 with the TO-937 allele increasing pollen viability, although the effects of the QTLs were relatively modest (explaining less than 10% of phenotypic variance). In this same region of chromosome 3, a MetaQTL for pollen viability was detected from the joint analysis of four different experiments (Ayenan et al., 2019). Similarly, QTLs in the same region of chromosome 7 were found for TG and AB at T2, also with modest effects (also less than 10% of phenotypic variance). These results suggest that, even though TG is in general controlled by a different mechanism than AB and VP, some common mechanisms may exist. Xu et al. (2017b) detected a QTL for pollen viability on chromosome 11, but we did not find QTLs for any of the assayed pollen viability traits on that chromosome, which can be explained by the different germplasms used in the two reports. We also did not find a correlation between pollen viability traits and fruit set, although the lack of biological replications in the RIL population might limit conclusions from this observation. Xu et al. (2017a) found a significant correlation at high temperature, but this was not significant at the control temperature, suggesting that the correlation may be dependent on the experimental conditions. 
Given the low correlation between pollen viability traits and the modest effects of the QTLs detected in the current report, the genetic variability for pollen viability traits in the current RIL population seems to be low and likely not sufficient for accurately studying their genetic control.

The high and positive correlation observed in the reproductive traits FLN, FRN, and FRS between the 2016 and 2017 experiments indicated a significant heritability at all the assayed temperatures. The low correlation observed with the 2018 experiment may be due to the environmental differences between NTW and FCCV greenhouse facilities and/or developmental stages of the plants when they were subjected to high temperatures in the NTW facilities (Charles and Harris, 1972). The limited number of biological replicates does not allow us to draw robust conclusions from these observations. Nevertheless, several QTLs and ILs displayed consistent effects at different temperature regimens and across years, indicating that the current mapping populations harbored sufficient genetic variability for studying the genetic control of these reproductive traits. Xu et al. (2017a) did not find a correlation between pollen viability and female fertility, so it is likely that the fruit set variability in the current experiment could be related to female fertility instead of pollen viability, which would explain the lack of correlation between pollen viability and fruit set.

Therefore, the discussion will be focused on FLN, FRN, and FRS. At mild heat stress (T2), FLN showed only a slight reduction, which became more drastic at T3. On the other hand, detrimental effects of increased temperature on FRN and FRS were already apparent under mild heat stress. Thus, even though MM and TO-937 showed a drastic reduction of FRN and FRS at T2, and an even more drastic one at T3, transgressive segregation was observed in the RIL population. The transgressive phenotypes appeared due to allelic combinations from the different parents, with a portion of RILs setting fruit at both high temperatures. Transgressive segregation is commonly reported when crossing exotic germplasms with cultivated tomatoes (de Vicente and Tanksley, 1993; Monforte et al., 2001; Shivaprasad et al., 2012; Capel et al., 2015). The fact that transgressive segregation for heat tolerance in reproductive traits has been found in a recombinant population derived from non-heat-tolerant parents reinforces the power of interspecific crosses for uncovering hidden genetic diversity. In some cases both TO-937 and MM alleles are able to increase reproductive traits at a high temperature in different QTLs. Xu et al. (2017a), in a cross between the heat-tolerant cultivar "Nagcarlang" and heat-susceptible NCHS-1, and Wen et al. (2019), in a cross between the heat-tolerant LA2093 (S. pimpinellifolium) and heat-susceptible LA1698 (cultivated tomato), also found that alleles from both parents were associated with tolerance. Thus, the tolerance determined in parents does not seem to be a perfect indicator of the genetic potential of the alleles carried by them, which increases the potential of the tomato germplasm for developing new heat tolerant cultivars by generating new genetic combinations.
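
Transgressive segregation, as discussed above, refers to progeny whose trait values fall outside the range spanned by the two parents. A minimal sketch of how such lines could be flagged is given below. The line names and fruit set values are hypothetical, and a simple threshold comparison is assumed here; with replicated data a statistical test against the parents would be needed before calling a line transgressive.

```python
def transgressive_lines(ril_values, parent_values):
    """Return (above, below): names of lines outside the parental range.

    ril_values: dict mapping line name -> trait value (e.g., FRS at T3).
    parent_values: iterable with the trait values of the two parents.
    """
    low, high = min(parent_values), max(parent_values)
    above = sorted(name for name, value in ril_values.items() if value > high)
    below = sorted(name for name, value in ril_values.items() if value < low)
    return above, below


# Hypothetical fruit set proportions at T3 (not data from this study).
parents_frs = {"MM": 0.05, "TO-937": 0.10}
rils_frs = {"RIL_A": 0.00, "RIL_B": 0.08, "RIL_C": 0.35, "RIL_D": 0.22}

above, below = transgressive_lines(rils_frs, parents_frs.values())
print("above both parents:", above)
print("below both parents:", below)
```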

The number of QTLs detected at each temperature regimen/experiment combination was relatively low. A larger number of biological replicates would likely help to detect more QTLs. This low QTL detection was also observed in the previous works by Xu et al. (2017a), Ruggieri et al. (2019) and Wen et al. (2019). In fact, the meta-QTL analysis carried out by Ayenan et al. (2019) found only one meta-QTL for the number of flowers per inflorescence, on chromosome 1. The low QTL detection probably reflects the genetic complexity of the studied heat tolerance traits, resulting in only a fraction of the genes involved being detected by using common QTL mapping approaches. In the current report, the QTLs displaying more consistent effects across temperatures and experiments were involved in FLN and FRN. Among them, the QTL with the most consistent effect is located on the distal region of chromosome 2. Interestingly, the association of fruit numbers at high temperature with markers located on chromosome 2 was also reported by Ruggieri et al. (2019). Another region which can be highlighted is the FLN QTL on chromosome 11, which may correspond to the QTL qFP11 involved in flowers per inflorescence as reported by Xu et al. (2017a). While it is true that the number of QTL experiments dissecting the genetic control of heat tolerance in tomatoes is still very limited, the identification of similar genomic regions involved in heat tolerance in this and previous reports is encouraging and suggests that these approaches are on the right track.

The analysis of the IL population was initially intended to verify the QTLs detected in the RIL population, but also to identify new ones. Our results show that there was not a very high correspondence between the QTLs identified in the two populations. The lack of verification may be due to the complex genetic control of the traits, i.e., the effect of a QTL may greatly depend on the genetic background, so specific multi-locus combinations would be necessary to express the heat tolerance. These multi-locus combinations may occur in a number of RILs, but they are lost in the ILs. Nevertheless, some interesting results were found among the ILs. For instance, SP\_12-2 showed higher FRN and FRS at T3 than MM in the 2019 experiment, which could indicate the effects of the QTLs frn12.1\_T3\_18 and frs12.1\_T3\_18 detected in the RIL population. SP\_1-4 displayed higher FRN and FRS than MM at T3, whereas the QTLs detected in this region with the RILs were involved only in FRN at T1 and T2 (frn11.1\_T1\_16, frn1.1\_T2\_16). It is likely that the QTL for FRN has stronger effects in the RILs at T1 and T2, whereas in the ILs the effects become more evident at T3 due to the differences in interactions with the genetic background as a result of the different genetic structure of the two populations. Nevertheless, we cannot rule out that those are different QTLs. The changes in the genetic background during IL development can also induce new phenotypes that were not observed previously in early segregating populations. Unexpected phenotypes are commonly reported in IL populations, such as the increased intensity of the internal red color in tomatoes observed in a green-fruited S. habrochaites IL (Monforte et al., 2001), increased fruit weight from S. pimpinellifolium (Barrantes et al., 2016), fruits of a melon IL showing climacteric ripening from non-climacteric parents (Vegas et al., 2013), or production of round fruits from parents that produce oval or elongated fruits (Diaz et al., 2014). As each IL carries a single introgression in the otherwise MM genetic background, the genetic complexity of the traits is reduced to a single locus, which will facilitate their future use in breeding programs and, at the same time, the identification of the causal genes of the heat tolerance.

In addition to the reproductive traits, heat stress affects other factors such as stigma exsertion, increasing the protrusion of the style out of the anther. QTLs with strong effects on SE across experiments and temperature regimens were located on chromosome 2 in the same region as se2.1, which was map-based cloned by Chen and Tanksley (2004). Xu et al. (2017a) also detected SE QTLs in that region under heat stress, supporting the finding that se2.1 control of stigma exsertion is mostly independent of temperature conditions. The non-significant correlation between SE and FRS also reinforces the idea that SE is not an appropriate trait for evaluating the tolerance to high temperature in tomatoes (Lohar and Peat, 1997).

Lastly, TB has been studied thoroughly in leafy vegetables (such as lettuce), as the presence of this physiological disorder has a direct impact on the market value. In the case of tomato, the incidence of TB affects young developing tips and inflorescences, dramatically reducing tomato yield. We report a consistent QTL for TB on chromosome 7, successfully validated in ILs SP\_7-3 and SP\_7-4. According to the RIL experiment, the QTL would map in the central region of the chromosome (around 58 cM), whereas the IL analysis suggested that the QTL would map on the distal region (around 95 cM), assuming that the QTL is located in the region where the SP\_7-3 and SP\_7-4 introgressions overlap. A closer look at the association of markers with TB incidence across chromosome 7 showed that markers located in the region between 50 and 100 cM were also associated with TB tolerance (**Supplementary Figure S2**), which may be interpreted as the presence of multiple linked QTLs. Therefore, we cannot determine whether SP\_7-3 and SP\_7-4 tolerance to TB is due to the same QTL or to different linked QTLs on chromosome 7. In any case, both ILs showed a very mild, negligible incidence of TB, whereas other ILs and MM were severely affected. This result suggested that TB can be prevented with just one locus (i.e., even though the ILs may carry different QTLs, the presence of only one of them is sufficient for tolerance). To the best of our knowledge, this is the first report of the mapping of TB resistance in tomatoes. Major QTLs for TB have also been recently reported in lettuce (Jenni et al., 2013; Macias Gonzalez et al., 2019); however, no candidate gene has been suggested yet. The precise location of the QTL or QTLs on chromosome 7 will allow an assessment of whether the genetic control of TB resistance could be similar in both species, as soon as candidate genes can be defined in subsequent studies.
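
The reasoning used to place the TB QTL in the region shared by the SP\_7-3 and SP\_7-4 introgressions amounts to an interval intersection on the genetic map. The sketch below illustrates this with hypothetical introgression boundaries in cM; the actual boundaries are defined by the markers reported in the text and are not reproduced here. The helper `overlap` is an illustrative function, not part of any mapping software.

```python
from typing import Optional, Tuple

Interval = Tuple[float, float]  # (start_cM, end_cM) on one chromosome


def overlap(a: Interval, b: Interval) -> Optional[Interval]:
    """Return the interval shared by two introgressions, or None if disjoint."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return (start, end) if start <= end else None


# Hypothetical boundaries on chromosome 7, chosen only for illustration.
sp_7_3: Interval = (70.0, 98.0)
sp_7_4: Interval = (92.0, 110.0)

shared = overlap(sp_7_3, sp_7_4)
print("shared region (cM):", shared)

# Under the single-QTL hypothesis the QTL lies in `shared`; under the
# alternative of two linked QTLs, each would lie in the part of an
# introgression that is outside the shared region.
if shared is not None:
    print("SP_7-3 only (cM):", (sp_7_3[0], shared[0]))
    print("SP_7-4 only (cM):", (shared[1], sp_7_4[1]))
```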

### CONCLUSIONS

The genetic control of heat stress tolerance in tomatoes is largely unknown. In our experimental design, no true biological replicates were studied in the RIL population, which limited some of the conclusions about the general genetic control of heat tolerance in the RIL population. Nevertheless, as geneticists, our interest was focused on the cosegregation of markers and traits to identify genomic regions associated with the tolerance. We identified new sources of variability for heat tolerance and dissected the tolerance to heat stress for reproductive traits using previously developed mapping populations. The strategy was successful, as heat tolerance associated RILs, ILs and markers were identified. The heat tolerance was mainly associated with female fertility rather than pollen viability, although some QTLs involved in pollen viability were also found. Some ILs showed tolerance in several replicated experiments, making them a suitable choice for increasing the heat tolerance of tomato cultivars or for defining candidate or causal genes. Some QTLs detected in the RIL population could not be verified with the ILs, indicating that the effects of these QTLs are dependent on the genetic background. For those cases, a single QTL selection approach (such as the IL strategy) would not be the right approach and other strategies such as genomic selection may be more appropriate (Voss-Fels et al., 2019) for managing them in breeding programs. The study of two mapping populations with different genetic structures has allowed us to obtain some insights into different heat tolerance mechanisms. Also, for the first time, a locus for TB tolerance has been mapped on chromosome 7, and its transfer to applied or basic research will be straightforward. In summary, we report a large catalog of QTLs involved in tomato reproductive traits at different temperatures. Most of these QTLs are not involved in pollen viability traits, but they increase the fruit set at high temperatures. Therefore, it would be expected that combining them with pollen viability QTLs from other sources could speed up the development of heat tolerant tomato cultivars. 
Nevertheless, fruit set is only one of the components of global heat tolerance. Research on other traits such as vegetative development, fruit quality and postharvest behavior at high temperatures would complement the current study to design a holistic strategy to develop heat tolerant cultivars.

### REFERENCES


### DATA AVAILABILITY STATEMENT

SNP genotype data for RILs are provided in **Supplementary Table S5**. RIL and IL phenotypic data for all experiments are included in **Supplementary Tables S6** and **S7**. All data are also deposited in Zenodo (doi: 10.5281/zenodo.3567014).

### AUTHOR CONTRIBUTIONS

MG carried out the experiments, analyzed the data, and drafted the manuscript. Y-CL, K-YC, DG, and TM carried out experiments. IN and CB supervised greenhouse experiments. AG and AM designed the study and wrote the manuscript.

### FUNDING

Sara Gimeno was supported by the program "Youth Employment Initiative" from the European Union and the Spanish Ministry of Economy and Competitiveness. This work was supported by the European Commission H2020 research and innovation program through the TOMGEM project agreement No. 679796.

### ACKNOWLEDGMENTS

We thank Soledad Casal and Sara Gimeno, and all staff from FCCV for technical support.

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2020.00326/full#supplementary-material


fruit cracking in a tomato RIL Solanum lycopersicum × S. pimpinellifolium population. Theor. Appl. Genet. 130:213. doi: 10.1007/s00122-016-2809-9



**Conflict of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The reviewer GR declared a past co-authorship with several of the authors AM and MG to the handling Editor.

Copyright © 2020 Gonzalo, Li, Chen, Gil, Montoro, Nájera, Baixauli, Granell and Monforte. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Field Resistance to Phakopsora pachyrhizi and Colletotrichum truncatum of Transgenic Soybean Expressing the NmDef02 Plant Defensin Gene

Natacha Soto<sup>1</sup>\*, Yuniet Hernández<sup>1</sup>, Celia Delgado<sup>1</sup>, Yamilka Rosabal<sup>1</sup>, Rodobaldo Ortiz<sup>2</sup>, Laura Valencia<sup>1</sup>, Orlando Borrás-Hidalgo<sup>1,3</sup>, Merardo Pujol<sup>1</sup> and Gil A. Enríquez<sup>1</sup>

<sup>1</sup> Soybean Biotechnology Laboratory, Plant Biotechnology Department, Center for Genetic Engineering and Biotechnology, Havana, Cuba, <sup>2</sup> National Institute of Agricultural Sciences, San José de las Lajas, Cuba, <sup>3</sup> Shandong Provincial Key Laboratory of Microbial Engineering, School of Biotechnology, Qilu University of Technology, Jinan, China

### Edited by:

Domenico De Martinis, Italian National Agency for New Technologies, Energy and Sustainable Economic Development (ENEA), Italy

### Reviewed by:

Josefina Leon-Felix, Centro de Investigación en Alimentación y Desarrollo (CIAD), Mexico

Marco Loehrer, RWTH Aachen University, Germany

\*Correspondence: Natacha Soto, natacha.soto@cigb.edu.cu

### Specialty section:

This article was submitted to Crop and Product Physiology, a section of the journal Frontiers in Plant Science

Received: 19 November 2019; Accepted: 15 April 2020; Published: 26 May 2020

### Citation:

Soto N, Hernández Y, Delgado C, Rosabal Y, Ortiz R, Valencia L, Borrás-Hidalgo O, Pujol M and Enríquez GA (2020) Field Resistance to Phakopsora pachyrhizi and Colletotrichum truncatum of Transgenic Soybean Expressing the NmDef02 Plant Defensin Gene. Front. Plant Sci. 11:562. doi: 10.3389/fpls.2020.00562

Fungal diseases lead to significant losses in soybean yields and a decline in seed quality; such is the case of the Asian soybean rust and anthracnose caused by Phakopsora pachyrhizi and Colletotrichum truncatum, respectively. Currently, the development of transgenic plants carrying antifungal defensins offers an alternative for plant protection against pathogens. This paper shows the production of transgenic soybean plants expressing the NmDef02 defensin gene using the biolistic delivery system, in an attempt to improve resistance against diseases and reduce the need for chemicals. Transgenic lines were assessed in field conditions under the natural infections of P. pachyrhizi and C. truncatum. The constitutive expression of the NmDef02 gene in transgenic soybean plants was shown to enhance resistance against these important plant pathogens. The quantification of the P. pachyrhizi biomass in infected soybean leaves revealed significant differences between transgenic lines and the non-transgenic control. In certain transgenic lines there was a strong reduction of fungal biomass, revealing a less severe disease. Integration and expression of the transgenes were confirmed by PCR, Southern blot, and qRT-PCR, where the Def1 line showed a higher relative expression of defensin. It was also found that the expression of the NmDef02 defensin gene in plants of the Def1 line did not have a negative effect on the nodulation induced by Bradyrhizobium japonicum. 
These results indicate that transgenic soybean plants expressing the NmDef02 defensin gene have a substantially enhanced resistance to economically important diseases, providing a sound environmental approach for decreasing yield losses and lowering the burden of chemicals in agriculture.

Keywords: NmDef02 defensin gene, fungal resistance, Colletotrichum truncatum, Phakopsora pachyrhizi, soybean, Bradyrhizobium japonicum

### INTRODUCTION

Although soybean [Glycine max (L.) Merrill] is one of the most economically important crops worldwide (Hartman et al., 2011; Rosa et al., 2015), its outstanding role in feeding the world through its contribution of both protein meal and vegetable oil is jeopardized by the attack of fungal diseases at all growth stages, producing a considerable reduction in yields (Hartman et al., 2015).

Asian soybean rust caused by Phakopsora pachyrhizi (Sydow & Sydow) is the most destructive disease in soybean, causing early defoliation while affecting the weight and quality of the seeds (Hartman et al., 2005). P. pachyrhizi can reduce yields by over 80% when environmental conditions are favorable for disease development, and it can affect yields with a disease incidence of just 0.05% (Tremblay et al., 2012). It is found in the soybean-producing countries of South America (Yorinori et al., 2005), the United States (Schneider et al., 2005), and Mexico (Cárcamo Rodríguez et al., 2006). Asian soybean rust has also been reported in other Latin American countries (Yorinori et al., 2005) including Cuba (Pérez-Vicente et al., 2010).

Another important disease that affects soybean is anthracnose, caused by Colletotrichum truncatum. The soybean plant can be infected at any stage of development (Yang and Hartman, 2015). The disease is prevalent in tropical and subtropical countries, causing severe effects on grains that lead to seedling losses (Yang and Hartman, 2015; Marmat and Ratnaparkhe, 2017). Furthermore, C. truncatum can systemically infect mature plants, and damage is greater under heavy rains with a high plant population (Yang and Hartman, 2015; Pawlowski and Hartman, 2016).

There are currently no commercially available soybean cultivars with good agronomic characteristics that are resistant to these diseases. Moreover, modern fungicides cannot effectively control these pathogens, and their use increases production costs and has a strong negative impact on the environment (Abdallah et al., 2010; Godoy et al., 2016; Kawashima et al., 2016).

Certain Asian soybean rust resistance genes (Rpp1-Rpp7) have been identified in the soybean genome (Lemos et al., 2011; Yamanaka et al., 2013; King et al., 2017; Childs et al., 2018; Hossain and Yamanaka, 2019). However, these genes only confer pathotype-specific resistance, controlled by the interaction of the R genes in soybean with the virulence genes in pathotypes of P. pachyrhizi (Vittal et al., 2014; Yamanaka et al., 2015; Langenbach et al., 2016). The effectiveness of pathotype-specific resistance genes, whether resistance is complete or incomplete, is usually short-lived, especially when it is evaluated against obligate pathogens such as P. pachyrhizi with a high variability and virulence (Yamanaka et al., 2013, 2016). Nevertheless, Kawashima et al. (2016) identified and cloned a gene (CcRpp1) from Cajanus cajan that confers resistance to different isolates of P. pachyrhizi when expressed in soybean.

To counteract fungal infection, plants have developed innate immune systems that recognize the presence of pathogens and start effective defense responses (Lay and Anderson, 2005; van Loon et al., 2006). Plants produce pathogenesis-related proteins such as defensins (Van Loon, 1997). Defensins are small antimicrobial peptides that play a fundamental role in the innate immunity of plants (Thomma et al., 2002, 2003; van der Weerden et al., 2013; Vriens et al., 2014; Moosa et al., 2018). Their biological activities consist of the inhibition of proteases, blocking of ionic channels, and inhibition of protein synthesis, among others (Graham et al., 2008). Defensins may inhibit the growth of a wide range of microorganisms and phytopathogenic insects; they may also be involved in abiotic stress adaptation (Tavares et al., 2008). This means that defensins not only provide a defense against plant pathogens but may also contribute to adaptation to difficult conditions, a characteristic that makes them even more attractive for modern agriculture. The structure stabilized by disulfide bridges and the cationic charge of defensins make them very stable molecules, which is essential for the development of biotechnological products based on them (Tavares et al., 2008).

Growth inhibition by plant defensins of a wide range of pathogenic fungi is not associated with toxicity in mammalian or plant cells (Thomma et al., 2002). Studies of the biological activity, stability, and range of toxicity of an isolated chickpea defensin (Ca-AFP) revealed that there are no risks in using the gene in the production of transgenic crops (Islam, 2008). However, to date, three defensin-related proteins have been described as allergens (Singh et al., 2006; Petersen et al., 2015). It is therefore essential to assess the allergenicity and toxicity of genetically modified crops carrying defensin before they become a product for human and animal use. Subject to such an assessment, plant defensins can be used to produce transgenic crops with improved resistance to pathogens.

Several genes encoding defensins have been successfully transferred to important plant species such as tobacco (Portieles et al., 2010; Lee et al., 2018), tomato (Abdallah et al., 2010), potato (Gao et al., 2000; Portieles et al., 2010; Kumar and Chakrabarti, 2018), rice (Kanzaki et al., 2002; Jha and Chattoo, 2010), and beans (Espinosa-Huerta et al., 2013), among others, producing resistance to different pathogens. It has been demonstrated that the expression of the NmDef02 defensin gene in tobacco and potato transgenic plants produced a strong resistance against Phytophthora infestans under greenhouse and field conditions (Portieles et al., 2010).

Therefore, our objective in this study was to determine whether transgenic soybean plants expressing the NmDef02 defensin gene are better equipped to overcome infection by P. pachyrhizi and C. truncatum under field conditions. The efficiency of the symbiosis of Bradyrhizobium japonicum with transgenic soybean plants carrying defensin was also evaluated, since the association with this bacterium is essential for atmospheric nitrogen fixation in soybean plants, eliminating the need for chemical nitrogen fertilization. This strategy is in line with the goals of decreasing yield losses, decreasing the use of chemicals, and contributing to the 100% increase in yields that is required for sustaining a world population of nearly 10 billion people in 2050, all of which are challenges acknowledged by Next Generation Agriculture.

# EXPERIMENTAL PROCEDURES

### Plant Material

The soybean [Glycine max (L.)] used was the variety DT-84 from Vietnam. Embryonic axes of mature seeds were used as explants for bombardment-mediated transformation, according to Soto et al. (2017).

### Vector Construction

The pCP4EPSPS-DEF vector carrying the cp4epsps gene and the NmDef02 defensin gene isolated from Nicotiana megalosiphon was the vector system used for the transformation. The cassette with p35S/NmDef02/tnos obtained by Portieles et al. (2010) was cloned into the pCP4EPSPS binary vector (Soto et al., 2017) to generate the pCP4EPSPS-DEF vector (**Figure 1A**). This was done at the soybean biotechnology laboratory at CIGB, Havana.

### Transformation, Selection, and Plant Regeneration

A total of 150 explants were bombarded with the pCP4EPSPS-DEF vector and selected on MSB5 medium with 20 mg/L of glyphosate. The controls were explants derived from cultivar DT-84, which were cultured under the same conditions but without the selective agent. The regenerated shoots were excised and transferred to the same medium without the selective agent for rooting, as described by Soto et al. (2017). Plantlets were transferred to pots containing a mixture of organic material and zeolite (50/50) in an acclimatized greenhouse at 26–27 °C to produce seeds. All seeds collected from each of the R0 generation transgenic lines were germinated under greenhouse conditions to obtain the T1 generation and the following generations.

### Analysis of the Integration of the NmDef02 Gene Using Polymerase Chain Reaction (PCR)

Total genomic DNA was isolated from young leaves of glyphosate-resistant and control plants using the CTAB protocol (Doyle and Doyle, 1987). PCR was used to screen for transformants (T1) carrying the NmDef02 gene. Each reaction was performed in a total volume of 25 µl, and the PCR mixture consisted of 10 mM buffer Go Taq Green 5x, 10 mM dNTP, 20 pmol/µl of each primer, 1 unit of GoTaq DNA polymerase (Promega, United States), and 400 ng of the genomic DNA. The primers used were forward 5′-GCTGGCTTATGCTTCCTCTTCTTG-3′ and reverse 5′-TCACAGACTTGGACGCAGTTCG-3′. The reaction started with an initial denaturing step at 95 °C for 3 min, followed by 30 cycles of the following profile: denaturing at 95 °C for 1 min, annealing at 64 °C for 1 min, and synthesis at 72 °C for 1 min, followed by a final extension at 72 °C for 10 min. The PCR products were loaded onto a 2% agarose electrophoresis gel and visualized using ethidium bromide.

### Relative Expression of the NmDef02 Gene Using qRT-PCR

Total RNA for the qPCR analysis was extracted from frozen leaf tissues of six soybean transgenic lines and non-transgenic plants using Tri-Reagent (Sigma-Aldrich, United States) according to the manufacturer's protocol. The RNA was subsequently treated with DNase I (Promega, United States) at 37 °C for 15 min to remove the remaining genomic DNA. The integrity and yield of RNA were evaluated using agarose gel electrophoresis and a NanoDrop Spectrophotometer (Thermo Scientific), respectively. The cDNA was synthesized from 1 µg of total RNA using an oligo-(dT) primer and the Super-Script III reverse transcriptase kit (Invitrogen, United States) according to the manufacturer's instructions. qPCR reactions were carried out in a final volume of 15 µl containing 0.2 mM of each primer, 10 µl SYBR (QuantiTect SYBR Green PCR kit; Qiagen, Germany), and a 25x dilution of the cDNA. The soybean β-actin gene was selected as the housekeeping gene and used for normalizing the data. The primer sequences to amplify the NmDef02 gene were forward 5′-AAGCTTATGCGTGAGTGCAAGGCTC-3′ and reverse 5′-CTGCAGTTAGCACTCGAATATAC-3′. The primer sequences for the β-actin gene were forward 5′-GTGTCAGCCATACTGTCCCCATTT-3′ and reverse 5′-GTTTCAAGCTCTTGCTCGTAATCA-3′. The amplification conditions included an initial denaturation step at 95 °C for 15 min, followed by 40 cycles of denaturation for 15 s at 95 °C, annealing for 30 s at 60 °C, and extension for 30 s at 72 °C. Quantitative PCR was conducted using a Rotor-Gene 3000 PCR machine (Corbett, Sydney, NSW, Australia). The efficiency of the primers was determined by using serial dilutions of a mixture of different cDNAs (from each sample) with concentrations of 5x, 25x, 125x, and 625x. Further analysis of the dissociation temperature of the PCR products was performed to determine their specificity. The dissociation analysis and the Ct values were used by the Rotor-Gene equipment program (version 6.1) to determine the efficiency of the qPCR reactions.
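The efficiency determination from a dilution series can be sketched as follows. This is a minimal illustration, not the Rotor-Gene software's own routine: it assumes the common approach of regressing Ct on the log10 of the relative template amount and computing efficiency as 10^(−1/slope) − 1. The dilution factors come from the text; the Ct values and the function names are hypothetical.

```python
import math

def linear_fit(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def amplification_efficiency(dilution_factors, ct_values):
    """Return (slope, efficiency) for a cDNA dilution series.

    The relative template amount of each point is taken as 1 / dilution factor.
    Efficiency is expressed as a fraction (1.0 corresponds to 100%).
    """
    log_amounts = [math.log10(1.0 / d) for d in dilution_factors]
    slope, _ = linear_fit(log_amounts, ct_values)
    efficiency = 10 ** (-1.0 / slope) - 1.0
    return slope, efficiency

# Dilution factors as stated in the text (5x, 25x, 125x, 625x).
dilutions = [5, 25, 125, 625]
# Hypothetical Ct values, invented for illustration only.
hypothetical_ct = [20.0, 22.3, 24.6, 26.9]

slope, eff = amplification_efficiency(dilutions, hypothetical_ct)
print(f"slope = {slope:.3f}, efficiency = {eff:.1%}")
```

A slope near −3.32 corresponds to a doubling of product per cycle; primer pairs are usually compared on this basis before relative quantification.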

The q-gene method was used to obtain the relative expression of the qPCR values, and they were analyzed with the Q-Gene 96 program (Muller et al., 2002). The results represent the mean of three biological and technical replicates on each transgenic line and the non-transgenic control. The amplified products were sequenced to verify their identity.
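As an illustration of efficiency-corrected normalization to a reference gene, the sketch below computes a mean normalized expression value as E_ref^Ct_ref / E_target^Ct_target, using the mean Ct of the replicates. This form is an assumption about the calculation; whether the Q-Gene 96 program applies exactly this variant to these data is not stated in the text. The Ct values and the amplification factors of 2.0 are hypothetical.

```python
from statistics import mean

def mean_normalized_expression(ct_target, ct_ref, e_target=2.0, e_ref=2.0):
    """Expression of a target gene relative to a reference gene.

    ct_target, ct_ref: replicate Ct values for the target and reference genes.
    e_target, e_ref: amplification factors per cycle (2.0 means perfect doubling).
    """
    return (e_ref ** mean(ct_ref)) / (e_target ** mean(ct_target))

# Hypothetical Ct values for one transgenic line (three replicates each).
ct_nmdef02 = [24.1, 24.0, 23.9]   # target gene (NmDef02)
ct_actin = [18.0, 18.1, 17.9]     # reference gene (β-actin)

mne = mean_normalized_expression(ct_nmdef02, ct_actin)
print(f"mean normalized expression = {mne:.6f}")
```

With equal amplification factors, the result reduces to 2 raised to the difference between the reference and target mean Ct values, so a lower target Ct gives a higher relative expression.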

### Southern Blot Analysis

Southern blot and hybridization were performed by following the protocol described by Sambrook et al. (1989). Genomic DNA (15 µg) from soybean plants (T4 generation) selected with glyphosate and evaluated in the field was digested with EcoRV. The digested DNA was electrophoresed on a 0.8% agarose gel and blotted onto a nylon membrane (Hybond N, Amersham Biosciences). Hybridization was carried out with α-[32P]-dATP-labeled cp4epsps gene as the probe, using the DNA random primer labeling kit (Promega, United States). The probe was obtained by PCR with cp4epsps-gene-specific primers to generate the 887 bp fragment. It was isolated from a 1% agarose gel and purified using the SV Gel Wizard Clean-Up System (Promega, United States).

### Symbiosis of Bradyrhizobium japonicum With Transgenic Soybean Plants Expressing the NmDef02 Defensin

Because of the essential role of B. japonicum in atmospheric nitrogen fixation in soybean, we determined the efficiency of its symbiosis with transgenic soybean plants carrying defensin. The test was performed with 30 transgenic plant seeds and 20 non-transgenic seeds, which were used as the control. The Semia 5080 strain of B. japonicum was used for inoculations. Seeds were planted in pots with zeolite and placed in plastic trays with water to maintain humidity within a greenhouse. A week after seed germination, seedlings were inoculated with 1 mL of the diluted bacterial culture at 2 × 10<sup>6</sup> viable cells/pot. Uninoculated transformed and non-transformed seedlings were used as assay controls. The plants were collected during flowering and the following symbiotic efficiency indicators were quantified: number of nodules per plant, fresh weight of nodules per plant (g), fresh leaf weight per plant (g), and dry leaf weight per plant (g). There were three replicates of the experiment.

### Phakopsora pachyrhizi Field Trials

Thirteen transgenic lines obtained by self-pollination (T3) and non-transgenic plants (variety DT-84) were grown on an experimental area in Havana during the winter (November–March). The field experiment was authorized by the National Center for Biological Safety of Cuba under license LH47-L (95) 13. Seeds were inoculated with B. japonicum and planted in a field near soybean plants affected by P. pachyrhizi. A randomized block design was used, with three blocks/line and 450 seeds/line. Plants were not treated with fungicidal products. The experiment was assessed daily. After the outbreak of rust symptoms, affected leaves were analyzed using an Envirologix QuickStix kit (Envirologix, United States) to confirm the presence of P. pachyrhizi.
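A randomized block layout of this kind can be sketched as below: every entry appears once in each block, and the order within each block is shuffled independently. The entry labels are placeholders rather than the actual line names, the seed value is arbitrary, and the even split of 450 seeds across three blocks is an assumption, since the text does not state how seeds were divided.

```python
import random

def randomized_block_layout(entries, n_blocks, seed=None):
    """Return one independently shuffled ordering of all entries per block."""
    rng = random.Random(seed)
    layout = []
    for _ in range(n_blocks):
        order = list(entries)
        rng.shuffle(order)  # random plot order within this block
        layout.append(order)
    return layout

# Placeholder labels: 13 transgenic lines plus the DT-84 control.
entries = [f"line_{i:02d}" for i in range(1, 14)] + ["DT-84"]
layout = randomized_block_layout(entries, n_blocks=3, seed=42)

# Assumption: the 450 seeds per line are divided evenly among the blocks.
seeds_per_plot = 450 // 3

for block_number, order in enumerate(layout, start=1):
    print(f"Block {block_number}: {', '.join(order)}")
print(f"Seeds per plot (assumed even split): {seeds_per_plot}")
```

Blocking in this way keeps every line exposed to the same range of field positions, so that gradients in inoculum pressure or soil are not confounded with line effects.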

The incidence of P. pachyrhizi was calculated by dividing the number of plants showing symptoms by the total number of plants in the experiment and multiplying the resulting value by 100. When the first symptoms of Asian soybean rust appeared, the severity of the disease (% of the area of the leaf affected by rust) was calculated according to the protocol proposed by Ploper et al. (2006), in which the central leaflets of the lower, middle, and upper parts of the plants were sampled. The following scale was used to calculate severity: Grade 1 (0%); Grade 1.5 (0.6–1%); Grade 2 (1–5%); Grade 3 (6–25%); Grade 4 (26–50%); Grade 5 (>50%). The second evaluation took place 10 days after the outbreak, and 20–30 plants were analyzed. The percentage of defoliated plants was calculated at 36 and 60 days after the start of symptoms. Plants were harvested and the following morpho-agronomic parameters of 30 plants for each line were evaluated: height of the plant (cm), height of the 1st pod (cm), number of branches, number of pods, number of seeds, and weight of seeds/plant (g).
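The incidence formula and the severity scale above can be written out as a short sketch. The function names are hypothetical. The published grade ranges leave small gaps (for example between 0 and 0.6%, or between 5 and 6%); treating each upper bound as an inclusive threshold is an assumption made here so that every percentage maps to a grade.

```python
def incidence_percent(n_symptomatic, n_total):
    """Incidence (%) = plants with symptoms / total plants * 100."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return 100.0 * n_symptomatic / n_total

def severity_grade(percent_area):
    """Map the percentage of affected leaf area to the grade scale in the text.

    Assumption: upper bounds are inclusive thresholds, so values falling in the
    gaps of the published ranges are assigned to the next grade up.
    """
    if not 0 <= percent_area <= 100:
        raise ValueError("percent_area must be between 0 and 100")
    if percent_area == 0:
        return 1.0
    for upper_bound, grade in ((1, 1.5), (5, 2.0), (25, 3.0), (50, 4.0)):
        if percent_area <= upper_bound:
            return grade
    return 5.0

# Hypothetical counts and leaf readings, for illustration only.
print(incidence_percent(45, 450))
print([severity_grade(p) for p in (0, 0.8, 3, 20, 40, 75)])
```

Incidence and severity answer different questions: a line can have many plants with a few pustules each (high incidence, low grade), which is why both are reported.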

### Phakopsora pachyrhizi Biomass

Quantitative PCR was used to measure the fungal biomass of P. pachyrhizi in leaves as described by Lamour et al. (2006). Two leaflets (from the upper and lower parts of the plants) were collected from each transgenic and non-transgenic plant 10 days after Asian soybean rust symptoms were observed. Thirty plants from each transgenic line and the non-transgenic control were used in this analysis. The collected plant material was frozen at −80 °C. The leaves from each line and from the control were pooled separately, macerated in liquid nitrogen, and homogenized to use 1 g of tissue.

Genomic DNA was extracted from frozen leaf tissues of transgenic and non-transgenic plants using a modified CTAB protocol (Doyle and Doyle, 1987). The integrity and yield of DNA were evaluated using agarose gel electrophoresis and a NanoDrop Spectrophotometer (Thermo Scientific), respectively. The qPCR reactions were carried out in a final volume of 20 µl containing 200 ng DNA, 0.8 mM of each primer, and 10 µl SYBR (QuantiTect SYBR Green PCR kit; Qiagen, Germany) by using a Rotor-Gene 3000 PCR machine (Corbett, Sydney, NSW, Australia). The specific primers to amplify an ITS sequence of P. pachyrhizi were Ppm1 5′-GCAGAATTCAGTGAATCATCAAG-3′ (forward) and Ppa4 5′-TCAAAATCCAACAATTTCCC-3′ (reverse) (Frederick, 2006). The amplification conditions included an initial denaturation step at 95 °C for 15 min, followed by 40 cycles of denaturation for 15 s at 95 °C, annealing for 30 s at 50 °C, and extension for 30 s at 72 °C.

For the quantification of biomass, a standard curve (1/10, 1/100, 1/1000) was made with DNA isolated from pustules of the fungus of highly infested plants. Data were analyzed in Rotor-Gene 3000 software (Corbett). The amplified products were sequenced to verify their identity.
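Quantification against a standard curve can be sketched as follows: Ct is regressed on the log10 of the relative amount of fungal DNA in the standards, and the fitted line is inverted to estimate the relative amount in a sample. The three dilutions come from the text; the Ct values and function names are hypothetical, and the Rotor-Gene software may implement the fit differently.

```python
import math

def fit_standard_curve(relative_amounts, ct_values):
    """Least-squares fit of Ct = slope * log10(amount) + intercept."""
    xs = [math.log10(a) for a in relative_amounts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def relative_amount_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate the relative fungal DNA amount."""
    return 10 ** ((ct - intercept) / slope)

# Standard dilutions as stated in the text.
standards = [1 / 10, 1 / 100, 1 / 1000]
# Hypothetical Ct values for the standards, invented for illustration only.
standard_ct = [18.0, 21.4, 24.8]

slope, intercept = fit_standard_curve(standards, standard_ct)
sample_amount = relative_amount_from_ct(21.4, slope, intercept)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
print(f"relative amount at Ct 21.4 = {sample_amount:.4f}")
```

Sample Ct values that fall outside the range spanned by the standards are extrapolations and should be interpreted with caution.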

### Colletotrichum truncatum Field Trials

Three transgenic lines obtained by self-pollination (T4) that were selected for resistance to Asian soybean rust and non-transgenic plants (susceptible variety DT-84) were grown on an experimental area of the National Institute of Agricultural Sciences (INCA), Mayabeque province, during the winter (November–March). Seeds were inoculated with B. japonicum and planted in soil with a history of a high incidence of anthracnose caused by C. truncatum. A total of 360 seeds from each line and 200 seeds from the non-transgenic plants were used in this study. A randomized block design was used, with three blocks per line. Plants were not treated with fungicidal products, and they were evaluated weekly. After the outbreak of symptoms, infected pod samples were collected and the fungus was isolated and identified (Chen et al., 2006). The incidence of C. truncatum in the experiment was calculated by dividing the number of plants with symptoms by the total number of plants in the experiment and multiplying the result by 100. Plants were harvested and the morpho-agronomic parameters of 30 plants of each line were evaluated, specifically the height of the plant (cm), height of the 1st pod (cm), number of branches, number of pods, number of seeds, and weight of seeds/plant (g). Soybean plants transformed with the pCP4EPSPS-DEF (**Figure 1A**) and pCP4EPSPS (Soto et al., 2017) plasmids were also grown in disease-free soil, using non-transgenic plants (DT-84) as the control. Plants were harvested and the morpho-agronomic parameters of 30 plants were evaluated.

### Statistical Analysis

Data were statistically analyzed by IBM SPSS Statistics 25 using ANOVA at the P ≤ 0.05 level. The means of the experimental replicates were plotted, and the standard deviations are shown as error bars.
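For readers without access to SPSS, the one-way ANOVA F statistic underlying this analysis can be sketched with the standard library alone. The group values below are invented and the function name is hypothetical; the sketch stops at the F statistic and its degrees of freedom, since the P value requires the F distribution (available, for example, through `scipy.stats.f_oneway`).

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    groups: list of lists, one list of observations per treatment.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Invented observations for three hypothetical lines (e.g., a yield parameter).
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_b, df_w = one_way_anova_f(groups)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A significant F only indicates that at least one group mean differs; a post hoc procedure such as Tukey's test, as used in the tables and figures here, identifies which pairs differ.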

# RESULTS

### Transformation and Plant Regeneration

Particle acceleration-mediated transformation was carried out using a pCP4EPSPS-DEF vector carrying the glyphosate resistance gene and the NmDef02 defensin gene under the control of the cauliflower mosaic virus 35S promoter (**Figure 1A**). The first herbicide-resistant shoots from explants via direct organogenesis were observed after 15 days in the selection medium. Data obtained in the transformation experiment showed that 19 out of 150 bombarded explants developed shoots in the selection medium with glyphosate. All transgenic lines showed similar growth to the non-transformed control and were transferred to greenhouse conditions until the T2 seeds were harvested. After having developed their second trifoliate leaf, the plants were sprayed with glyphosate at a concentration of 360 g/L to select resistant plants. In addition, the expression of the CP4 EPSPS protein was demonstrated in 22 rooted lines (T0) using the Roundup Ready immunodetection kit.

### Integration of Transgene in Soybean Plants

To analyze the stability of transgene integration in the T1 generation, DNA of glyphosate-resistant lines underwent PCR analysis. This analysis detected the presence of the expected 140 bp fragment (**Figure 1B**), indicating the presence of the NmDef02 gene in the transgenic soybean plants, while it was not detected in non-transformed plants.

Six transgenic lines selected in glyphosate and showing resistance in the field were screened by Southern blot analysis in generation T3. Signals corresponding to the region of the plasmid between the two sites that were recognized by the EcoRV enzyme were detected (**Figure 1A**). The signals showed a stable integration of the segment of the plasmid containing the cp4epsps and NmDef02 genes in the genome of the transformed plants of a size of 5.3 kb (**Figure 1C**). DNA isolated from non-transformed plants did not show any hybridization signal (**Figure 1C**).

### Relative Expression of the NmDef02 Defensin Gene in Transgenic Plants

The relative expression of the NmDef02 gene in six transgenic soybean lines was evaluated by quantitative RT-PCR (**Figure 2**). The transgenic lines differed in defensin expression level. Lines Def1, 17, and 18 differed significantly (p ≤ 0.001) from the non-transgenic control, and of these, Def1 showed the highest accumulation of defensin NmDef02.

### Efficient Nodulation by Bradyrhizobium japonicum in Soybean Plants That Express the NmDef02 Defensin

This test was carried out to evaluate the efficiency of the symbiosis of this bacterium with transgenic soybean plants carrying defensin. Taking into account that the Def1 transgenic line showed fungal resistance in the experiments performed under natural infection conditions, nodulation in these plants when inoculated with B. japonicum was evaluated. In this experiment, the transgenic plants showed a similar phenotypic development to the non-transformed plants used as a control, which was observed in the growth of the stem and color of the leaves, as well as in flowering under greenhouse conditions (**Figure 3A**). Inoculation produced nodulation in all transgenic plants and the control (**Figure 3B**). Nodules of different sizes were obtained in all of the inoculated plants (**Figure 3C**), and the highest number of nodules was observed in the plants harboring defensin, but this did not lead to marked differences in the average fresh weight of the nodules (**Table 1**). In all cases, the nodules were widely distributed in the root neck region (**Figure 3B**) and showed an internal red coloring (**Figure 3D**) due to the presence of the leghemoglobin protein. The number of functional nodules with a red coloring in the plants (**Table 1**) is an indirect indicator of atmospheric nitrogen fixation by the bacteroids. The inoculated plants maintained an intense green color during the experiment and showed higher values of fresh weight and dry weight of the aerial parts than the non-inoculated plants. It is thus shown that the NmDef02 defensin does not affect B. japonicum nodulation.

FIGURE 2 | … levels of the NmDef02 gene in transgenic plants, compared to the constitutive expression of the endogenous β-actin of soybean. 1–18: Transgenic lines (Def1, 12, 15, 16, 17, and 18). NT: non-transgenic control. Bars represent mean (n > 9) and standard error of the results obtained. Asterisks indicate significant differences using ANOVA by Tukey's multiple range test with respect to the control (\*p ≤ 0.05, \*\*\*p ≤ 0.001).

FIGURE 3 | Symbiosis of Bradyrhizobium japonicum with soybean plants carrying the NmDef02 defensin gene. (A) Transgenic plants in pots with zeolite and inoculated with B. japonicum. (B) Transgenic plant with nodules at the base of the stem, induced by inoculation (left) and plants without inoculation (right). (C) Nodules produced in inoculated transgenic plants. (D) Nodular mass with an internal red coloration indicating the production of the leghemoglobin required for the fixation of atmospheric nitrogen.

### Field Resistance to Phakopsora pachyrhizi in Transgenic Lines

After the onset of Asian soybean rust symptoms in the pod and grain formation stages, the presence of P. pachyrhizi was confirmed by the Envirologix QuickStix kit (**Figure 4A**), and reproductive structures were observed with a stereoscope (**Figures 4B,C**). Non-transgenic plants (DT-84) used as the control had a 100% incidence of rust (percentage of plants with pustules), demonstrating the high susceptibility of this cultivar to P. pachyrhizi (**Figure 4D**). All transgenic lines also showed signs of rust; in this case, the pustules were present in the leaves that were closest to the soil, although the percentage of affected plants in transgenic lines was lower than in non-transformed plants (**Figure 4D**). Some transgenic lines combined a high incidence (**Figure 4D**) with a low severity (**Figure 4E**) of Asian soybean rust, as was evident in line 18. This is because, in this study, the incidence only showed the dispersion of the fungus in the experimental area. All plants with any symptoms at all were counted, even those with few pustules on the lowest part of the plant, as occurred in lines Def1, 12, 17, and 18.

The severity of soybean leaf rust was estimated through the visual observation of plants using the scale proposed by Ploper et al. (2006). The results are shown in **Figure 4E**. Severity was higher in the lower parts of the plants, mainly in older trifoliate leaves. All transgenic lines had a significantly lower severity than the control DT-84 (p < 0.05), where more than 40% of the leaves involved were from the lower parts of the plant (**Figure 4E**). Transgenic lines Def1, 12, 16, 17, and 18 showed less severe damage by rust; lines Def1 and Def12 had less than 8% of the leaves affected in the lower plant parts, with less than 5% in the middle and upper parts (**Figure 4E**). Resistance to P. pachyrhizi was determined by visual description of signs and symptoms observed on the soybean leaves in response to infection by Asian soybean rust. The presence of reddish-brown (RB) pustules with different levels of sporulation and without sporulation was observed in the leaves of these transgenic lines. In contrast, abundant sporulation in uredinia was observed in non-transgenic plants, which shows the high susceptibility of the DT-84 cultivar to P. pachyrhizi. The presence of dark brown pustules with limited sporulation was also evident in transgenic and non-transgenic plants. In our study, transgenic lines expressing the NmDef02 gene showed different levels of resistance to P. pachyrhizi, displaying complete and incomplete resistance in lines Def1, 12, 17, and 18. Line Def12 also showed light brown lesions with reduced sporulation in some plants, suggesting a partial resistance. Some plants of the Def3, 4, 6, 10, and 14 lines also showed light brown lesions with reduced sporulation.

TABLE 1 | Nodulation in transgenic and non-transgenic soybean plants (cultivar DT-84) inoculated with Bradyrhizobium japonicum.

Def1 (transgenic line) n = 30; NT (non-transgenic control DT-84) n = 20. The values represent the means ± standard deviation corresponding to three replicates. Different letters show significant differences of p < 0.05 according to Tukey's multiple range test.

Quantitative PCR analysis confirmed the presence of P. pachyrhizi. The quantification of fungal biomass showed that all transgenic lines had a significantly lower amount of fungal biomass than the non-transgenic plants (p < 0.05), as shown in **Figure 5**. Transgenic lines Def1, 17, and 18 had the lowest amount of fungal biomass compared to the other transgenic lines and the non-transgenic control (**Figure 5**). The quantification of fungal biomass revealed differences in fungal colonization between the transgenic soybean lines even when visual differences in signs and symptoms were not evident, as in the cases of lines Def3, 4, 6, and 14.

Leaves of transgenic plants remained green even when they were affected by P. pachyrhizi (**Figure 6A**), thus contrasting with non-transgenic plants, which showed intense chlorosis in their leaves (**Figure 6A**) followed by premature defoliation (**Figure 6B**). The highest percentage of defoliation was observed in non-transgenic plants at 60 days after rust infection. At that time, transgenic lines Def1, 5, 12, 17, and 18 showed defoliation of less than 10% (**Figure 6C**).

The incidence of rust at this stage caused early defoliation, which had a negative impact on all parameters. The disease affected the number of branches, pods and seeds, as well as the weight of seeds per plant in the non-transgenic control. In contrast, most of the transgenic lines were significantly superior to the control in all parameters related to yield. The results are summarized in **Table 2**.

### Resistance to Colletotrichum truncatum in Transgenic Lines

A total of 360 transgenic plants carrying the NmDef02 defensin gene and 200 non-transgenic plants (DT-84) representing the controls were used in the experiment under conditions favoring the incidence of anthracnose. Symptoms of irregular brown spots on pods, petioles, and stems, similar to those described for anthracnose of soybeans, were observed in non-transgenic plants in the grain formation phase (**Figure 7A**). Tissues from the affected pods and stems were observed under an optical microscope to examine the structure of the fungus. The presence of C. truncatum was confirmed by PCR (data not shown).

In this study, line Def1 showed a high resistance to this pathogen (**Figure 7B**): only 10% of its plants (**Figure 7C**) presented pods with spots at the basal zone of the plant at the end of the cycle. The Def12 and Def16 transgenic lines also showed symptoms of anthracnose, but their incidence was lower than in the non-transgenic control (**Figure 7C**). These transgenic lines showed irregular spots on the pods of some plants, but their leaves remained green until the final stage of plant development, as occurred in line Def1 (**Figure 7B**). In contrast, 100% of DT-84 plants were affected (**Figure 7C**) and showed irregular brown spots on pods, as well as leaf chlorosis and high premature defoliation (**Figure 7D**). The Def12 and Def16 transgenic lines showed a certain reduction in seed quality in some plants, whereas the seeds of line Def1 remained healthy (**Figure 7E**). Seeds and pods of the non-transgenic control were highly affected by the fungus (**Figure 7E**), showing wrinkling, mold and, in some cases, turning dark brown, which is similar to the symptoms reported for anthracnose associated with Colletotrichum in soybean. After plants were harvested, the results showed a statistically better performance for line Def1 compared to lines Def12 and Def16 and the non-transgenic control in all parameters evaluated (**Table 3**).

FIGURE 4 | … transgenic lines (1–18) and in the non-transgenic control (NT) are shown. Bars represent the deviation of the means (n = 3). (E) Evaluation of soybean rust severity in transgenic soybean plants. The data show the average of two experiments. Severity was determined in different parts of the plant (upper, middle, and lower parts) (n = 20). Asterisks indicate significant differences using ANOVA by Tukey's multiple range test compared to the control (\*p ≤ 0.05, \*\*p ≤ 0.01, \*\*\*p ≤ 0.001).

Transgenic soybean lines did not present any detrimental agronomic features compared to the non-transgenic control DT-84 when plants were grown in disease-free soil (**Table 4**). All soybean plants showed a similar vegetative development. The parameters evaluated show differential behavior, both in transgenic plants carrying the NmDef02 gene and the cp4epsps gene and in non-transgenic plants, as shown in **Table 4**. These results demonstrate that the overexpression of defensin did not have a negative impact on parameters related to yield in soybean plants.

# DISCUSSION

The production of transgenic plants expressing antimicrobial genes can provide broad resistance against different pathogens while reducing the use of chemical pesticides. In the current study, we obtained the first evidence of resistance to the hemibiotrophic fungus C. truncatum and the biotrophic fungus P. pachyrhizi in soybean plants transformed with the NmDef02 defensin gene under the 35S constitutive promoter.

Defensin has antifungal activity and produces membrane disruption by pore formation in the cell membrane (Thevissen et al., 2007). A high concentration of defensins produces severe membrane permeabilization, which leads to fungal death (Thevissen et al., 2003; Seo et al., 2014). Previous studies have shown that plant defensins are accumulated in the peripheral cell layers of cotyledons, hypocotyls, endosperm, tubers, fruits, and floral organs, including style, ovary, filaments of stamen, and anthers (Thevissen et al., 2003). These defensin locations are consistent with their role in the first line of defense against potential pathogens (Thevissen et al., 2003; Lay and Anderson, 2005; De Coninck et al., 2013). Plant defensins can also be found in stomatal cells and cell walls of the sub-stomatic cavity; these are involved in plant protection against pathogens that penetrate the stomata (Prema and Pruthvi, 2012).

In this study, we have observed high protection against the pathogenic fungi P. pachyrhizi and C. truncatum in soybean leaves and pods, which may be favored by the constitutive overexpression of NmDef02 defensin in the membranes. Previously, Portieles et al. (2010) showed that the constitutive expression of the NmDef02 defensin gene provided strong resistance to P. infestans in transgenic potato plants under greenhouse and field conditions.

The inhibitory activity of defensins on the growth of a wide range of hemibiotrophic and necrotrophic fungi has been observed through in vitro studies at micromolar concentrations (Portieles et al., 2010; Lacerda et al., 2016). Nevertheless, determining the antifungal activity of plant defensins against biotrophic fungi is much harder, since these fungi are difficult to cultivate in vitro (Kaur et al., 2011). They establish a long-term feeding relationship with the living host cells. Lacerda et al. (2016) reported that the Drr230a defensin expressed in yeast affected the in vitro germination of the spores of the P. pachyrhizi fungus.

They observed less severity of rust caused by P. pachyrhizi in leaves that were artificially inoculated with the fungus and the defensin. Likewise, the results obtained in our study showed that the transgenic lines expressing the NmDef02 defensin gene are able to inhibit the development and sporulation of the P. pachyrhizi fungus under natural infection conditions. Similar results were reported with the CcRpp1 gene isolated from Cajanus cajan and cloned in soybean, where it conferred specific resistance to P. pachyrhizi (Kawashima et al., 2016).

The complex interactions occurring between the pathogen, its host, and the environment are expressed as the incidence or severity of a disease. In this study, the average severity of rust on older leaves (in the lower third of the plants) was statistically higher than on younger leaves (in the upper part of the plants). Similar results were reported by Xavier et al. (2017), who showed that younger soybean plants are more susceptible to Asian soybean rust than older plants but that older trifoliate leaves had the highest disease severity. Although all transgenic lines expressed signs of Asian soybean rust, they were of low severity compared with the non-transgenic control. Studies by Vittal et al. (2014) showed that the presence of reddish-brown pustules (RB) without sporulation indicated complete resistance and that RB lesions with different levels of sporulation meant that there was an incomplete resistance. Both types of lesions were found in the transgenic plants that remained with green leaves, unlike the non-transgenic plants, which showed chlorosis accompanied by the appearance of pustules. On the other hand, the presence of dark brown pustules with limited sporulation in certain transgenic plants shows partial resistance, as described by Pham et al. (2009). Our data support the idea that the constitutive expression of the NmDef02 gene produced a decrease in the number of pustules in the transgenic lines, and this led to a lower fungal biomass. In contrast, the leaves of the control plants (DT-84) colonized by the pathogen showed many uredinia with abundant sporulation and a high amount of P. pachyrhizi biomass, demonstrating a compatibility response with the pathogen, as described by some authors (Yamanaka et al., 2010; Vittal et al., 2014). The reduced sporulation of the fungus observed in some plants from the transgenic lines is evidence of resistance to P. pachyrhizi in this experiment under natural infection conditions. This effect of the inhibition of fungal germination and growth of spores by plant defensins was also reported in transgenic bean plants that carry the pdf1.2 defensin gene against Colletotrichum lindemuthianum (Espinosa-Huerta et al., 2013).

TABLE 2 | Agronomic field test with transgenic soybean lines affected by Asian soybean rust.

A simple classification ANOVA was used. Averages with different letters indicate significant differences of p < 0.05 according to Tukey's multiple range test. NT, non-transgenic control; SD, standard deviation; SE, standard error of the means. These data correspond to the average of 30 plants for each line and the non-transgenic control.

FIGURE 7 | Evaluation of transgenic soybean lines affected by anthracnose (Colletotrichum truncatum) in the field experiment. (A) Anthracnose symptoms observed in non-transgenic plants cv. DT-84. (B) Healthy plants of the Def1 transgenic line. (C) Evaluation of the number of plants affected by anthracnose in the field experiment. Transgenic lines (1, 12, and 16). Non-transgenic control (NT). Bars represent the deviation of the means (n = 3). (D) Early maturity and defoliation in soybean plants. Left: Def1 line, right: non-transgenic control. (E) Anthracnose symptoms observed in pods and seeds of non-transgenic plants cv. DT-84. Top: Def1 transgenic line, bottom: non-transgenic control.

Averages with different letters in the same column indicate significant statistical differences of p < 0.05 according to Tukey's multiple range test. NT, non-transgenic control; SD, standard deviation; SE, standard error. These data correspond to the average of 30 plants for each line.

TABLE 4 | Agronomic field trial with transgenic soybean lines under field conditions without fungal diseases.

A simple classification ANOVA was performed. Averages with different letters indicate significant differences of p < 0.05 according to Tukey's multiple range test. These data correspond to the average of 30 plants for each transgenic line carrying the NmDef02 gene (1, 17, 18), the cp4epsps gene (4, 9, 12), and the non-transgenic control (NT).

Plant-pathogen interaction studies have shown that RB lesions can vary in color from light to dark red (Rosa et al., 2015). Because of this, those authors consider that lesion color is not a reliable indicator of resistance or susceptibility to P. pachyrhizi; it is also influenced by the environment (Yamanaka et al., 2010). Studies conducted by Miles et al. (2011) also showed that in some cases the severity of the rust is not related to lesion type. However, they found that the number of uredia per leaf area is inversely related to yield (Miles et al., 2011). Similar results were obtained in this study, with different types of lesions on the leaves of transgenic and non-transgenic plants. In addition, the high fungal biomass detected in the plants was related to the large number of uredia present on the leaves, regardless of the type of lesion. This was also consistent with the inverse relationship between severity and yield parameters: the non-transgenic control plants affected by the fungus had a high percentage of uredia on the leaves and a small number of pods and seeds, as shown in **Table 2**.

Rust infestation was more severe in certain lines, which also had more fungal biomass on leaves. They were also affected by early defoliation, suggesting that although they were less affected than the DT-84 control, defensin expression was not enough to avoid fungal damage. Studies conducted by Ntui et al. (2010) with the wasabi defensin gene showed that fungal resistance is associated with the level of expression of the protein. In our study, we found a high correlation between the relative expression of NmDef02, as determined by qRT-PCR, and the high resistance against P. pachyrhizi in transgenic lines Def1, 17, and 18. Interestingly, lines Def12 and 16, which had relatively low expression of defensin, also showed resistance against P. pachyrhizi but developed symptoms due to C. truncatum infection.

The constitutive expression of NmDef02 also influenced the proliferation of C. truncatum, because there was a decrease in the number of lesions on the transgenic plants, in which line Def1 showed increased resistance to this pathogen. In contrast, DT-84 plants used as controls showed a high susceptibility to C. truncatum, with abundant lesions in stems and pods. Similar results were observed in common bean (Phaseolus vulgaris L.) carrying the pdf1.2 defensin gene, where the authors achieved a significant reduction in the formation of lesions in transgenic lines infected with Colletotrichum sp. (Espinosa-Huerta et al., 2013).

Some authors state that plants with a short life cycle are able to avoid yield reduction due to Asian soybean rust. This type of mechanism could be a form of horizontal resistance based on an escape mechanism or an unfavorable environment for disease development (Santos et al., 2018). This, however, depends on the susceptibility of the variety and the development stage of the plants when the pathogen appears. In the present study, the short-cycle cultivar DT-84 used as a non-transgenic control was highly susceptible to P. pachyrhizi and C. truncatum under the conditions of natural infection, and the plants underwent complete defoliation before concluding their maturation cycle. The early defoliation of these soybean plants reduced productivity by interfering with their physiological processes, thus resulting in fewer normal pods, fewer seeds per pod, and lower grain weight. Disease progression during the pod formation and pod-filling periods is most detrimental to yield (Kawuki et al., 2004). The negative effect of defoliation on crop yields was also observed in Brazilian soybean cultivars affected by Asian soybean rust (da Silva et al., 2015; Childs et al., 2018). This defoliation can affect the natural mechanisms of resistance, making them less active and increasing the susceptibility of soybean to end-of-cycle diseases.

To conclude, in experiments where no chemical fungicides were applied, transgenic plants showed increased resistance to P. pachyrhizi and C. truncatum until the end-of-cycle stage, when other pathogens normally appear. Molecular analyses showed the presence of the transgene in the progeny of these lines, and transgenic line Def1 accumulated the highest transcript levels and displayed the highest degree of resistance to both diseases. This could explain why the severity and incidence of Asian soybean rust, and the defoliation it caused, were lower in plants of this transgenic line. Similarly, the reduced fungal biomass present in the transgenic plants coincides with a reduced sporulation of the pathogen, which demonstrates the antifungal effect exerted by defensin NmDef02 on P. pachyrhizi.

The antifungal effect of the NmDef02 defensin had been previously demonstrated by Portieles et al. (2010). However, resistance to a biotrophic fungus such as P. pachyrhizi, which is difficult to control, or to C. truncatum cannot be taken for granted, because resistance to fungal pathogens is not obtained simply by introducing this defensin into a crop. An example of this is the susceptibility to these fungi observed in some transgenic lines evaluated in the field.

Transgenic soybean plants had a similar development to nontransgenic plants when inoculated with B. japonicum. The use of rhizospheric microorganisms in the preparation of inoculants for soybeans was very important in maintaining high productivity with a lower environmental impact, as demonstrated by some authors (Menéndez et al., 2014; Nápoles García et al., 2014). This study also showed that the expression of the NmDef02 defensin gene in soybean plants had no negative effect on the nodulation induced by B. japonicum, a bacterium that plays an essential role in the technology of this crop in Cuba. Kaur et al. (2017) studied the symbiosis of mycorrhizae with transgenic wheat plants carrying MtDef4.2 defensin. This study also showed that the expression of that defensin in apoplast can provide resistance to leaf rust, without having a negative effect on the symbiosis with that beneficial fungus.

To the best of our knowledge, this is the first report on the transformation of soybean with a defensin gene for resistance to fungal pathogens. We demonstrated that the overexpression of the NmDef02 gene resulted in a delay in progression of the fungi. However, to confirm resistance, more evaluations of these transgenic lines against these pathogens under different weather conditions are necessary, taking into account parameters that were not considered in this study. Complete resistance to P. pachyrhizi was not found in the transgenic lines, and the application of fungicide is still needed to completely control the pathogen. Evidently, the use of transgenic plants expressing this defensin would reduce the number of chemical fungicide applications in the field, as part of integrated pest management in soybean with minimal environmental impact. These results provide an environmentally sound approach to decrease yield losses and to lower the burden of chemicals, both goals targeted by Next Generation Agriculture.

## DATA AVAILABILITY STATEMENT

All datasets generated for this study are included in the article/supplementary material.

# AUTHOR CONTRIBUTIONS

NS and GE conceived and designed the research work. YR performed the construction of pCP4EPSPS-DEF plasmid. NS, YH, and CD performed the soybean transformation experiments. NS, YH, CD, LV, RO, and GE performed the field experiments. NS, YH, OB-H, and GE conducted molecular analysis and analyzed the data. NS wrote the manuscript. GE, MP, and OB-H reviewed the manuscript.

## FUNDING

This work was funded by the Cuban Government.

# ACKNOWLEDGMENTS

The authors would like to thank their colleagues and collaborators in the Plant Biotechnology Department at the Center for Genetic Engineering and Biotechnology who were involved in the field experiments. The authors are grateful to Dr. Miriam Ribas Hermelo for language editing of the manuscript and the reviewers for their critical comments, which helped to improve the quality of the manuscript.




**Conflict of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2020 Soto, Hernández, Delgado, Rosabal, Ortiz, Valencia, Borrás-Hidalgo, Pujol and Enríquez. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Molecular Mechanisms of the Floral Biology of Jatropha curcas: Opportunities and Challenges as an Energy Crop

### Manali Gangwar and Jata Shankar\* †

Genomic Laboratory, Department of Biotechnology and Bioinformatics, Jaypee University of Information Technology, Waknaghat, India

Fossil fuel sources are a limited resource and could eventually be depleted. Biofuels have emerged as a renewable alternative to fossil fuels. Jatropha has grown in significance as a potential bioenergy crop due to its high content of seed oil. However, Jatropha's lack of high-yielding seed genotypes limits its potential use for biofuel production. The main cause of lower seed yield is the low female to male flower ratio (1:25–10), which affects the total amount of seeds produced per plant. Here, we review the genetic factors responsible for floral transitions, floral organ development, and regulated gene products in Jatropha. We also summarize potential gene targets to increase seed production and discuss challenges ahead.

### Edited by:

Briardo Llorente, Macquarie University, Australia

### Reviewed by:

Toshiro Ito, Nara Institute of Science and Technology (NAIST), Japan Yaping Chen, Chinese Academy of Sciences, China

### \*Correspondence:

Jata Shankar manaligwr88@gmail.com; jata\_s@yahoo.com

†ORCID: Jata Shankar orcid.org/0000-0003-4993-9580

### Specialty section:

This article was submitted to Crop and Product Physiology, a section of the journal Frontiers in Plant Science

Received: 11 August 2019 Accepted: 21 April 2020 Published: 09 June 2020

### Citation:

Gangwar M and Shankar J (2020) Molecular Mechanisms of the Floral Biology of Jatropha curcas: Opportunities and Challenges as an Energy Crop. Front. Plant Sci. 11:609. doi: 10.3389/fpls.2020.00609

Keywords: Jatropha curcas, energy crop, transcriptome, biofuel, ABCDE model

# INTRODUCTION

About 11 billion tons of oil is consumed worldwide each year for fuel. At this rate of consumption, we may soon exhaust oil reserves (Shafiee and Topal, 2009). Climate change is also greatly influenced by fossil fuel combustion. Therefore, sustainable and environmentally friendly alternative energy sources are needed. Jatropha curcas L. (Euphorbiaceae) is a plant with potential for biodiesel production due to its high seed oil content (around 45–50%) (Achten et al., 2008). Compared with other oil plants, Jatropha has its own merits, including an outstanding adaptability to varied environments, easy propagation, and larger fruits and seeds. Furthermore, Jatropha grows well in the desert, adapts to drought conditions, has a short gestation period, and assists in soil conservation. Despite its advantageous properties for biodiesel production, Jatropha has some limitations that restrict its commercialization as an energy crop, such as low seed yield, inconsistent flowering and fruiting, and relatively expensive plantation maintenance. The significant factors influencing its potential as biofuel feedstock are the oil content in seeds, the number of seeds per tree, the number of fruits on each branch of the plant, and the number of branches per plant. Seed yield at each inflorescence is largely dependent on the number of female flowers. Jatropha's female to male flower ratio is quite low (1:25 to 1:30), which means that each inflorescence contains only about 10 to 12 female flowers (out of 300) that yield just 8 to 10 fruits. Therefore, a relatively small number of fruits are produced as compared to the total number of flowers (Kumar and Sharma, 2008). One way to increase the total seed yield in Jatropha would be to increase the number of female flowers per plant. In this context, we discuss the genetic factors involved in the floral transition, determination of sex, and floral growth of Jatropha curcas.

1 https://www.ecotricity.co.uk/our-green-energy/energy-independence/the-end-of-fossil-fuels
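The flower-count estimate in the Introduction can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name `expected_female_flowers` is hypothetical, and it assumes that a female to male ratio of 1:x means one female flower in every x + 1 flowers of an inflorescence.

```python
from fractions import Fraction


def expected_female_flowers(total_flowers: int, males_per_female: int) -> Fraction:
    """Expected female flowers when the female:male ratio is 1:males_per_female.

    Assumption: a 1:x ratio means 1 female in every (x + 1) flowers.
    """
    return Fraction(total_flowers, 1 + males_per_female)


# Ratios and inflorescence size quoted in the text: 1:25 to 1:30, 300 flowers.
for males in (25, 30):
    n = expected_female_flowers(300, males)
    print(f"1:{males} -> about {float(n):.1f} female flowers out of 300")
```

Under that assumption the two ratios give roughly 11.5 and 9.7 female flowers per 300-flower inflorescence, which is in line with the "about 10 to 12" quoted above.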

### BIOLOGY OF SEX DETERMINATION IN JATROPHA CURCAS

Sex determination processes allow floral organ development in plants. The two processes for forming a unisexual flower are (i) emergence of only one type of sex organ (unisexual tissues) and (ii) initiation of stamen and pistil followed by an arrest or abortion of one sex organ, which results in the functional immaturity of either stamens or carpels. The developmental arrest step occurs at an immature stage well before sexual maturity is reached (Ainsworth, 1999, 2000; Kater et al., 2001). There are two modes of sex determination and development in Jatropha. One mode is the development of male flowers with early adolescence without any female primordia. The other mode is by aborting male tissues, which results in female flowers developing (Li and Li, 2009; Wu et al., 2011). The male flower is unisexual right from the start, whereas the female flower is bisexual until its sixth developmental phase. Because of this, an inflorescence has three types of flowering sites: (i) female flowering site, (ii) male flowering site, and (iii) middle flowering site where both males and females may develop. Though male tissue abortion occurs in female flowers during sexual differentiation, traces of male tissue may be found in mature females. However, when abortion of male tissues fails in a female flower, it develops as a male at the female flowering site. Such an inflorescence is known as a middle-type inflorescence. The total number of female flowers varies with the number of female flowers formed in middle-type inflorescences. A statistical analysis of inflorescences found ∼75 percent middle-type inflorescences and 0.09 percent female flowers (Luo et al., 2007; Wu et al., 2011). For 18 female locations, Wu et al. (2011) found only seven female flowers. The female flowering sites and the sites occupied by middle-type inflorescence are important in increasing the number of female flowers.
The presence of hermaphroditic flowers has also been recorded in Jatropha, showing structural similarity with female flowers but diffused stamens (Abdelgadir et al., 2010; Wu et al., 2011; Adriano-Anaya et al., 2016). A recent population analysis on Jatropha's floral diversity and sex expression has grouped accessions into gynoecious (having only female flowers), androecious (having only male flowers), and andromonoecious (having both bisexual and male flowers) plants showing no correlation with their geographic location (Adriano-Anaya et al., 2016). Of the 103 accessions from 33 sites in southern Mexico, 93.2 percent were monoecious, while others were androgynomonoecious, androecious, or gynoecious (**Figures 1A,B**). It has been hypothesized that male development commences through suppression of females, which might be the result of male sterility mutation in gynomonoecious plants (Salvador-Figueroa et al., 2015; Adriano-Anaya et al., 2016). No gynomonoecious plants of Jatropha have been found. The possible explanation, according to Adriano-Anaya et al. (2016), is that gynoecious Jatropha plants derive from hermaphrodite ancestors through a one-step mutation.

# GENETIC FACTORS FOR THE VEGETATIVE TO REPRODUCTIVE PHASE TRANSITION

In floral initiation, the shoot apical meristem differentiates into an inflorescence. The induction of floral signaling is genetically controlled by floral integrator genes, such as FT (FLOWERING LOCUS T), FLC (FLOWERING LOCUS C) and SOC1 (SUPPRESSOR OF OVEREXPRESSION OF CONSTANS 1). Ye et al. (2014) reported that JcFT (Jatropha FLOWERING LOCUS T) overexpression caused early flowering by shortening the bolting time. Li et al. (2014) characterized FT in Jatropha, and its spatial expression data showed higher expression in reproductive phases. The LFY gene has recently been identified and overexpressed in both Arabidopsis thaliana and Jatropha (Tang et al., 2016). During the early stages of flowering, they observed a higher expression of JcLFY (Jatropha LEAFY). Transgenics with JcLFY overexpression showed early flowering and increased transcript levels of floral meristem identity genes, such as JcAP1, JcAP3, JcSEP1, JcSEP3, and JcAG. In addition, cosuppression of LFY in Jatropha resulted in delayed flowering, abnormal flowers in which floral organs were replaced by sepaloid organs, and an increased rate of floral abortion (Tang et al., 2016). Recently, the role of TFL1 homologs has been studied through the transgenic method, and their overexpression has resulted in delayed flowering due to reduced AP1 and FT gene expression (Karlgren et al., 2013; Li et al., 2017). In contrast, Li et al. (2014) reported higher expression of FT in Jatropha's reproductive phases and fruits. Circadian rhythms play an important role in the initiation of flowering. JcDof3, a plant-specific transcription factor with a conserved zinc finger (ZF) DNA-binding domain, is a circadian clock-regulated gene. The C-terminal conserved region of Dof3 interacts with the F-box protein, forming a Dof3-F-box complex that regulates the expression of CO, a circadian clock-regulated gene that controls flowering (Yang et al., 2011).
Foliar cytokinin (CTK) treatment upregulates the genes GI, SOC1, and LFY and inactivates the genes COP1 and TFL1b, which maintains a signal that promotes flowering (Chen et al., 2014; Pan et al., 2014). Thus, the interplay between the circadian rhythm and hormones controls flowering genes and the phase transition to inflorescence meristem in Jatropha.

### MOLECULAR BASIS OF SEX DETERMINATION

Jatropha is a monoecious plant in which female flowers are formed due to stamen abortion/suppression. Remains of female tissues are not observed in male flowers, though remains of aborted stamens (male tissues) are present at the base in female flowers. Gene expression analysis of Jatropha floral buds showed that the SUPERMAN gene suppresses male tissue and promotes the development of female tissue (Gangwar et al., 2018). A recent study suggested that, in Arabidopsis, the SUPERMAN gene not only bridges floral organogenesis and floral meristem but also regulates auxin biosynthesis (Xu et al., 2018). Transcriptome analysis of Jatropha's floral buds showed reduced expression of the stamen development gene TASSELSEED 2 (TS2) that facilitated the growth of carpels (Chen et al., 2014). Transcriptomic analysis of different stages of male and female flower buds of Jatropha showed upregulation of CRABS CLAW (CRC) during development stages of female flowers. CRC, a C2C2-YABBY zinc finger protein, is involved in the regulation of carpel fusion and growth, nectary formation, and floral meristem termination (Xu et al., 2016; Gross et al., 2018). Genes encoding inorganic phosphate transporter and ubiquitin carboxyl-terminal hydrolase were upregulated during female flower development and may contribute to embryo sac development (Xu et al., 2016). Further, upregulation of genes encoding chlorophyll A/B-binding protein during initiation of carpel primordia may facilitate carpel differentiation. Genes encoding Gibberellin-regulated protein 4-like protein, cytochrome c-oxidase subunit 1 (mitochondrial gene), and AMP-activated protein kinase, however, were upregulated during stamen development. Upregulation of genes encoding RING-H2 finger protein ATL3J (E3 ubiquitin ligases), CLAVATA1 (receptor-like kinase), auxin-induced protein 22D, transcription factor R2R3-myb (regulating cell cycle genes and cytokinin signaling), and AGAMOUS-LIKE-20 (MADS-box genes) has been identified during the late stage of female flower development, which may facilitate the maturation of the female flower (Alvarez and Smyth, 1999; Pelaz et al., 2000; Makkena et al., 2012; Xu et al., 2016). In both male and female flower buds, genes such as ARP1 (Auxin repressed protein), X10A (auxin-induced protein), and GID1 (gibberellin receptor protein) were upregulated (Xu et al., 2016). JcFT, a florigen and a key flowering pathway regulator in Jatropha, showed significantly high transcript levels in female flowers (Li et al., 2014). Another transcriptomic study identified MYC2, TS2, KNAT6, SVP, TFL1, and SRS5 as sex determination regulators in J. curcas (Chen et al., 2017). The suppression of nodulin MtN3 or LESS ADHERENT POLLEN (LAP3) resulted in small anthers, sterile pollen, and abortion of female flowers in Oryza sativa, Vitis vinifera, and Medicago truncatula (Chu et al., 2006; Ramos et al., 2014). In Pisum sativum L., carpel senescence has been induced as a result of increased lipoxygenase gene expression (Rodríguez-Concepción and Beltrán, 1995). A pentatricopeptide repeat-containing gene is expressed in the female embryo sac and restores the cytoplasmic male sterility in Jatropha (Bentolila et al., 2002; Xu et al., 2016). Key genes involved in the floral transition, sex determination and development of reproductive organs are shown in **Figure 1C** and **Table 1**. These studies shed light on how sex determination and differentiation occur in monoecious plants and how some of the genes expressed during floral differentiation suppress male flowering.

**Abbreviations:** BRs, brassinosteroids; CTK, cytokinin; GAs, gibberellic acids; JAs, jasmonic acids; Jatropha, Jatropha curcas L.

### ABCDE MODEL FOR SEX DIFFERENTIATION

The ABCDE model is a scientific model that specifies the role of homeotic genes in the development of floral organs. Genes of the A class specify sepal development. The development of petals occurs by the combined effect of genes from the A and B classes. Both the B- and C-class genes are important for stamen growth. The carpel development and activity of ovules are determined by C-class and D-class genes, respectively. Recently, E-class genes were discovered to play a role in the development of carpel and ovary (Pelaz et al., 2000; Honma and Goto, 2001). A-, B-, C-, D-, and E-class genes are transcription factors with conserved DNA binding domains known as the MADS-box family and are involved in floral organogenesis regulation (Pařenicová et al., 2003; Chen et al., 2019). PERIANTHA (PAN), a bZIP transcription factor, activates AG, a C-class MADS-box protein that regulates floral organ numbers and whorl patterning (Maier et al., 2009). In Elaeis guineensis, the mutants AP3 and PISTILLATA (PI) inhibited male tissues. AG2 has a mixed C/D function gene, and its expression has been observed in ovule primordia and carpel of Arabidopsis and Elaeis guineensis, respectively (Favaro et al., 2003; Adam et al., 2007). FLORAL BINDING PROTEIN 11 (FBP11), a D-class gene, determines the formation of ovules in cucumbers (Favaro et al., 2003). An increase in the C-class gene transcription level arrests the development of sexual organs in monoecious plants, such as Liquidambar styraciflua L. and Rumex acetosa L. (Ainsworth, 2000). B- and C-class genes are regulated at a sex locus by a genetic switch that further controls the development of male or female flowers in Populus trichocarpa (Leseberg et al., 2006). B-class genes PI and AP3 have been identified in the formation of stamen in Jatropha. A- and C-class gene AG and D-class gene SEEDSTICK1 (STKI) have been reported for carpel development and maturation (Hui et al., 2017).
Thus, the ABCDE model helps to understand the floral differentiation in Jatropha.
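The combinatorial logic of the model described above can be written down as a small lookup table. The sketch below is a toy illustration, not an implementation from any cited study: the names `ORGAN_BY_CLASSES`, `organ_identity`, `WILD_TYPE_WHORLS`, and `flower` are hypothetical, E-class activity is treated as a co-factor that does not change the outcome, and the mutual antagonism between A- and C-class genes is not modelled.

```python
# Class combinations and the organ they specify, as stated in the text above.
ORGAN_BY_CLASSES = {
    frozenset("A"): "sepal",
    frozenset("AB"): "petal",
    frozenset("BC"): "stamen",
    frozenset("C"): "carpel",
    frozenset("D"): "ovule",
}

# Assumed class activity in the four floral whorls of a wild-type flower.
WILD_TYPE_WHORLS = [{"A"}, {"A", "B"}, {"B", "C"}, {"C"}]


def organ_identity(active_classes):
    """Map a set of active gene classes to an organ; E is ignored (assumption)."""
    key = frozenset(c.upper() for c in active_classes) - {"E"}
    return ORGAN_BY_CLASSES.get(key, "undetermined")


def flower(whorls, lost=frozenset()):
    """Organ identities per whorl after removing the gene classes in `lost`."""
    return [organ_identity(w - set(lost)) for w in whorls]


print(flower(WILD_TYPE_WHORLS))              # all classes active
print(flower(WILD_TYPE_WHORLS, lost={"B"}))  # B-class function removed
```

In this toy model, removing B-class activity turns the stamen whorl into a carpel whorl, which is one way to see why B-class genes such as PI and AP3 are of interest when the goal is more female flowers.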

# ROLE OF HORMONES IN SEX DETERMINATION

The process of flower development and sex determination is regulated by the interplay of endogenous hormones (auxins, cytokinins, gibberellins, abscisic acid, etc.). Auxin regulates sex determination in Jatropha. IAA treatment raised the female to male ratio from 1:27 to 1:23, and it also increased seed weight 3-fold (Joshi et al., 2011). Auxin biosynthesis and signaling are associated with genes such as ARFs, AUX1, and Transport inhibitor response 1 (TIR1). Transcriptome analysis of Jatropha suggested that AUX1 is responsible for sex determination. The main source of auxin production is through Trp-dependent auxin biosynthesis, which participates in embryo patterning and reproductive organ development (Chen et al., 2017). In this pathway, IAA is produced from indole-3-pyruvic acid by YUCCA (YUC), a flavin-dependent monooxygenase (Stepanova et al., 2008). During stamen primordia formation, auxin is produced locally by YUC1 and YUC4 followed by YUC2 and YUC6 genes at late stages of stamen development (Cheng et al., 2006; Cecchetti et al., 2008). In mature gynoecia, YUC4 and YUC8 genes were expressed in the style, whereas YUC2 was expressed in carpel valve tissues (Martínez-Fernández et al., 2014). Increased expression of ARF 10/16/17/18 leads to abnormalities in females and abortion of organs, resulting in reduced seed set (Huang et al., 2016).

Gibberellic acids also contribute to the development of the stamens in monoecious plants. Exogenous application of GA on the inflorescences of Jatropha resulted in a 2-fold increase in female flowering. However, inflorescence branches were not affected. Hui et al. (2018) reported the altered endogenous CTK (increased) and GA (decreased) ratio due to exogenous GA application, which resulted in an increased proportion of female flowers. However, a higher concentration of GA caused withering of floral buds. Hu et al. (2017) isolated the JcGA2ox6 (Gibberellin oxidase) gene, which reduces the amount of endogenous GA4 (active gibberellin). They overexpressed JcGA2ox6 gene in Jatropha, which led to decreased inflorescence size, decreased male and female flowers, and decreased seed length in transgenic plants. There was a significant decrease in both seed weight and oil content. GA20ox and GA3ox have been observed in other studies to enhance the development of stamen, whereas the exogenous application of GA3 led to a restricted development of pistils, thus enabling the male to expand. GA treatment enhanced the development of stamens in monoecious females, and it resulted in bisexual flowers in monoecious plants. GASA4 protein functions in stamen differentiation. GID1, a positive GA signaling pathway regulator, controls Jatropha's female flowering (Roxrud et al., 2007; Hu et al., 2017). GA deficiency results in male sterility in plants. Therefore, GA allows the stamens to develop without affecting female flowers.

Paclobutrazol foliar application inhibits GA biosynthesis and promotes female flowering by suppressing no related pollen germination (JcNPGR2), male defective gametophyte (JcMGP2/3), duo pollen (JcDUO3), and male sterility protein (JcMS) genes, thus allowing female flowers to develop in Jatropha (Seesangboon et al., 2018).

### TABLE 1 | Key genes involved in the floral transition, sex determination, and reproductive organ development.




Jasmonic acids and brassinosteroids (BRs) are active in floral development together with stamen development, pollen maturation, and male fertility (Park et al., 2002; Ye et al., 2010). In staminate maize flowers, brassinosteroids promoted pistil abortion. AG controls the maturation and late stages of stamen development in Arabidopsis by regulating the biosynthesis of jasmonates (Ito et al., 2007). Reduced JA synthesis in Jatropha led to male abortion and downregulation of the genes DAD1 and LOX2. Arabidopsis, maize, and tomato mutants with suppressed jasmonate synthesis and brassinosteroid signaling resulted in male sterility (Li et al., 2005; Ye et al., 2010). The SPL/NZZ, Aborted Microspores (AMS) and Defective in Tapetal Development and Function 1 (TDF1) genes are regulated by BRs and are critical for anther and pollen development (Ye et al., 2010). Thus, BRs and JAs promote the development of male organs.

Foliar application of ethylene induced femaleness in Jatropha. To synthesize ethylene, 1-aminocyclopropane-1-carboxylic acid oxidase 2 (ACO2) oxidizes ethylene intermediates. Transgenic plants that overexpressed ACO2 were male sterile due to suppressed stamens. Little to no activity of ACO was observed in Arabidopsis, tomato, and tobacco during the development of anthers and pollen (Bartley and Ishida, 2007; Duan et al., 2008; Wang et al., 2010). These experiments have thus shown that ethylene promotes femaleness in plants.

Studies have been conducted to see the effect of foliar cytokinin application on the inflorescences. It has been found that 29.99 percent of the total flowers were females in treated inflorescences as compared to 6.96 percent in control. In treated inflorescence, a 4–5-fold increase in the number of seeds was observed but the fruiting rate, seed weight, and oil content decreased (Pan and Xu, 2011; Pan et al., 2014; Chen et al., 2014).
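The percentages above can be restated as female to male ratios, which makes them easier to compare with the ratios quoted elsewhere in this review. The following is a minimal sketch with a hypothetical helper, `males_per_female`, and it assumes that every flower in an inflorescence is either female or male.

```python
def males_per_female(percent_female: float) -> float:
    """Return x for a female:male ratio of 1:x, given the female share in percent.

    Assumption: all flowers are either female or male.
    """
    if not 0 < percent_female <= 100:
        raise ValueError("percent_female must be in (0, 100]")
    return (100 - percent_female) / percent_female


# Female shares reported in the text for control and cytokinin-treated inflorescences.
for label, pct in (("control", 6.96), ("cytokinin-treated", 29.99)):
    print(f"{label}: female:male is about 1:{males_per_female(pct):.1f}")

# Fold change in the female share of all flowers.
print(f"female share rises about {29.99 / 6.96:.1f}-fold")
```

On these figures the ratio moves from roughly 1:13 in the control to roughly 1:2.3 after treatment, and the female share of flowers rises a little over 4-fold.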

Transcriptomic analysis of Jatropha inflorescences treated with cytokinin revealed that genes involved in the initiation of flowers, such as GI, SOC1, and LFY, and the CYP89A5 gene involved in the development of inflorescences were induced, whereas the AP1, AP2, PI, AG, and SEP1-3 genes were downregulated (Chen et al., 2014; Pan et al., 2014). These changes allowed more time for inflorescence meristems to generate floral primordia. A marked increase in the number of flowers was noted due to CUC1 upregulation. Application of BA (6-Benzylaminopurine) increased the rate of cell division in inflorescence meristem due to the upregulation of Cyclin-3-1 (CycD3;1/2) and Cyclin-dependent protein kinase 247 (CycA3;2) genes. Li et al. (2010) observed an increase in the number of flowers with an enlarged inflorescence and floral meristem in transgenic Arabidopsis overexpressing the cytokinin (CK) biosynthetic gene AtIPT4. Fewer flowers were observed at each inflorescence due to the overexpression of the CKX gene (Werner and Schmülling, 2009). Loss-of-function mutation of the rice LONELY GUY (LOG) gene, which encodes a CK-activating enzyme, led to a significant decrease in the number of floral organs (Kurakawa et al., 2007). Chen et al. (2014) reported that BA treatment decreased the expression of TS2, which suppresses carpels in maize, leading to increased female to male flower ratios in Jatropha (Acosta et al., 2009).

# CHALLENGES

Genomic studies on flowering of Jatropha and phenotypic changes following the application of PGRs (Plant Growth Regulators) showed an opportunity to increase female flowering, which is one of the routes to increasing seed yields. There are several challenges to increasing the number of female flowers: (i) manual hormone application to each inflorescence is laborious; (ii) hormone application is not economical; (iii) a hormone concentration optimized for one environmental condition may not show the same efficiency under different environmental conditions; (iv) flowering and fruit maturity are not synchronized; and (v) the fruiting rate varies. Genetic modification of flowering genes or overexpression of genes involved in suppression of male flowers may enable us to overcome these challenges by allowing more female flowers to develop. Other possibilities include enhancing cytokinin synthesis by overexpressing genes associated with cytokinin biosynthesis or suppressing cytokinin breakdown by gene silencing or mutagenesis. Additionally, further research could be carried out on the effect of central carbon flow on the fruiting rate.

### CONCLUSION AND PERSPECTIVE

The female to male floral ratio plays a significant role in deciding Jatropha's seed yield. Cytokinin application showed promising results in enhancing the ratio between female and male flowers. Promising approaches to increase the number of female flowers may be to induce the transitioning of male type inflorescences to the middle/intermediate type or to increase male flower abortion rates to allow female flowers to develop. Therefore, genes involved in female flowering or the abortion of male flowers could be targeted for the purpose of increasing female flowers in Jatropha.

# AUTHOR CONTRIBUTIONS

MG and JS conceived and designed the review manuscript, wrote, read, and approved the manuscript. JS contributed materials or analytical tools and supervised the work.

# REFERENCES


### ACKNOWLEDGMENTS

We are thankful to the Department of Biotechnology and Bioinformatics, Jaypee University of Information Technology, Solan, India, for providing facilities.


**Conflict of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2020 Gangwar and Shankar. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Emerging Technologies to Enable Sustainable Controlled Environment Agriculture in the Extreme Environments of Middle East-North Africa Coastal Regions

Ryan M. Lefers<sup>1,2,3,4</sup>, Mark Tester<sup>1,3</sup> and Kyle J. Lauersen<sup>1</sup>\*

<sup>1</sup> Biological and Environmental Science and Engineering Division, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia, <sup>2</sup> Water Desalination and Reuse Center, Biological and Environmental Science and Engineering Division, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia, <sup>3</sup> Center for Desert Agriculture, Biological and Environmental Science and Engineering Division, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia, <sup>4</sup> Texas AgriLife Research and Extension Center at Dallas, Texas A&M University, Dallas, TX, United States

### Edited by:

Domenico De Martinis, Italian National Agency for New Technologies, Energy and Sustainable Economic Development (ENEA), Italy

### Reviewed by:

Maciej Maselko, Macquarie University, Australia; Mohammad Pourkheirandish, The University of Melbourne, Australia; Raquel Lia Chan, CONICET Santa Fe, Argentina

> \*Correspondence: Kyle J. Lauersen kyle.lauersen@kaust.edu.sa

### Specialty section:

This article was submitted to Plant Biotechnology, a section of the journal Frontiers in Plant Science

Received: 04 February 2020; Accepted: 19 May 2020; Published: 02 July 2020

### Citation:

Lefers RM, Tester M and Lauersen KJ (2020) Emerging Technologies to Enable Sustainable Controlled Environment Agriculture in the Extreme Environments of Middle East-North Africa Coastal Regions. Front. Plant Sci. 11:801. doi: 10.3389/fpls.2020.00801

Despite global shifts in attitudes toward sustainability and increasing awareness of human impact on the environment, projected population growth and climate change require technological adaptations to ensure food and resource security at a global scale. Although desert areas have long been proposed as ideal sites for solar electricity generation, only recently have efforts shifted toward development of specialized and regionally focused agriculture in these extreme environments. In coastal regions of the Middle East and North Africa (MENA), the most abundant resources are consistent intense sunlight and saline sea water. MENA coastal regions hold incredible untapped potential for agriculture driven by the combination of key emerging technologies in future greenhouse concepts: transparent infrared collecting solar panels and low energy salt water cooling. These technologies can be combined to create greenhouses that drive regionally relevant agriculture in this extreme environment, especially when the target crops are salt-tolerant plants and algal biomass. Future controlled environment agriculture concepts will not compete for municipal fresh water and can be readily integrated into local human/livestock/fisheries food chains. With strategic technological implementation, marginal lands in these environments could participate in production of biomass, sustainable energy generation, and the circular carbon economy. The goal of this perspective is to reframe the perception of these environments from extreme to having incredible untapped development potential.

Keywords: infrared solar, evaporative desiccant cooling, sustainability, combinatorial farming, algal biotechnology, salt water agriculture

# ARTICLE

Global changes in mean surface temperatures are driving increased extreme environmental events which are beyond the tolerance of traditional agricultural practices, creating concerns of food insecurity (Hansen et al., 2012; Rhines and Huybers, 2013; Stone et al., 2013; Sutton et al., 2015; Bathiany et al., 2018; Spinoni et al., 2018). A logical step toward increasing agricultural yield predictability is to move toward contained agriculture concepts like greenhouses. Glasshouses, or other controlled environment agriculture (CEA) structures, are traditionally applied to extend growth and cultivation periods in cooler climates, reducing frost damage to crops, or for specialty/ornamental plant cultivation. CEA enables expansion of horticulture into non-traditional environments and marginal lands, benefits that can contribute to increasing output and food security. The application of CEA in hot or desert environments has been less common than in temperate climates or higher latitudes, owing to the energy required for cooling these structures. The Middle East and North Africa (MENA) regions have some of the most extreme desert climates in the world, with high average annual temperatures and very low precipitation (Beck et al., 2018). However, in coastal regions, these environments are rich in two key resources: consistent solar radiation and sea water. Development of technological solutions which work with these resources in this coastal climate to drive sustainable agricultural practices can help the region contribute meaningfully to the global bioeconomy and local food security.

The consistent solar radiation and strong daily winds of the MENA region can readily be used for sustainable electricity generation by traditional photovoltaics and wind turbines. The global horizontal irradiance (GHI) over the Arabian Peninsula shows significant seasonal variations with an annual mean ranging between 0.75 and 1.06 kW m<sup>−2</sup>, with higher values over the northwestern region (Dasari et al., 2019). The wind resources over Saudi Arabia exhibit strong spatial variations, with high annual mean wind power density at 80 m height over the northern Red Sea (>0.8 kW m<sup>−2</sup>), indicating the capacity for these technologies to support local CEA efforts (Langodan et al., 2016). However, for CEA systems to be practical in hot environments, technological solutions are required to minimize operational energy needs, especially for cooling. The combination of low-energy cooling systems with heat-reducing energy-generating technologies will improve overall CEA operational efficiencies and enable their implementation in MENA coastal regions. These systems become even more attractive when mixed salt-tolerant plant species and algal biomass cultivation are combined in high density cultivation concepts to reduce freshwater requirements. Intensive novel crop combinations in future CEA concepts may provide practical regional solutions for food generation while contributing to carbon capture and cycling. This work seeks to shift the perception of coastal MENA regions from extreme and inhospitable environments to locations with ample resources in the form of sunlight and sea water which could be developed into global agricultural powerhouses. These two resources can be combined to drive a regional agricultural revolution when appropriate technologies are implemented with strategic species selection.

Traditional agriculture in MENA regions includes cultivation of date palms (Erskine et al., 2004) and some plant species that are tolerant to the local environment, such as Salvadora (miswak). Recent developments have seen increased efforts toward aquaculture farming (e.g., NAQUA farms, KSA)<sup>1</sup>. In addition, there are increasing outdoor horticultural efforts in various regions, with Saudi Arabia and Oman being the largest participants (Erskine et al., 2004; Noorka and Heslop-Harrison, 2015). Agriculture in these environments requires large inputs of freshwater, of which small amounts come from desalination and most is extracted from aquifers, the latter being largely unsustainable (Gleeson et al., 2012). Indeed, 80% of freshwater resources in the Gulf region are used for agricultural practices (Erskine et al., 2004; Noorka and Heslop-Harrison, 2015). Hydroponic or CEA systems like those practiced in greenhouses use a fraction of the freshwater and waste less fertilizer than field agriculture. Furthermore, there is a sound economic and environmental case for CEA in this region for a large number of vegetables, and even some fresh fruits. We estimate that up to 70% (on a fresh weight basis) of fresh fruits and vegetables could be economically grown locally, primarily facilitated by use of CEA, with an overall lower environmental footprint than from the use of imported food. The contribution of this practice to local food security cannot be overstated.

Two key technologies can enable low-energy greenhouses in hot coastal environments: efficient organic transparent infrared solar panels and liquid desiccant-based cooling (**Figure 1**). Recently reported infrared organic solar panels have shown key efficiency advances in capturing infrared energy to generate electricity (Song et al., 2018). Using a blend of 4% 1-chloronaphthalene as a solvent and the narrow-band-gap non-fullerene acceptor IEICO-4F, a thin material which absorbs maximally at 900 nm was shown to generate 26.8 mA cm<sup>−2</sup> with a power conversion efficiency of 12.8% (Song et al., 2018). The narrow bandgap of IEICO-4F allows penetration of photosynthetic wavelengths of light (400–700 nm), unlike commercial silicon technology, and the material is suitable for roll-to-roll production processes. This technology will allow the surfaces of greenhouses and windows to generate electricity, while simultaneously serving as a transparent enclosure to enable plant and algal growth. Capture of infrared energy will also reduce heating effects of sunlight within the CEA structure, thereby reducing the energy required for cooling. Surfaces in MENA regions are also prone to dust buildup, and advances in mechanical automated dusting devices can now be implemented to ensure efficient operation of photovoltaics and glasshouse surfaces<sup>2</sup>.
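To give a sense of scale for the 12.8% efficiency quoted above, the following minimal sketch converts an efficiency and an irradiance into electrical output per unit area. It is an illustration under stated assumptions, not a performance prediction: the 1000 W m<sup>−2</sup> reference irradiance is the usual standard test condition rather than a figure from the article, the 1000 m<sup>2</sup> roof area is hypothetical, and a laboratory cell efficiency is unlikely to carry over unchanged to a semi-transparent greenhouse panel in the field. The function names are invented for this example.

```python
# Illustrative estimate only. Assumptions (not from the article): a reference
# irradiance of 1000 W m^-2 and a hypothetical 1000 m^2 glazed roof area.

STANDARD_IRRADIANCE_W_M2 = 1000.0  # assumed standard test irradiance


def electrical_power_density(pce: float,
                             irradiance_w_m2: float = STANDARD_IRRADIANCE_W_M2) -> float:
    """Return electrical output in W per m^2 for a fractional conversion efficiency."""
    if not 0.0 <= pce <= 1.0:
        raise ValueError("pce must be a fraction between 0 and 1")
    if irradiance_w_m2 < 0.0:
        raise ValueError("irradiance must be non-negative")
    return pce * irradiance_w_m2


def roof_power_kw(pce: float, area_m2: float,
                  irradiance_w_m2: float = STANDARD_IRRADIANCE_W_M2) -> float:
    """Return electrical output in kW for a glazed area covered with panels."""
    return electrical_power_density(pce, irradiance_w_m2) * area_m2 / 1000.0


if __name__ == "__main__":
    pce = 0.128  # efficiency reported by Song et al. (2018)
    print(f"Power density: {electrical_power_density(pce):.1f} W per m^2")
    print(f"Hypothetical 1000 m^2 roof: {roof_power_kw(pce, 1000.0):.1f} kW")
```

Treat the result as an upper bound for the assumed irradiance, since soiling, panel orientation, and temperature would all lower real output.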

<sup>1</sup>http://www.naqua.com.sa

<sup>2</sup>http://www.nomaddesertsolar.com/

FIGURE 1 | A low-energy glasshouse concept for future agriculture in coastal MENA regions. Future CEA greenhouses will combine infrared solar energy capture and desiccant cooling technologies to create stable contained environments for horticulture in extreme desert coastal environments. Infrared harvesting transparent solar panels allow photosynthetic active radiation (visible spectrum) to penetrate transparent glass surfaces to enable photosynthesis while simultaneously reducing the heating effect. Passive cooling can be achieved by passing hot external humid air through highly saline liquid desiccant solutions in porous matrices which absorb air moisture, releasing dry, cooler air due to the vapor pressure difference. Coupling these technologies with high density hydroponic cultivation concepts and combined algae photobioreactors (green tubes) will maximize biomass productivity in these systems using seawater as cultivation medium. Macroalgae farming may also be an attractive addition to these concepts and can be coupled in managed pools on land or in the surrounding sea for nutrient removal and intensified biomass production. Plants which are naturally tolerant or those bred/engineered for salinity tolerance (pictured) can be cultivated with locally available sea water resources to minimize fresh-water requirements. Sustainable energy generation by traditional photovoltaics and wind turbines can be combined to support the energy requirements of these facilities.

Future CEA designs can be even more energy efficient when combined with low-energy air cooling technologies, especially liquid desiccant (Bettahalli et al., 2016; Lefers et al., 2016) or evaporative cooling (Chua et al., 1999; Lefers et al., 2018b; Shahzad et al., 2019). Liquid desiccant cooling relies on highly saline solutions that capture moisture from hot humid air and remove latent heat as moisture is absorbed by the desiccant. Moisture removal from humid air results in a pronounced cooling effect in the air passed through these structures as latent heat is removed (Lefers et al., 2016). Liquid desiccant systems can be further combined with sea water evaporative cooling systems to substitute for the latent cooling achieved by the liquid desiccant system, depending on crop needs. Although sites will not be independent from municipal freshwater use, vacuum regeneration of the liquid desiccant can be applied to yield additional freshwater as a by-product, which can support horticulture within the glasshouse (Lefers et al., 2018a). It is likely that future CEA concepts will be coupled to desalination efforts near local municipalities to supply the freshwater needed for human as well as some plant use. Evaporative cooling technologies utilizing sea water could also provide a lower energy solution and can be coupled to on-site desalination plants, such as reverse osmosis and adsorption desalination, to provide more abundant freshwater resources without increasing energy demands, while reducing the amount of brine from desalination processes (Ng et al., 2013). The combination of infrared solar cells on greenhouse surfaces with low-energy cooling and desalination technologies provides the preconditions for design of energy efficient and sustainable structures to enable CEA concepts in hot environments. In addition to water use and temperature control, other challenges to address for CEA in MENA coastal regions include brine management, wastewater reuse, automation, and sustainable sourcing of plant nutrients. For widespread adoption of innovations, demonstration of cost-effectiveness at scale is essential. Our preliminary studies suggest that the modest increases in CapEx for such CEA systems are partially offset by reduced OpEx and reduced transportation. Overall, the extra costs incurred by use of salt water are very modest and do not have a significant effect on the overall business case for such greenhouses.

In addition to structural and technological considerations, selection of appropriate saline tolerant or drought resistant plant species is another key contributing factor in the success of future CEA in MENA coastal regions. New initiatives in the Gulf region are underway to promote desert agriculture using salt tolerant plant species such as Salicornia, the group of plants collectively known as sea purslane, Chenopodium quinoa (quinoa), miswak, Sesbania, Triticale, and Carthamus (safflower)<sup>3</sup>. The efficiency of water use could also be improved if appropriate saline-tolerant crops are cultivated in full or partial sea water. Coastal regions are especially valuable in this context as beach wells can provide abundant naturally filtered, thermally consistent, saline water. Combination of freshwater plant species with saline-tolerant crops and algae in combined high-density cultivation concepts has the potential to generate robust biomass production processes in smaller land areas than traditional field agriculture while using less overall freshwater resources. Plants such as grasses or quinoa can be cultivated in marginal lands and arid outdoor environments, and greenhouses are not necessary for their enhanced agricultural production. However, wild, bred or engineered food crops, like recently described saline tolerant tomato varieties (Pailles et al., 2020) and edible greens, will be well suited to high density cultivation in contained greenhouses in MENA coastal regions. The role of genetically modified (GM) crops in this region is a large topic which could be its own article; permissions for various crops have been granted in Egypt, Turkey, Sudan, Iran, and Pakistan<sup>4</sup>. However, the use of GM crops is completely prohibited at the time of writing in countries like Saudi Arabia.
CEA would allow some degree of containment and minimize environmental risks/concerns over GM crops; however, improved hardiness would be less important in these controlled environments. GM may be interesting for nutritional improvements of crops grown in CEA, such as increased anthocyanin or phenylpropanoid contents of tomatoes (Butelli et al., 2008; Zhang et al., 2015). Modified traits may be more valuable for microalgal cultivation, where increases in lipid content (Ajjawi et al., 2017) or novel traits (Lauersen, 2019; Fabris et al., 2020) are highly desired and add value to the biomass. Ongoing difficulties with consumer acceptance of GM organisms and complex country-specific regulatory constraints make widespread deployment of transgenics unlikely for the foreseeable future, so we do not discuss these issues further here. We also limit the discussion of field agriculture as our work focuses on future CEA. Additional efforts in improving rhizobial interactions and desert-probiotics for crops grown in harsh environments are also steadily developing, with promising results for encouraging heat tolerance and desiccation resistance under outdoor conditions (Bang et al., 2018; Daur et al., 2018; Eida et al., 2018). These efforts can serve as a roadmap for encouraging low water use in field agriculture for some plants that are grown outdoors in harsh environments (Saad et al., 2020). Neo-domestication of other thermo and saline tolerant plant species as well as selective breeding/engineering could potentially increase productivities of these cultivation concepts, and work is accelerating in this field (Lemmon et al., 2018; Li et al., 2018; Dawson et al., 2019; Fernie and Yan, 2019; Shane-Ali Zaidi et al., 2019).

<sup>3</sup>https://www.biosaline.org

<sup>4</sup>http://www.isaaa.org/gmapprovaldatabase/

In future salt-water driven CEA concepts, it is likely that salt-tolerant and fresh-water cultivars will be alternated to provide balanced supplies of both types of plants for food/feed and minimize fresh water demands. Potential exists here for the introduction of non-traditional agriculture in the form of combinatorial cultivation concepts that integrate algal photobioreactors with higher plant hydroponics. Algae are rapidly growing photosynthetic organisms that can add increased productivity to the CEA system using full sea water as culture medium. Cultivation of algae is practiced both indoors and outdoors, with greenhouses providing improved environmental control as with higher plants (Posten, 2009). Integration of algal photobioreactors into future high-density CEA concepts could provide continual biomass generation for a range of applications and enhance areal carbon turnover rates (Lehr and Posten, 2009; Posten, 2009). Algal biomass can be used in aquaculture, animal feed, bioplastics and cosmetics, or to generate environmentally friendly replacements for plant-based oils (Radmer, 1996; Priyadarshani and Rath, 2012; Gangl et al., 2015; Chen et al., 2019). CEAs that include algal cultivation could act as sustainable sources for the natural inputs to greater bio-based industries, as algal cultivation can provide consistent biomass production at high turnover rates. Some reports even indicate that co-cultivation of green algae like Chlorella and Scenedesmus together with the roots of higher plants in hydroponic systems benefits both organisms, which share growth promoting factors (Zhang et al., 2017; Barone et al., 2019). No work has yet been performed on combinatorial agriculture with algae and salt-water tolerant plants in controlled settings, which may be a new avenue for biofortification in greenhouse concepts. Industrial scale cultivation of marine algal strains such as Nannochloropsis, Dunaliella, and Phaeodactylum is already practiced in many locations globally, both in CEA and outdoor cultivation (Laurens, 2017). In the MENA region there is potential to further develop local strains of interest; for example, a recently described halotolerant Chloroidium sp. 
was isolated in the United Arab Emirates that has a triacylglycerol profile similar to that of palm oil (Nelson et al., 2017). As a sustainable alternative, its intensified cultivation could reduce the global impact of palm agriculture. Macroalgae may also be integrated as auxiliary value additions to coastal CEA concepts as they can be cultivated near shore or in pond units. Promising small-scale productivities have been reported for the marine macroalgae Asparagopsis armata and Ulva rigida, which, given appropriate water flow rates, can serve as effective biofilters to capture excess nutrients (nitrogen and phosphorus) in the form of valuable biomass (Mata et al., 2010). Very little work has been done to characterize macroalgae from the MENA region, with some studies of broad population dynamics now emerging (Geraldi et al., 2019; Ortega et al., 2019). It is likely that local bioprospecting will yield further species of interest which are adapted to regional climate conditions and can contribute to enhancing the productivities of coastal CEA hubs.

The circular carbon economy (CCE) is a concept which seeks to capture and capitalize on waste carbon which is otherwise lost to the atmosphere, usually in the form of carbon dioxide (CO<sub>2</sub>), and reuse it in a cyclic fashion to minimize the environmental impacts of human activities (Stahel, 2016). This practice is of special importance to the MENA region as global economic and social trends look to the post-oil economies of the future. Plants and algae conduct light driven photosynthesis to generate cellular energy, and through the reactions of the Calvin-Benson-Bassham cycle are able to fix CO<sub>2</sub> into organic sugars for growth. Photosynthesis-based fixation and cycling of CO<sub>2</sub> to biomass is one part of the greater spectrum of currently developing carbon capture and reuse technologies which are of interest for the development of the CCE. Each unit mass of plant or algal biomass represents approximately 1.83 mass units of fixed CO<sub>2</sub> (Chisti, 2007). This ratio improves if the biomass is lipid or carbohydrate rich and offers a direct biological route from waste carbon to valuable bio-products. Plant and algal biomass, therefore, are incredible feedstocks for sustainable CCE practices as they represent carbon captured from the atmosphere that can be reused as physical commodities. Intensified CEA concepts which emphasize high biomass productivity in marginal lands will contribute significantly to CCE practices and provide a sustainable source of biological materials for various industries.
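The 1.83 ratio cited above can be reproduced from simple stoichiometry if dry biomass is assumed to be about 50% carbon by mass, since each unit mass of carbon corresponds to 44.009/12.011 mass units of CO<sub>2</sub>. The short sketch below illustrates that arithmetic. The 50% carbon fraction is an assumption made here for illustration, not a value stated in the article, and the function name is hypothetical.

```python
# Sketch only: the 50% carbon fraction is an assumed typical value for dry
# biomass, not a figure stated in the article.

M_CO2 = 44.009  # g/mol, molar mass of carbon dioxide
M_C = 12.011    # g/mol, molar mass of carbon


def co2_fixed_per_biomass(carbon_fraction: float) -> float:
    """Return the mass of CO2 fixed per unit mass of dry biomass."""
    if not 0.0 < carbon_fraction <= 1.0:
        raise ValueError("carbon_fraction must be in the interval (0, 1]")
    # Every gram of biomass carbon originated from M_CO2 / M_C grams of CO2.
    return carbon_fraction * M_CO2 / M_C


if __name__ == "__main__":
    examples = [
        ("assumed typical dry biomass", 0.50),
        ("hypothetical carbon-richer biomass", 0.60),
    ]
    for label, fraction in examples:
        ratio = co2_fixed_per_biomass(fraction)
        print(f"{label}: {ratio:.2f} kg CO2 per kg dry biomass")
```

Under this simple model the ratio scales linearly with the carbon content of the biomass, so any measured carbon fraction can be substituted for the assumed one.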

### CONCLUSION AND OUTLOOK

Although greenhouses are not a new concept, emerging technologies now enable energy efficient and profitable implementation of CEA in extreme desert environments. Low-energy cooling and enhanced energy generation/temperature reduction by transparent infrared harvesting solar cells can be combined to create energy efficient greenhouses primed for future agriculture concepts on marginal coastal lands of the MENA region. The combination of high density hydroponic saline horticulture and algal cultivation can minimize impacts on freshwater resources and maximize carbon cycling. The increased efficiency of these greenhouses can improve agricultural efforts in the MENA region, while contributing to food security and encouraging development of the CCE. It remains to be seen whether regulatory control and growing demand for locally sourced crops will enable MENA coastal regions to become hubs of future innovative agricultural practices.

# AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

# FUNDING

The authors would like to acknowledge King Abdullah University of Science and Technology (KAUST) for financial support.

# ACKNOWLEDGMENTS

The authors would like to acknowledge Profs. Ibrahim Hoteit, Derya Baran, and Simon Krattinger for useful discussions and refining our manuscript. **Figure 1** was created by Ivan Gromicho, Scientific Illustrator at King Abdullah University of Science and Technology (KAUST).

# REFERENCES


**Conflict of Interest:** MT and RL are co-founders of Red Sea Farms.

The remaining author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2020 Lefers, Tester and Lauersen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Phosphorylation of ADP-Glucose Pyrophosphorylase During Wheat Seeds Development

Danisa M. L. Ferrero<sup>1</sup>, Claudia V. Piattoni<sup>1†</sup>, Matías D. Asencion Diez<sup>1</sup>, Bruno E. Rojas<sup>1</sup>, Matías D. Hartman<sup>1†</sup>, Miguel A. Ballicora<sup>2</sup> and Alberto A. Iglesias<sup>1</sup>\*

<sup>1</sup> Laboratorio de Enzimología Molecular, Instituto de Agrobiotecnología del Litoral (UNL-CONICET) & FBCB, Santa Fe, Argentina, <sup>2</sup> Department of Chemistry and Biochemistry, Loyola University Chicago, Chicago, IL, United States

### Edited by:

Briardo Llorente, Macquarie University, Australia

### Reviewed by:

L. Curtis Hannah, University of Florida, United States; Bahaji Nazih Abdellatif, Superior Council of Scientific Investigations, Spain

> \*Correspondence: Alberto A. Iglesias iglesias@fbcb.unl.edu.ar

### † Present addresses:

Claudia V. Piattoni, Researcher and Innovation, Institut Pasteur de Montevideo, Montevideo, Uruguay; Matías D. Hartman, Molecular Genetics of Ageing, Metabolic and Genetic Regulation of Ageing, Max Planck Institute Cologne, Germany

### Specialty section:

This article was submitted to Crop and Product Physiology, a section of the journal Frontiers in Plant Science

Received: 21 November 2019; Accepted: 26 June 2020; Published: 10 July 2020

### Citation:

Ferrero DML, Piattoni CV, Asencion Diez MD, Rojas BE, Hartman MD, Ballicora MA and Iglesias AA (2020) Phosphorylation of ADP-Glucose Pyrophosphorylase During Wheat Seeds Development. Front. Plant Sci. 11:1058. doi: 10.3389/fpls.2020.01058

Starch is the dominant reserve polysaccharide accumulated in the seed of grasses (like wheat). It is the most common carbohydrate in the human diet and a material applied to the bioplastics and biofuels industry. Hence, the complete understanding of starch metabolism is critical to design rational strategies to improve its allocation in plant reserve tissues. ADP-glucose pyrophosphorylase (ADP-Glc PPase) catalyzes the key (regulated) step in the starch synthesis pathway. The enzyme comprises a small (S) and a large (L) subunit forming an S<sub>2</sub>L<sub>2</sub> heterotetramer, which is allosterically regulated by orthophosphate, fructose-6P, and 3P-glycerate. ADP-Glc PPase was found in a phosphorylated state in extracts from wheat seeds. The amount of the phosphorylated protein increased along with the development of the seed and correlated with relative increases of the enzyme activity and starch content. Conversely, this post-translational modification was absent in seeds from Ricinus communis. In vitro, the recombinant ADP-Glc PPase from wheat endosperm was phosphorylated by wheat seed extracts as well as by recombinant Ca<sup>2+</sup>-dependent plant protein kinases. Further analysis showed that the preferential phosphorylation takes place on the L subunit. Results suggest that the ADP-Glc PPase is a phosphorylation target in seeds from grasses but not from oleaginous plants. Accompanying seed maturation and starch accumulation, a combined regulation of ADP-Glc PPase by metabolites and phosphorylation may provide an enzyme with stable levels of activity. Such concerted modulation would drive carbon skeletons to the synthesis of starch for its long-term storage, which later supports seed germination.

Keywords: crop grasses, glucan accumulation, starch biosynthesis, post-translational modification, enzyme regulation

# INTRODUCTION

Starch is a major product of photosynthesis performed by vascular plants and constitutes the foremost storage of carbon and energy in grasses (Jeon et al., 2010; Pandey et al., 2012; Tuncel and Okita, 2013; MacNeill et al., 2017; Goren et al., 2018). Chemically, starch is a mix of two homopolysaccharides, amylose and amylopectin, which are mainly composed of glucose units linked by α-1,4-bonds. Amylose is essentially a linear α-1,4-glucan with scarce branches of α-1,6-bonds, while amylopectin contains a high number of α-1,6-linked branch points (Pandey et al., 2012; MacNeill et al., 2017; Goren et al., 2018). Cereals (wheat, maize, barley, and rice being the most relevant in production) store starch in seed endosperm, and they supply more than half of the caloric demands of the world population (Tuncel and Okita, 2013; Goren et al., 2018). Enhanced biosynthesis of the polysaccharide greatly influences the grain yield of cereals having starch as the principal reserve compound (Jeon et al., 2010; Tuncel and Okita, 2013; MacNeill et al., 2017; Goren et al., 2018; Zi et al., 2018). Improvement (in quantity and quality) in the production of key harvestable grains is a challenge to be solved in the coming decades. Indeed, a critical demographic expansion is projected by the mid-century, together with increasing industrial requirements of feedstock for biofuels, bioplastics, and bioadhesives to cope with climate change (Morell and Myers, 2005; Godfray et al., 2010; Tuncel and Okita, 2013; Iglesias, 2015; Goren et al., 2018). In this scenario, the in-depth understanding of the process of starch biosynthesis is relevant to better design strategies to increase yields in plants of agronomic interest.

Starch accumulates in plastids of both photosynthetic and non-photosynthetic plant cells (Jeon et al., 2010; MacNeill et al., 2017; Goren et al., 2018). The linear α-1,4-glucan is elongated by the starch synthases, which use ADP-glucose (ADP-Glc) as the glycosyl donor molecule. Synthesis of this sugar nucleotide takes place from glucose-1-phosphate (Glc1P) and ATP in a reaction catalyzed by ADP-Glc pyrophosphorylase (EC 2.7.7.27, ADP-Glc PPase) (Ballicora et al., 2003; Ballicora et al., 2004). The metabolic route continues with the action of enzymes implicated in branching, debranching, phosphorylation, and de-phosphorylation of the polymer under formation (Wilkens et al., 2018). Most of these enzymes occur as isoforms that differ in specificity and in their ability to interact with different partners, forming multi-protein complexes of functional relevance for the production of the polysaccharide (Crofts et al., 2017; Wilkens et al., 2018). The step producing ADP-Glc is rate-limiting in the starch biosynthetic pathway (Kavakli et al., 2001b; Ballicora et al., 2003; Ballicora et al., 2004; Tuncel and Okita, 2013). ADP-Glc PPase is present in bacteria (where it is involved in glycogen synthesis) and green plants. The enzyme from different sources is allosterically regulated by metabolites that are critical intermediates of the central carbon and energy metabolism operating in the respective organism (Ballicora et al., 2003; Ballicora et al., 2004). ADP-Glc PPases from cyanobacteria (Iglesias et al., 1991), green algae (Iglesias et al., 1994), and higher plants (Ballicora et al., 2004) have 3P-glycerate (3PGA) and inorganic orthophosphate (Pi) as the principal allosteric activator and inhibitor, respectively, with some enzyme promiscuity toward being activated by hexose-Ps (Kuhn et al., 2013).

The ADP-Glc PPase present in green algae and higher plants is a heterotetramer (S2L2), composed of small (S, 50-53 kDa) and large (L, 54-60 kDa) subunits (Ballicora et al., 2003; Ballicora et al., 2004). Subunits S and L are homologous proteins, where the L polypeptide has emerged in different species via gene duplication followed by subfunctionalization (Ballicora et al., 2005; Georgelis et al., 2008; Georgelis et al., 2009; Kuhn et al., 2009; Ferrero et al., 2018; Figueroa et al., 2018). In potato tuber, the S subunit retained the catalytic function, whereas the L subunit specialized in modulating the regulation of the former (Ballicora et al., 2005). Even so, alternatives exist where the L subunit is also catalytic, or where it occurs as several isoforms within the same organism (Crevillén et al., 2003; Ventriglia et al., 2008; Kuhn et al., 2009). The interaction between both subunits determines the enzyme activity and regulatory responses (Kavakli et al., 2001a; Baris et al., 2009). As demonstrated in previous studies (Crevillén et al., 2003; Ferrero et al., 2018), the L subunit confers the sensitivity to allosteric regulators exhibited by heterotetrameric ADP-Glc PPases from Arabidopsis and wheat. Except for the enzyme from monocot (such as wheat) endosperm, the ADP-Glc PPase S subunit from leaves and other plant tissues has an N-terminal cysteine residue that is critical for redox regulation. This regulation involves the formation of a disulfide bridge between the S subunits in the heterotetramer, mediated by the thioredoxin system (Ballicora et al., 1999; Ballicora et al., 2000; Tiessen et al., 2002). In addition, as reviewed previously (Ballicora et al., 2004), ADP-Glc PPases found in cereal endosperm have distinct regulatory properties. In the barley and maize forms, 3PGA modifies the relative affinity for substrates (Plaxton and Preiss, 1987; Kleczkowski et al., 1993). Also, in the wheat (Triticum aestivum) endosperm enzyme, neither 3PGA nor fructose-6P (Fru6P) modifies the Vmax, but they increase the relative affinity for Glc1P by 2-fold (Gómez-Casati and Iglesias, 2002; Ferrero et al., 2018). On the other hand, 3PGA and Fru6P revert the inhibition by Pi (Gómez-Casati and Iglesias, 2002).

A substantial body of experimental evidence supports the modulation of starch biosynthesis by the combined action of post-translational mechanisms, including redox modification, protein complex formation, and protein phosphorylation (Lohrig et al., 2009; Kötting et al., 2010; Geigenberger, 2011; Ma et al., 2014; Tetlow et al., 2015; Goren et al., 2018). In this context, proteomic information obtained in the last decade draws attention to ADP-Glc PPase as a putative target of protein kinases (Lohrig et al., 2009; Kötting et al., 2010; Ma et al., 2014). Specifically, a proteomic analysis of maize endosperm identified the phosphorylation of the small subunit of the enzyme (Yu et al., 2019). However, all these predictive results lack functional and developmental evidence of the tangible presence of the enzyme in a phosphorylated state in planta. Such evidence is critical for a complete understanding of the factors affecting plant productivity, and its absence limits the design of better strategies for its improvement. Herein, we report the in vivo phosphorylation of ADP-Glc PPase associated with the development of wheat seeds. This modification was further analyzed by in vitro studies working with recombinant enzymes, specifically wheat endosperm ADP-Glc PPase, as well as Ca2+-dependent SOS2 and CDPK plant protein kinases. Results suggest that phosphorylation of the enzyme involved in the limiting step of starch build-up would be functionally relevant for the yield of grains in grass crops.

# MATERIALS AND METHODS

### Chemicals

ATP, Fru6P, 3PGA, and Glc1P were from Sigma Aldrich (St. Louis, MO, USA). All other reagents were of the highest quality available.

### Seeds Harvest

Seeds were harvested as described previously (Piattoni et al., 2017). Briefly, Triticum aestivum L. cv. Baguette 11 samples were collected at 3, 6, 10, 14, 17, and 27 days post-anthesis (DPA), and spikes were frozen immediately in liquid nitrogen. This sampling was based on reports indicating that the complete growth of wheat seeds is reached at 45 DPA, establishing the following respective phases of development: cell proliferation 0–10 DPA, accumulation of reserves 11–30 DPA, and seed maturation and desiccation 30–45 DPA (http://bio-gromit.bio.bris.ac.uk/cerealgenomics/WheatBP/Documents/DOC\_WheatBP.php). Wheat samples were grains taken from the central part of the frozen spike (between the fifth and tenth spike) and stored at -80°C until analysis. Castor (Ricinus communis) seeds were collected at 5, 10, 20, 25, 32, and 40 days post-pollination (DPP). The sampling criterion considered that castor oil seed development completes in ~60 days (Canvin, 1963), and the seeds were classified into six groups according to the morphology exhaustively described in (Greenwood and Bewley, 1982). Sampled seeds were dissected from the capsule, frozen immediately in liquid nitrogen, and stored at -80°C until analysis.
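The developmental phases quoted above for wheat can be expressed as a simple lookup, which is convenient when annotating samples by DPA. The following is only an illustrative sketch: the function name is hypothetical, and because the text lists 30 DPA in both the accumulation and the maturation phases, the sketch assumes that day 30 belongs to the accumulation phase.

```python
def wheat_seed_phase(dpa: float) -> str:
    """Return the developmental phase of a wheat seed sampled at `dpa`
    days post-anthesis, using the phase boundaries quoted in the text.

    Assumption (not stated in the source): day 30, which the text lists
    in two phases, is assigned to the accumulation of reserves.
    """
    if dpa < 0 or dpa > 45:
        raise ValueError("DPA must lie between 0 and 45")
    if dpa <= 10:
        return "cell proliferation"
    if dpa <= 30:
        return "accumulation of reserves"
    return "maturation and desiccation"


# Sampling points used for wheat in this work
SAMPLING_DPA = [3, 6, 10, 14, 17, 27]
phases = {dpa: wheat_seed_phase(dpa) for dpa in SAMPLING_DPA}

for dpa, phase in phases.items():
    print(f"{dpa:>2} DPA: {phase}")
```

Under this assignment, the 3, 6, and 10 DPA samples fall in cell proliferation and the 14, 17, and 27 DPA samples in the accumulation of reserves; none of the wheat sampling points reaches the maturation and desiccation phase.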

### Soluble Protein Extraction From Seeds

Whole protein extraction from seeds was performed in triplicate (using independent biological replicates) as reported elsewhere (Piattoni et al., 2017). Frozen seeds of wheat or castor oil were ground to a fine powder in liquid nitrogen using a mortar and pestle. This was followed by the addition of 1 µl of cold, freshly prepared extraction buffer per 1 mg of frozen powdered tissue. The composition of the extraction buffer was 50 mM MOPS pH 8.0, 1 mM EDTA, 1 mM EGTA, 25 mM NaF, 0.1% (v/v) Triton X-100, 20% (v/v) glycerol, 10 mM MgCl2, 2 mM DTT, 4% (w/v) PEG-8000, 2 mM PMSF, 5 mM malic acid, and 1% (w/v) polyvinylpyrrolidone (Turner et al., 2005). This buffer was supplemented with 2 mM aminocaproic acid, 1 mM benzamidine, 10 mM NaF, 1 mM Na2MoO4, 1 mM Na2VO4, and SETIII (1X) protease inhibitor cocktail EDTA-Free (Calbiochem). The mixture was incubated for 20 min on ice with constant homogenization. Extracts were centrifuged for 30 min at 4°C and 15,000 × g. The supernatant was recovered and used immediately for the different assays.

### Starch and Lipid Quantification

Quantification of starch from seeds was as indicated in (Piattoni et al., 2017), following a protocol that combines two previously described procedures (Reibach and Benedict, 1982; Baud and Graham, 2006). Plant tissue (100 mg) ground in a mortar under liquid nitrogen was soaked with 500 ml of ethanol 95% (v/v) at 4°C and then centrifuged for 10 min at 15,000 × g and 4°C. The supernatant was discarded and the extraction repeated thrice to eliminate soluble sugars. The pellets were dried at 60°C, weighed, and then dissolved in 10 ml of distilled H2O per mg of extracted material. The resulting tubes (hermetically closed) were boiled for 1 h to solubilize the starch and then centrifuged for 10 min at 15,000 × g and 4°C. The soluble fraction (20 µl) was added to 200 µl of 100 mM sodium acetate pH 4.5 plus 70 U of amyloglucosidase (1,4-α-D-glucan glucohydrolase) and incubated for 16 h at 55–60°C for starch digestion. After centrifuging for 10 min at 15,000 × g and 4°C, the resulting soluble sugars were quantified by an enzymatic colorimetric assay where the H2O2 produced by glucose oxidase is measured by a peroxidase coupled to a colorimetric compound. The reaction mixture (100 µl) consisted of 70 µl of the commercial reagent (10 kU/l glucose oxidase, 1 kU/l peroxidase, 0.5 mM 4-aminophenazone, 100 mM phosphate buffer pH 7.0, and 12 mM 4-hydroxybenzoate) and 30 µl of appropriately diluted sample. The reaction was developed for 10 min at 37°C and the product quantified at 492 nm. To correlate the quantity of starch and the concentration of soluble sugars, we constructed a calibration curve [glucose (mg/ml) versus starch (mg)] with a standard starch solution treated identically to the sample.
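The calibration step described above, glucose (mg/ml) versus starch (mg), amounts to fitting a straight line to the standards and inverting it for the samples. The sketch below illustrates that idea with an ordinary least-squares fit. The standard values, function names, and the assumption of a linear response are illustrative only and are not taken from the original protocol.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical standards (NOT data from the paper): starch mass processed
# identically to the samples, and the glucose concentration measured after digestion.
standard_starch_mg = [0.0, 0.5, 1.0, 2.0]
standard_glucose_mg_per_ml = [0.0, 0.25, 0.5, 1.0]

slope, intercept = fit_line(standard_starch_mg, standard_glucose_mg_per_ml)


def starch_from_glucose(glucose_mg_per_ml, dilution_factor=1.0):
    """Invert the calibration line to estimate starch (mg) in a sample.

    `dilution_factor` corrects for any dilution applied before the
    colorimetric reading (1.0 means the sample was measured undiluted).
    """
    return (glucose_mg_per_ml * dilution_factor - intercept) / slope


print(f"slope = {slope:.3f} (mg/ml glucose per mg starch), intercept = {intercept:.3f}")
print(f"sample with 0.75 mg/ml glucose ~ {starch_from_glucose(0.75):.2f} mg starch")
```

Because the standards pass through the same extraction and digestion as the samples, incomplete solubilization or hydrolysis is absorbed into the slope rather than requiring a separate correction.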

To determine the contents of TAGs in seeds, we utilized protocols already described (Folch et al., 1957; Piattoni et al., 2017). Plant tissue (200 mg) was ground to a fine powder in liquid nitrogen and lipids were extracted with 0.2 ml of MilliQ H2O and 3.8 ml of chloroform/methanol: 2/1 (v/v) solution. The extraction was performed in hermetically closed tubes incubated for 2 h at room temperature with gentle mixing every 30 min. Samples were filtered on oil-free filter paper previously washed with the extraction solution, placed in previously weighed tubes, and vigorously mixed with 0.7 ml of 0.02% CaCl2 solution in chloroform/methanol/H2O: 3/48/47 (v/v). After centrifugation at 3,000 × g for 5 min, the upper phase was discarded and 0.7 ml of chloroform/methanol/H2O: 3/48/47 (v/v) were added, followed by vigorous mixing. Samples were centrifuged as before until phase separation, the upper phase was discarded by suction, and the chloroform was then evaporated at 45–50°C to obtain the lipids. After weighing, the lipids were dissolved in isopropanol and used to quantify TAGs by an enzymatic colorimetric assay. This method was based on the treatment of TAGs with lipase, then converting the produced glycerol into glycerol-3P by a specific kinase coupled to the reaction of glycerol-3P oxidase that produces H2O2. The latter was quantified by employing a peroxidase generating a compound that absorbs at 492 nm. The reaction mixture contained 50 mM PIPES pH 7.5, 5 mM 4-chlorophenol, 15 kU/l lipase, 1 kU/l glycerol-3P kinase, 2.5 U/ml glycerol-3P oxidase, 0.44 kU/l peroxidase, 0.7 mM 4-aminophenazone, 0.18 mM ATP, and the lipid sample.

All values of starch and lipids contents are the mean of at least three independent determinations, and reproducible within ±10%.

### Enzyme Activity Assay

ADP-Glc PPase activity was determined at 37°C in the ADP-Glc synthesis direction by following Pi formation (after hydrolysis of PPi by inorganic pyrophosphatase) using the highly sensitive colorimetric method previously described (Fusari et al., 2006). The reaction mixture contained 100 mM MOPS pH 8.0, 7 mM MgCl2, 1.5 mM ATP, 1.0 mM Glc1P, 0.2 mg/ml BSA, 0.5 U/ml yeast inorganic pyrophosphatase and a proper sample enzyme dilution. Assays were initiated by the addition of Glc1P at a final concentration of 1.5 mM in a total volume of 50 µl. The reaction mixture was incubated for 10 min at 37°C and terminated by adding the Malachite Green reagent. The complex formed with the released Pi was measured at 630 nm. One unit of activity (U) is the amount of enzyme catalyzing the formation of 1 µmol of ADP-Glc per minute under the described conditions. All determinations are the mean of at least three independent sets of data that were reproducible within ±10%.
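Since the assay measures Pi while the unit is defined in terms of ADP-Glc, the conversion relies on the coupled stoichiometry: each ADP-Glc formed releases one PPi, which the pyrophosphatase hydrolyzes to two Pi. A minimal sketch of that arithmetic follows. The function names and the numerical inputs are hypothetical, and the sketch assumes that the Pi value has already been corrected for a blank and read from a Pi standard curve.

```python
def adpglc_ppase_units(pi_nmol: float, minutes: float = 10.0) -> float:
    """Convert blank-corrected Pi (nmol) released during the assay into
    enzyme units (U = µmol ADP-Glc formed per minute).

    Stoichiometry: Glc1P + ATP -> ADP-Glc + PPi, then PPi + H2O -> 2 Pi,
    so one ADP-Glc corresponds to two Pi.
    """
    adpglc_umol = (pi_nmol / 2.0) / 1000.0  # nmol Pi -> µmol ADP-Glc
    return adpglc_umol / minutes


def specific_activity(units: float, protein_mg: float) -> float:
    """Specific activity in U per mg of protein in the assay."""
    return units / protein_mg


# Hypothetical reading: 40 nmol Pi after the 10 min incubation,
# with 4 µg (0.004 mg) of protein in the 50 µl reaction.
units = adpglc_ppase_units(pi_nmol=40.0, minutes=10.0)
print(f"activity = {units:.4f} U")
print(f"specific activity = {specific_activity(units, 0.004):.2f} U/mg")
```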

### Protein Methods

Total protein was quantified by the method of Bradford (Bradford, 1976) using BSA as a standard. Protein electrophoresis in polyacrylamide gels under denaturing conditions (SDS-PAGE) was performed as previously described by Laemmli (Laemmli, 1970). Western blotting was performed by transferring the proteins resolved by SDS-PAGE to nitrocellulose membranes using a Mini-PROTEAN II (Bio-Rad) apparatus. The membrane was blocked for 2 h at room temperature and subsequently incubated with primary antibody for 16 h at 15°C with agitation. The primary antibodies were raised in rabbits against the ADP-Glc PPase purified from spinach leaves (Gómez-Casati and Iglesias, 2002) or against each subunit of the wheat endosperm enzyme (TaeS and TaeL) produced in our laboratory. After intensive washing, membranes were incubated with HRP-conjugated anti-rabbit secondary antibody (Sigma) for 1 h. Bands were visualized using the ECL method and detection reagents (Thermo Scientific).

For the phosphorylated protein mobility delay assay, Phos-tag™ (Wako Chemicals) was added (100 µM) to the 10% (w/v) SDS-PAGE acrylamide gel. The acrylamide-pendant Phos-tag ligand has a functional affinity for phosphate groups, which produces electrophoretic mobility shifts of phosphorylated proteins (Kinoshita et al., 2015). The phosphorylation reaction was carried out as described below, except for the use of non-radioactive ATP and the extension of the reaction time to 2 h, with the addition of the recombinant kinase (and the corresponding amount of buffer according to the final volume change) every 30 min. Then, samples were denatured with SDS-PAGE buffer [1% SDS (w/v), 100 mM β-mercaptoethanol, in 50 mM Tris-HCl pH 6.8] at 100°C for 5 min. After the electrophoretic run (30 mA per gel), gels were washed twice in methanol-free Tris-Gly transfer buffer with 1 mM EDTA, for 10 min each time with gentle agitation, followed by another 10 min in methanol-free transfer buffer without EDTA. This was followed by electrotransfer of the proteins and immunodetection with specific antibodies.

### Phosphorylated Proteins Purification and De-Phosphorylation

Purification of phosphorylated proteins was performed by Fe3+-immobilized metal affinity chromatography (IMAC-Fe3+), as previously described (Muszyńska et al., 1986). Total proteins (1.2 mg) in 50 mM MES-NaOH pH 6.0 were loaded onto 100 ml of iminodiacetic acid-Fe3+ previously equilibrated with the same buffer. After 1 h incubation at room temperature with constant homogenization, non-adsorbed proteins were washed out twice with 2 ml of 50 mM MES-NaOH, pH 6.0. An increase of pH in three steps was then employed to elute the adsorbed proteins. First, five volumes of 50 mM PIPES-HCl pH 7.2 were applied, then the adsorbed proteins were washed out with three volumes of 50 mM Tris-HCl pH 8.0, and finally phosphorylated proteins (tightly bound to the matrix) were eluted with two volumes of 50 mM Tris-HCl pH 9.0. SDS-PAGE served to resolve the samples eluted at the different pH conditions and for the analysis of the phosphorylated proteins.

The protein purification was performed from three independent biological replicates to enable an assessment of significance.

Protein de-phosphorylation experiments followed a protocol described elsewhere (Bustos and Iglesias, 2002). Proteins (600 µg) extracted from seeds were diluted in 50 mM Tris–HCl (pH 8.5), 1 mM EDTA, 10 mM MgCl2, 1.2 mM CaCl2, 20 mM 2-mercaptoethanol, 1 mM PMSF, and 20 U of alkaline phosphatase (Promega). Samples were incubated at 37°C for 5 h, and the reaction was stopped by the addition of 4X SDS-PAGE sample buffer. Then, samples were analyzed by electrophoresis followed by western blotting and immunodetection.

### ADP-Glc PPase and Protein Kinases Cloning, Expression, and Purification

The genes coding for the small (TaeS) and large (TaeL) subunits of wheat endosperm ADP-Glc PPase were synthesized de novo from DNA sequence reported previously (Ainsworth et al., 1993 and Ainsworth et al., 1995) and subcloned into pET28c as reported elsewhere (Ferrero et al., 2018). This recombinant enzyme displayed similar properties to the ADP-Glc PPase purified from the wheat endosperm (Gómez-Casati and Iglesias, 2002). For individual expression, each subunit was subcloned into the pET19TEV vector using NdeI and XhoI. For coexpression, TaeS was subcloned from the pET28c plasmid into the pCDFDuet-1 vector using NdeI and XhoI sites. By combining the pCDFDuet-1/ TaeS and pET19TEV/TaeL constructs, we obtained the heterotetrameric wheat endosperm ADP-Glc PPase (TaeSL) with a His-tag only in the L subunit. All the sequences were confirmed by the University of Chicago DNA Sequencing Facility (Chicago, IL, United States).

The AthSnRK1a1 gene amplified from A. thaliana cDNA served to generate the S198D mutation [required for its activity, as reported by (Shen et al., 2009)], by employing the QuikChange Site-Directed Mutagenesis kit (Agilent) (Kunkel et al., 1991). The mutated gene was subcloned into the pETDuet-1 vector to produce a protein with an N-terminal His-tag. The MdoSOS2 gene was synthesized de novo with a sequence and codon usage optimized for its expression in E. coli. The T168D mutation [required for activity (Guo, 2001)] was introduced by quick-change mutagenesis and the gene was then subcloned into a pET28b vector to produce a protein with an N-terminal His-tag. Molecular cloning of the gene coding for StuCDPK1 was as already reported (Santin et al., 2017; Rojas et al., 2018).

Recombinant TaeS, TaeL and TaeSL were obtained from E. coli BL21(DE3) RIL cells transformed with [pET19TEV/TaeS], [pET19TEV/TaeL] and co-transformed with [pCDFDuet-1/TaeS + pET19/TaeL], respectively. Recombinant AthSnRK1a1 was obtained from E. coli BL21 Shuffle cells transformed with [pETDuet-1/AthSnRK1a1]; MdoSOS2 was obtained from E. coli BL21 Codon Plus cells transformed with [pET28b/MdoSOS2] and StuCDPK1 was obtained from E. coli BL21 Shuffle cells transformed with [pET22(+)/StuCDPK1]. Transformed cells were grown in 1 L of LB medium supplemented with the appropriate antibiotic (100 µg/ml ampicillin for pET19TEV, pET22(+), and pETDuet-1; 50 µg/ml kanamycin for pET28b, and 100 µg/ml for pCDFDuet-1) at 37°C and 200 rpm until OD600 reached ~1.2. Cells were induced with 0.5 mM isopropyl-β-D-1-thiogalactopyranoside (IPTG) at 25°C and 180 rpm for 16 h and then harvested by centrifuging for 10 min at 4°C and 5,000 × g. The cells were resuspended in Buffer H [50 mM Tris-HCl pH 8.0, 300 mM NaCl, 5% (v/v) glycerol] and disrupted by sonication on ice (4 s pulse on with intervals of 3 s pulse off for a total time of 5 min). After centrifuging twice at 30,000 × g for 10 min, the supernatant (crude extract) was loaded onto a 1 ml HisTrap column (GE Healthcare) previously equilibrated with Buffer H. The recombinant protein was eluted with a linear gradient from 10 to 300 mM imidazole in Buffer H, and fractions containing the highest activity (or the purest protein) were pooled with 10% (v/v) glycerol and stored at -80°C until use. The pools of TaeSL, TaeL and TaeS were dialyzed against Buffer X [50 mM MOPS pH 8.0, 0.1 mM EDTA, 20% (w/v) sucrose, 5 mM MgCl2] and concentrated using an Amicon Ultra-4 30 K unit (Millipore, Billerica, MA, USA). Under these conditions the enzymes were fast frozen and stored at -80°C until use, remaining fully active for at least 1 year (in the case of TaeSL).

### In Vitro Phosphorylation Assays

For in vitro phosphorylation of recombinant forms of the wheat endosperm ADP-Glc PPase [the heterotetramer (TaeSL), the small subunit (TaeS), or the large subunit (TaeL); see (Ferrero et al., 2018)], the respective purified enzyme (2 µg) was incubated under conditions established for the activity of different protein kinases from plants (Piattoni et al., 2011; Rojas et al., 2018). The three (I–III) specific phosphorylation conditions used to assay crude extracts from seeds or purified recombinant protein kinases were as follows. I. For SnRK or Ca2+-independent kinases: 100 mM HEPES pH 7.3, 5 mM DTT, 10 mM MgCl2, 0.05 mM ATP, 0.5 mM EGTA. II. For CDPK or Ca2+-dependent kinases: 20 mM Tris-HCl pH 7.5, 10 mM MgCl2, 5 mM DTT, 1 mM CaCl2, 0.05 mM ATP. III. For SOS2: 20 mM Tris-HCl pH 7.2, 5 mM MgCl2, 5 mM DTT, 0.5 mM CaCl2, 0.01 mM ATP, 2.5 mM MgCl2. Notably, the critical difference between these protein kinases is their requirement (or not) for calcium, whereas all of them need reducing conditions (5 mM DTT) for optimal activity (Raíces et al., 2001; Crozet et al., 2010; Gong et al., 2019). Each reaction (final volume 20 µl) was incubated at 30°C for 30 min after the addition of 1 µCi of [32P]-γ-ATP (Migliore-Lacaustra) and was initiated by adding wheat endosperm extract as the kinase source, or an appropriate aliquot of the recombinant protein kinase [0.4 mU (determined for AMARA peptide)]. After the reaction, the protein mixtures were denatured with SDS-PAGE buffer [1% SDS (w/v), 100 mM β-mercaptoethanol, in 50 mM Tris-HCl pH 6.8]. The proteins were then resolved by electrophoresis [SDS-PAGE with 10% (w/v) acrylamide, run at 30 mA/gel]. Then, gels were stained with Coomassie Brilliant Blue R-250 and dried, and the incorporation of radioactivity was detected by exposure to a Storage Phosphor-Screen (GE Healthcare) and scanning with the Typhoon™ system (GE Healthcare). All phosphorylation assays were performed with three independent technical replicates to enable an assessment of significance.
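Setting up a 20 µl reaction at the final concentrations listed above is a direct application of C1·V1 = C2·V2. The sketch below works this out for condition II (Ca2+-dependent kinases). The final concentrations are those given in the text, whereas the stock concentrations are hypothetical values chosen only for illustration and are not part of the original protocol.

```python
FINAL_VOLUME_UL = 20.0

# Final concentrations for condition II, as listed in the text (mM)
CONDITION_II_FINAL_MM = {
    "Tris-HCl pH 7.5": 20.0,
    "MgCl2": 10.0,
    "DTT": 5.0,
    "CaCl2": 1.0,
    "ATP": 0.05,
}

# Hypothetical stock solutions (mM); these are assumptions, not source data
HYPOTHETICAL_STOCK_MM = {
    "Tris-HCl pH 7.5": 200.0,
    "MgCl2": 100.0,
    "DTT": 100.0,
    "CaCl2": 20.0,
    "ATP": 1.0,
}


def stock_volumes_ul(final_mm, stock_mm, final_volume_ul):
    """Volume of each stock (µl) needed to reach its final concentration:
    V_stock = C_final * V_final / C_stock."""
    return {
        name: final_mm[name] * final_volume_ul / stock_mm[name]
        for name in final_mm
    }


volumes = stock_volumes_ul(CONDITION_II_FINAL_MM, HYPOTHETICAL_STOCK_MM, FINAL_VOLUME_UL)
# Volume left for substrate protein, kinase source, radiolabel, and water
remaining_ul = FINAL_VOLUME_UL - sum(volumes.values())

for name, vol in volumes.items():
    print(f"{name:<16} {vol:.2f} µl")
print(f"{'remaining':<16} {remaining_ul:.2f} µl")
```

With these assumed stocks the buffer components occupy 7 µl, leaving 13 µl for the substrate enzyme, the kinase source, the labeled ATP, and water.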

# RESULTS AND DISCUSSION

### ADP-Glc PPase Is Phosphorylated Along With Development of the Seed in Wheat but Not in Castor Bean

Seeds of grasses are relevant components within the world crops providing food and energy, with wheat being one of the principal edible grains. This makes clear the relevance of starch as a major staple. Another primary product of agriculture is the TAGs accumulated in seeds of oleaginous plants. To better understand starch biosynthesis (and its regulation) in seeds, we examined the contents of the polysaccharide and TAGs, as well as ADP-Glc PPase activity and phosphorylation profiles, along the development of Triticum aestivum (wheat) seeds (Table 1 and Figure 1). For comparison, we performed similar studies with seeds of the oleaginous Ricinus communis (castor bean) (Supplemental Table 1 and Supplemental Figure 1). As shown in Table 1, wheat seeds stored significantly more starch than TAGs, with levels of the polysaccharide progressively increasing along the stages of development, reaching up to 30% of the seed weight during accumulation of reserves (between 11–30 DPA). These profiles are in agreement with values reported in previous studies on developing wheat endosperm (Briarty et al., 1979; Dai, 2010; Piattoni et al., 2017). In contrast, castor bean seeds exhibited a progressive increase in the amount of starch during cell proliferation (0–20 DPP), but an abrupt decrease during accumulation of reserves (20–40 DPP) and maturation plus desiccation (30–50 DPP). As reported for the oleaginous plant (Canvin, 1963; Greenwood and Bewley, 1982), TAGs remained at a low level during cell proliferation and then increased during accumulation, reaching up to 30% of the seed weight at 50 DPP (Supplemental Table 1).

TABLE 1 | Determinations of weight, starch, and TAG contents, ADP-Glc PPase activity and total soluble proteins in wheat seeds at different development stages. DPA, days post-anthesis.

The determinations are the mean of at least three independent sets of data that were reproducible within ±10%.

It is worth correlating levels of starch and ADP-Glc PPase activity in Table 1 with profiles of the enzyme immunodetected in extracts (either whole protein or phosphoprotein enriched after IMAC-Fe3+) from wheat seeds at the different development stages detailed in Figure 1. As expected from the critical role played by ADP-Glc PPase in starch biosynthesis, the specific activity determined for the enzyme followed a moderate increase between the cell proliferation, transition, and maturation steps of development of wheat grains, accompanying levels of the polysaccharide accumulation (Table 1). Thus, the enzyme activity increased ~2-fold between 3 and 10 DPA and a further ~2.5-fold between 10 and 27 DPA. The profile of the enzyme immunodetection showed a continued increase in protein levels between 3 and 10 DPA, then remaining almost similar between 10 and 17 DPA with a slight decrease at 27 DPA (Figure 1A). It is also evident that from 14 DPA onward (with small increases at 17 and 27 DPA), ADP-Glc PPase was found as a phosphoprotein in wheat seed extracts (Figure 1B). Concerning castor bean seeds, values of the specific activity (Supplemental Table 1) and protein levels (Supplemental Figure 1A) for ADP-Glc PPase accompanied the amounts of starch accumulated, being higher at the first stage and continuously decreasing along with development. A remarkable difference compared to wheat grains is that in castor bean, the ADP-Glc PPase exhibited no phosphorylation at any DPP stage of seeds (Supplemental Figure 1B).
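Consecutive fold changes multiply, so the two approximate increases quoted above for wheat imply an overall rise in specific activity of about 5-fold between 3 and 27 DPA. The short sketch below makes that arithmetic explicit. It uses only the rounded fold values stated in the text, not the underlying Table 1 measurements, so the result is an approximation; the function name is hypothetical.

```python
from math import prod


def cumulative_fold_change(stepwise_folds):
    """Overall fold change across consecutive intervals (product of the steps)."""
    return prod(stepwise_folds)


# Approximate fold increases in ADP-Glc PPase activity quoted in the text
fold_3_to_10_dpa = 2.0
fold_10_to_27_dpa = 2.5

overall = cumulative_fold_change([fold_3_to_10_dpa, fold_10_to_27_dpa])
print(f"approximate overall increase, 3 to 27 DPA: {overall:.1f}-fold")
```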

It is worth noting that our results reveal the proteolytic degradation of control actin at 17–27 DPA in wheat (Figure 1C) and 35–50 DPP in castor bean (Supplemental Figure 1C). This is in agreement with reports indicating that at the advanced stages of development the seed may contain high levels of proteolytic enzymes, which, together with the accumulation of storage products, could be detrimental to the stability of proteins (Ahn and Chen, 2007; Nadaud et al., 2010; Nogueira et al., 2012; Piattoni et al., 2017). The important issues concerning ADP-Glc PPase are that: (i) in the case of castor bean, the absence of phosphorylation is apparent even when no proteolytic activity is observed (5–35 DPP) (Supplemental Figure 1); and (ii) the phosphorylated enzyme was still observed in wheat even when there were signs of degradation (17–27 DPA) (Figure 1). These results support that the phosphorylation of ADP-Glc PPase would be of relevance for starch synthesis in tissues where the polysaccharide accumulation is the long-term metabolic goal for carbon partitioning. In contrast, the metabolic picture is markedly different in oleaginous seeds, a difference linked to the storage product in these plants.

It was relevant to corroborate that ADP-Glc PPase recovery after IMAC-Fe3+ chromatography of the total protein extract from wheat grains was a consequence of enzyme phosphorylation, rather than an unspecific interaction of its non-phosphorylated form. For this, we performed a de-phosphorylation treatment of the whole protein extract obtained from wheat seeds at 17 DPA with alkaline phosphatase (Figure 2). Figure 2A shows that the ADP-Glc PPase from the untreated protein extract loaded for IMAC-Fe3+ interacted with the matrix, as evidenced by its immunodetection after elution. Conversely, for extracts incubated with alkaline phosphatase, no immunoreactive protein bands were observed after elution from the affinity chromatography (Figure 2B). These results confirmed that the phosphoprotein enrichment is specifically related to the presence of a post-translationally modified ADP-Glc PPase in the wheat seeds.

### Ca2+-Dependent Protein Kinases Are Involved in Phosphorylation of ADP-Glc PPase in Wheat Seeds

To advance the characterization, at the molecular level, of the phosphorylation of ADP-Glc PPase from wheat endosperm by protein kinases, we performed in vitro studies using highly purified recombinant enzymes. The experimental approach considered that plant protein kinases can be classified based on structure-to-function relationships. Specifically, the classification establishes families relating protein sequence and specificity for substrates, cofactors (e.g., Ca2+ dependence), and other reaction conditions (Hardie, 1999). Within this categorization, it is common that the properties are similar among different plant species. In this context, the plant-specific superfamily emerges by grouping protein kinases of the type SNF1-related and Ca2+-dependent (SnRK-CDPK) (Vlad et al., 2008).

Based on the above, we first incubated the recombinant heterotetrameric wheat endosperm ADP-Glc PPase (TaeADP-Glc PPase) with crude extracts from wheat seeds at the 17 DPA stage of development in media containing [32P]ATP and optimal conditions for the activity of different plant protein kinases. These conditions allowed us to assay either Ca2+-independent (SnRK1) or Ca2+-dependent (CDPK and SOS2) protein kinases (Piattoni et al., 2011; Piattoni et al., 2017; Rojas et al., 2018). As shown in Figure 3, TaeSL was poorly phosphorylated under conditions promoting SnRK1 activity (left panels), even though this kinase was able to modify the wheat NAD+-dependent glyceraldehyde-3P dehydrogenase (EC 1.2.1.12, TaeGa3PDHase) used as a positive control of phosphorylation (right panels) (Piattoni et al., 2011). Conversely, marked phosphorylation of TaeADP-Glc PPase was evident in media favoring Ca2+-dependent protein kinases (Figure 3).

The above-described higher capacity of Ca2+-dependent protein kinases was confirmed by performing in vitro experiments using highly purified, recombinant forms of SnRK1a1 (Ca2+-independent), SOS2, and CDPK1 (Ca2+-dependent) kinases from different plants to phosphorylate TaeADP-Glc PPase. Results illustrated by Figure 4 indicate that the incorporation of radioactive phospho-moieties from [32P]-ATP into the ADP-Glc PPase was significantly higher when the enzyme was incubated with SOS2 and CDPK1 in comparison with that performed with the SnRK1a1 plant protein kinase. These data support a scenario where the phosphorylation of ADP-Glc PPase by Ca2+-dependent protein kinases would contribute to direct metabolism toward the active synthesis of starch in wheat seeds. This fact would be relevant in a plant tissue that, at complete development, mainly drives assimilated carbon to produce the polysaccharide for a long-term reserve component.

FIGURE 3 | Phosphorylation of recombinant TaeADP-Glc PPase by crude extracts (CE) from wheat seeds at 17 DPA. The recombinant enzyme was incubated in the presence (+) or absence (-) of total wheat seed crude extract under two different phosphorylation conditions: for (I) Ca2+-independent and (II) Ca2+-dependent protein kinases. After the phosphorylation reaction with [32P]ATP, the presence of radioactive label was detected by exposure of the SDS-PAGE gel to a Storage Phosphor-Screen. As controls, we used seed crude extract without further addition and the recombinant TaeGa3PDHase, an enzyme already reported as a target of phosphorylation in wheat (Piattoni et al., 2017). The protein phosphorylation was performed with three independent technical replicates.

### The L Subunit of ADP-Glc PPase Is Majorly Phosphorylated in Wheat Seeds

Since the functional ADP-Glc PPase found in the wheat endosperm is a heterotetrameric enzyme, we sought to determine if its phosphorylation involves the indistinct modification of both subunits. For such a purpose, we performed the incubation of TaeADP-Glc PPase with SnRK1a1, SOS2, or CDPK1 under conditions of phosphorylation using non-radioactive ATP. Afterward, the samples were analyzed by SDS-PAGE (either alone or containing Phos-tag, to explore the lower mobility of phosphorylated proteins) followed by electro-transference and detection using antibodies specific for each ADP-Glc PPase S or L subunits. Figure 5 shows that after treatment assuring modification by the Ca2+-dependent protein kinases, the delay in migration was only evident for the L polypeptide, suggesting that this is the subunit that is the target of phosphorylation.

Further experiments supported the specificity of the protein kinases for the L subunit of TaeADP-Glc PPase. Indeed, the incubation of the purified recombinant TaeS or TaeL forms with crude extracts from wheat seeds and [32P]ATP rendered SDS-PAGE profiles revealing phosphorylation of both subunits, but to different degrees. As illustrated by Supplemental Figure 3, significantly higher incorporation of radioactivity took place on the TaeL protein. Also, when samples from the incubation of TaeS or TaeL subunits with the recombinant plant protein kinases were analyzed by western blot from SDS-PAGE with Phos-tag, the lower mobility was only observed for the L polypeptide (Supplemental Figure 4). These results strongly agree with the key (tissue-specific) regulatory role assigned to the L subunit of heterotetrameric ADP-Glc PPases (Crevillén et al., 2003; Ballicora et al., 2004; Ventriglia et al., 2008; Ferrero et al., 2018; Figueroa et al., 2018), suggesting that its post-translational modification by phosphorylation would be relevant for starch synthesis in wheat seeds.

# CONCLUSIONS

In this work, we report that wheat seed ADP-Glc PPase undergoes progressive phosphorylation during grain development. Both the modification of the enzyme and its specific activity increased during the transition and the beginning of the maturation steps of development. These changes followed the pattern of active synthesis of starch, which is the principal carbon reserve in this grain. The plant protein kinases involved in the phosphorylation of TaeADP-Glc PPase are of the Ca2+-dependent type, and studies with recombinant enzymes support the specific action of SOS2 and CDPK1 kinases. Notably, the post-translational modification of ADP-Glc PPase was absent in seeds of castor bean, which accumulate lipids instead of carbohydrates as a reserve. Results reported herein suggest that phosphorylation of ADP-Glc PPase in seeds of grasses would be relevant for the synthesis of starch. Future studies will shed light on the actual effects of the phosphorylation on the stability or allosteric regulation (or both) of the wheat endosperm ADP-Glc PPase. A critical role of phosphorylation in the modulation of this enzyme is in good agreement with the limiting role it plays in the production of the main reserve polysaccharide in bacteria and plants (Ballicora et al., 2003; Ballicora et al., 2004). The characterization of the post-translational modification at a molecular level indicated that in the heterotetrameric TaeADP-Glc PPase, the L subunit is at least the major target of phosphorylation. In other plant species, the L subunit also has a primary function in tissue-specific regulation of the catalytic activity exerted by the S subunit (Crevillén et al., 2003; Ballicora et al., 2004; Ballicora et al., 2005; Ventriglia et al., 2008; Ferrero et al., 2018; Figueroa et al., 2018).

This work provides evidence on the post-translational phosphorylation of ADP-Glc PPase in wheat seeds, thus complementing previous predictive phosphoproteomic studies (Lohrig et al., 2009; Kötting et al., 2010; Ma et al., 2014). This modification of the enzyme limiting the pathway of starch biosynthesis would be part of a proposed general mechanism in which protein kinases critically phosphorylate different enzymes involved in the metabolism of the polysaccharide in plants (Kötting et al., 2010; Geigenberger, 2011; Goren et al., 2018). In this framework, phosphorylation would chiefly orchestrate the functioning of the starch metabolism in combination with redox modification and the formation of complexes between different proteins (Lohrig et al., 2009; Kötting et al., 2010; Geigenberger, 2011; Ma et al., 2014; Tetlow et al., 2015; Goren et al., 2018). These combined processes would operate as an effective mechanism to optimize the accumulation of the polysaccharide in seeds of wheat and other grass crops. Concerning post-translational modification, the wheat endosperm ADP-Glc PPase was distinctively characterized as insensitive to the redox regulation demonstrated for the enzyme from other plant sources (Ballicora et al., 2004; Linebarger et al., 2005; Tuncel et al., 2014; Ferrero et al., 2018). Thus, results reported herein provide new information about the mechanism involved in starch synthesis in grasses.

The information reported here opens many research avenues toward a better understanding (at the molecular level) of starch synthesis in wheat (and other cereals producing the polysaccharide as a prime component). A critical issue to investigate involves protein chemistry studies (including mass spectrometry) to identify the specific residues in the enzyme that are targeted by protein kinases. This characterization could be followed by the production of phosphomimetic forms of the TaeADP-Glc PPase, which would allow analyzing how phosphorylation modifies its kinetic, regulatory, and stability properties. In this scenario, the possibility that phosphorylated TaeADP-Glc PPase would interact better with other enzymes and proteins is an issue needing attention in future research. Another topic to be explored is the identity of the specific Ca2+-dependent protein kinase from wheat involved in the post-translational modification. This subject is complex, considering that T. aestivum is a hexaploid organism, a product of broad hybrid mix in the breeding process (Appels et al., 2018). At present, genomic information on Triticum aestivum L. has identified fifteen and twenty genes coding for SnRK1 (Perochon et al., 2019) and CDPK (Li et al., 2008), respectively.

Future studies will be central for the design of strategies and biotechnological tools to improve yields in agriculture. Starch, as the primary feedstock of many crops, is currently the leading world supplier for caloric demands of animals (including humans). Besides, starch is a natural product of high relevance for the development of biofuels and bioplastics, which are critical products to cope with the world-wide challenges tied to demographic expansion and climate change.

### DATA AVAILABILITY STATEMENT

All datasets generated for this study are included in the article/Supplementary Material.

# AUTHOR CONTRIBUTIONS

DF, CP, MA, BR, MH, MB, and AI conceived and designed the experiments. AI wrote the paper. DF, BR, and MH performed the experiments. DF, CP, MA, MB, and AI analyzed the data. AI, MA and MB contributed reagents, materials, and analysis tools.

# FUNDING

This work was supported by the National Science Foundation (grant MCB 1616851 to MAB), and by grants from ANPCyT (PICT 2017 1515 and PICT 2018 00929 to AAI; PICT 2015 0634 to MDAD), UNL (CAID 2016, PIC 50420150100053LI, to AAI) and CONICET (PUE 2016 0040 to IAL). MDAD and AAI are members of the Research Career from CONICET. DF and BR are doctoral fellows from CONICET.

# ACKNOWLEDGMENTS

We thank Mrs. Jaina Bhayani for detailed reading and correction of the English language.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2020.01058/full#supplementary-material

SUPPLEMENTAL TABLE 1 | Determinations of weight, starch, and TAG contents, ADP-Glc PPase activity and total soluble proteins in castor bean at different development stages. DPP, days post-pollination. The determinations are the mean of at least three independent sets of data that were reproducible within ±10%.

SUPPLEMENTAL FIGURE 1 | Immunodetections in castor bean samples throughout development: (A) total proteins and (B) phosphoproteins purified by IMAC-Fe3+ evaluated with anti-ADP-Glc PPase obtained from the purified enzyme of leaves of Spinacia oleracea (Gómez-Casati and Iglesias, 2002) and (C) control of actin in total proteins. Crude extracts preparation and phosphoprotein purification were performed from three independent biological replicates. Protein profiles of the samples are shown in Supplemental Figure 3.

SUPPLEMENTAL FIGURE 2 | SDS-PAGE of wheat and castor bean seed samples throughout development: (A) total proteins (same amount in all cases) and (B) phosphoproteins purified by IMAC-Fe3+ (same volume of eluted protein of IMAC-Fe3+ in all cases).

SUPPLEMENTAL FIGURE 3 | Phosphorylation of TaeS and TaeL recombinant versions by crude extracts (CE) from wheat seeds at 17 DPA. The recombinant enzyme was incubated in the presence (+) or absence (-) of total wheat seed crude extract under two different phosphorylation conditions: for (I) Ca2+-independent and (II) Ca2+-dependent protein kinases. After the phosphorylation reaction with [32P]ATP, the presence of radioactive label was detected by exposure of the SDS-PAGE gel to a Storage Phosphor-Screen. The protein phosphorylation was performed from three independent technical replicates.

SUPPLEMENTAL FIGURE 4 | Immunodetection of phosphorylated TaeL and TaeS subunits. The subunits were phosphorylated with the respective recombinant protein kinase and then resolved by SDS-PAGE with or without Phos-tag, with subsequent electrotransfer and immunodetection using specific antibodies anti-TaeL or anti-TaeS. Lanes without recombinant subunit are shown as control. The white arrows indicate the phosphorylated delayed peptides. The protein phosphorylation was performed from three independent technical replicates.

## REFERENCES


Ballicora, M. A., Iglesias, A. A., and Preiss, J. (2003). ADP-glucose pyrophosphorylase, a regulatory enzyme for bacterial glycogen synthesis. Microbiol. Mol. Biol. Rev. 67, 213–225. doi: 10.1128/MMBR.67.2.213-225.2003

Ballicora, M. A., Iglesias, A. A., and Preiss, J. (2004). ADP-glucose pyrophosphorylase: A regulatory enzyme for plant starch synthesis. Photosynth. Res. 79, 1–24. doi: 10.1023/B:PRES.0000011916.67519.58

Ballicora, M. A., Dubay, J. R., Devillers, C. H., and Preiss, J. (2005). Resurrecting the ancestral enzymatic role of a modulatory subunit. J. Biol. Chem. 280, 10189–10195. doi: 10.1074/jbc.M413540200

Baris, I., Tuncel, A., Ozber, N., Keskin, O., and Kavakli, I. H. (2009). Investigation of the interaction between the large and small subunits of potato ADP-glucose pyrophosphorylase. PLoS Comput. Biol. 5, 1–14. doi: 10.1371/journal.pcbi.1000546

Baud, S., and Graham, I. A. (2006). A spatiotemporal analysis of enzymatic activities associated with carbon metabolism in wild-type and mutant embryos of Arabidopsis using in situ histochemistry. Plant J. 46, 155–169. doi: 10.1111/j.1365-313X.2006.02682.x

Bradford, M. M. (1976). A Rapid and Sensitive Method for the Quantitation of Microgram Quantities of Protein Utilizing the Principle of Protein-Dye Binding. Anal. Biochem. 254, 248–254. doi: 10.1016/0003-2697(76)90527-3

Briarty, L. G., Hughes, C. E., and Evers, A. D. (1979). The Developing Endosperm of Wheat - A Stereological Analysis. Ann. Bot. 44, 641–658. doi: 10.1093/oxfordjournals.aob.a085779

Bustos, D. M., and Iglesias, A. A. (2002). Non-phosphorylating glyceraldehyde-3-phosphate dehydrogenase is post-translationally phosphorylated in heterotrophic cells of wheat (Triticum aestivum). FEBS Lett. 530, 169–173. doi: 10.1016/S0014-5793(02)03455-5

Canvin, D. T. (1963). Formation of oil in the seed of Ricinus communis L. Can. J. Biochem. Physiol. 41, 1879–1885. doi: 10.1139/y63-214

Crevillén, P., Ballicora, M. A., Mérida, Á., Preiss, J., and Romero, J. M. (2003). The different large subunit isoforms of Arabidopsis thaliana ADP-glucose pyrophosphorylase confer distinct kinetic and regulatory properties to the heterotetrameric enzyme. J. Biol. Chem. 278, 28508–28515. doi: 10.1074/jbc.M304280200

Crofts, N., Nakamura, Y., and Fujita, N. (2017). Critical and speculative review of the roles of multi-protein complexes in starch biosynthesis in cereals. Plant Sci. 262, 1–8. doi: 10.1016/j.plantsci.2017.05.007

Crozet, P., Jammes, F., Valot, B., Ambard-Bretteville, F., Nessler, S., Hodges, M., et al. (2010). Cross-phosphorylation between Arabidopsis thaliana sucrose nonfermenting 1-related protein kinase 1 (AtSnRK1) and its activating kinase (AtSnAK) determines their catalytic activities. J. Biol. Chem. 285, 12071–12077. doi: 10.1074/jbc.M109.079194

Dai, Z. (2010). Activities of enzymes involved in starch synthesis in wheat grains differing in starch content. Russ. J. Plant Physiol. 57, 74–78. doi: 10.1134/S1021443710010103

Ferrero, D. M. L., Asencion Diez, M. D., Kuhn, M. L., Falaschetti, C. A., Piattoni, C. V., Iglesias, A. A., et al. (2018). On the roles of wheat endosperm ADP-glucose pyrophosphorylase subunits. Front. Plant Sci. 9, 1498. doi: 10.3389/fpls.2018.01498

Georgelis, N., Shaw, J. R., and Hannah, L. C. (2009). Phylogenetic analysis of ADP-glucose pyrophosphorylase subunits reveals a role of subunit interfaces in the allosteric properties of the enzyme. Plant Physiol. 151, 67–77. doi: 10.1104/pp.109.138933

Godfray, H. C. J., Beddington, J. R., Crute, I. R., Haddad, L., Lawrence, D., Muir, J. F., et al. (2010). Food security: The challenge of feeding 9 billion people. Science 327, 812–818. doi: 10.1126/science.1185383

Gong, D., Guo, Y., Jagendorf, A. T., and Zhu, J. (2019). Biochemical Characterization of the Arabidopsis Protein Kinase SOS2 That Functions in Salt Tolerance 1. Plant Physiol. 130, 256–264. doi: 10.1104/pp.004507.256

Goren, A., Ashlock, D., and Tetlow, I. J. (2018). Starch formation inside plastids of higher plants. Protoplasma 255, 1855–1876. doi: 10.1007/s00709-018-1259-4


Jeon, J. S., Ryoo, N., Hahn, T. R., Walia, H., and Nakamura, Y. (2010). Starch biosynthesis in cereal endosperm. Plant Physiol. Biochem. 48, 383–392. doi: 10.1016/j.plaphy.2010.03.006

Kötting, O., Kossmann, J., Zeeman, S. C., and Lloyd, J. R. (2010). Regulation of starch metabolism: The age of enlightenment? Curr. Opin. Plant Biol. 13, 321–329. doi: 10.1016/j.pbi.2010.01.003

Kavakli, I. H., Greene, T. W., Salamone, P. R., Choi, S. B., and Okita, T. W. (2001a). Investigation of subunit function in ADP-glucose pyrophosphorylase. Biochem. Biophys. Res. Commun. 281, 783–787. doi: 10.1006/bbrc.2001.4416

Kavakli, I. H., Park, J. S., Slattery, C. J., Salamone, P. R., Frohlick, J., and Okita, T. W. (2001b). Analysis of Allosteric Effector Binding Sites of Potato ADP-glucose Pyrophosphorylase through Reverse Genetics. J. Biol. Chem. 276, 40834–40840. doi: 10.1074/jbc.M106310200

Kinoshita, E., Kinoshita-Kikuta, E., and Koike, T. (2015). Advances in Phos-tag-based methodologies for separation and detection of the phosphoproteome. Biochim. Biophys. Acta - Proteins Proteomics 1854, 601–608. doi: 10.1016/j.bbapap.2014.10.004


Solanum tuberosum that is induced at the onset of tuber development. Plant Mol. Biol. 46, 591–601. doi: 10.1023/A:1010661304980


Conflict of Interest: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2020 Ferrero, Piattoni, Asencion Diez, Rojas, Hartman, Ballicora and Iglesias. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Global Role of Crop Genomics in the Face of Climate Change

*Mohammad Pourkheirandish† , Agnieszka A. Golicz† , Prem L. Bhalla and Mohan B. Singh\**

*Plant Molecular Biology and Biotechnology Laboratory, Faculty of Veterinary and Agricultural Sciences, University of Melbourne, Parkville, VIC, Australia*

The development of climate change resilient crops is necessary if we are to meet the challenge of feeding the world's growing population. We must be able to increase food production despite the projected decrease in arable land and unpredictable environmental conditions. This review summarizes the technological and conceptual advances that have the potential to transform plant breeding, help overcome the challenges of climate change, and initiate the next plant breeding revolution. Recent developments in genomics in combination with high-throughput and precision phenotyping facilitate the identification of genes controlling critical agronomic traits. The discovery of these genes can now be paired with genome editing techniques to rapidly develop climate change resilient crops, including plants with better biotic and abiotic stress tolerance and enhanced nutritional value. Utilizing the genetic potential of crop wild relatives (CWRs) enables the domestication of new species and the generation of synthetic polyploids. The high-quality crop plant genome assemblies and annotations provide new, exciting research targets, including long non-coding RNAs (lncRNAs) and cis-regulatory regions. Metagenomic studies give insights into plant-microbiome interactions and guide selection of optimal soils for plant cultivation. Together, all these advances will allow breeders to produce improved, resilient crops in relatively short timeframes, meeting the demands of the growing population and changing climate.

### *Edited by:*

*Nicola Colonna, Italian National Agency for New Technologies, Energy and Sustainable Economic Development (ENEA), Italy*

### *Reviewed by:*

*Harsh Raman, New South Wales Department of Primary Industries, Australia Ki-Hong Jung, Kyung Hee University, South Korea*

### *\*Correspondence:*

*Mohan B. Singh mohan@unimelb.edu.au † These authors share first authorship*

### *Specialty section:*

*This article was submitted to Plant Biotechnology, a section of the journal Frontiers in Plant Science*

*Received: 24 December 2019 Accepted: 05 June 2020 Published: 16 July 2020*

### *Citation:*

*Pourkheirandish M, Golicz AA, Bhalla PL and Singh MB (2020) Global Role of Crop Genomics in the Face of Climate Change. Front. Plant Sci. 11:922. doi: 10.3389/fpls.2020.00922*

Keywords: domestication, genomics, climate change, crops, transcriptomics, abiotic stress

### INTRODUCTION

The world will require a dramatic increase in food production in the next 30 years. Global food security is one of the key challenges of this century, with the current human population of 7.7 billion expected to reach 8.6 billion in 2030 and 10 billion by 2050 (Tomlinson, 2013). The increase in population has led to an increase in urbanization, which is directly and indirectly reducing our access to suitable land for agriculture (Satterthwaite et al., 2010). Simultaneously, the effects of climate change, including but not limited to increased temperature, changing patterns of rainfall, and increased levels of CO2 and ozone, impose further pressure on agriculture *via* drought and salinity that limit agricultural land and water use (Godfray et al., 2010).

Population growth is not the only reason we will need to increase food production. Significant income growth in rapidly developing economies gave rise to an emerging middle class, accelerating the dietary transition toward higher consumption of meat, eggs, and dairy products and boosting the need to grow more grain to feed more cattle, pigs, and poultry (Tilman and Clark, 2014). Agriculture in 2050 will need to produce almost 60–100% more food and feed than it is doing now (Tilman et al., 2011). This goal must be achieved despite the increase in global temperatures associated with climate change and growing scarcity of water and land, which are predicted to have significant impacts on the yield of all major crops.

In the last few centuries, plant breeders successfully used crossing and selection to improve the agronomic character of cultivated crops, such as wheat, maize, rice, barley, and others, resulting in dramatic increases in food production. However, agriculture has shifted to monoculture, resulting in the significant reduction of genetic diversity with today's global agricultural food depending on a few key plant species (Khoury et al., 2014).

The genetic gains achieved by conventional crop breeding and advanced agronomic practices have more than doubled crop yields between 1960 and 2015. The development of dwarf varieties of rice and wheat coupled with greater use of synthetic fertilizers and irrigation led to the first green revolution. However, the yield increases due to the green revolution are declining and/or beginning to plateau for the major food crops (Grassini et al., 2013). After years of improvement, we are approaching the limits of these few crops in terms of yield and tolerance to biotic and abiotic stresses. The current trend of annual yield increases for major crops of between 0.9 and 1.6% is insufficient to meet requirements in the near future (Ray et al., 2013). It has been estimated that about 2.4% annual yield gain is required to meet the global food demand (Ray et al., 2013). Thus, development of high-yielding climate change resilient crops with enhanced tolerance to water deficit, temperature, and biotic stresses is critical for increasing productivity to keep pace with the increasing human population.
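The gap between the observed and required growth rates can be made concrete with simple compound-growth arithmetic. The Python sketch below is illustrative only: it assumes that yields compound annually at a constant rate, which may differ from how the cited studies define their rates, and the function names and the 30-year horizon are assumptions introduced here rather than values taken from those studies.

```python
import math


def yield_multiplier(annual_rate: float, years: int) -> float:
    """Relative yield after `years` of constant compound growth (assumed model)."""
    return (1.0 + annual_rate) ** years


def years_to_double(annual_rate: float) -> float:
    """Years needed for yield to double under the same compound-growth assumption."""
    return math.log(2.0) / math.log(1.0 + annual_rate)


if __name__ == "__main__":
    horizon = 30  # illustrative horizon in years (assumption)
    # Rates quoted in the text: 0.9% and 1.6% (current trend), 2.4% (required).
    for rate in (0.009, 0.016, 0.024):
        print(
            f"rate={rate:.1%}  "
            f"multiplier after {horizon} y={yield_multiplier(rate, horizon):.2f}  "
            f"doubling time={years_to_double(rate):.1f} y"
        )
```

Under this simplified model, only the highest of the three rates roughly doubles yield within the 30-year horizon, which is consistent with the argument that the current trend is insufficient.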

The challenge of feeding the increasing human population under climate change conditions is unlikely to be met by conventional breeding technologies alone. Plant breeding must adopt new, multidisciplinary approaches to enhance the rate of genetic gain (Varshney et al., 2018). Fortunately, the science underpinning plant breeding is being revolutionized by the recent conceptual and technological innovations including the development of rapid, cheap sequencing technologies and the rise of genomics allowing for the detailed analysis of plant genomes and dissection of the genetic basis of agronomic traits. Genomics is now at the core of crop improvement, including the identification of genetic variation underlying differences in phenotypes, identification of additional sources of variation and novel traits, and characterization of molecular pathways involved in biotic and abiotic stress tolerance.

Recently the development of genome editing technologies, especially CRISPR/Cas9, opened new routes of fast and precise genome modification promising rapid translation of knowledge from the lab to the field. Genome editing allows introduction of insertions/deletions or an entirely new sequence at a desired location in the target genome (Scheben et al., 2017). Known genes controlling important traits can be selectively modified using genome editing, allowing for manipulation of phenotypes. In recent years, several genome edited crop plants entered final stages of commercialization in the United States of America including drought and salt tolerant soybean, *Camelina* with increased oil content, and waxy corn (Waltz, 2018).

Considering the urgent need for crop plant improvement and the new, exciting technological and conceptual developments, this review outlines the potential of genomic approaches (**Table 1**) for the development of climate change resilient crops.

# USING GENOMICS TO IMPROVE CROP PLANT DIVERSITY AND RESILIENCE

### Accessing Genetic Diversity of CWRs

Wild plants have survived under a changing climate for millions of years, during which they have been subjected to selective pressure by biotic and abiotic factors. This natural selection has led to the accumulation of genes allowing plants to resist, tolerate, or avoid extreme temperatures, drought, or flooding, as well as pests and diseases. However, during subsequent domestication, many of those, now important, traits and associated genetic material were lost, transforming some of the plants into our current remarkably productive crops with limited genetic diversity (**Figure 1**). The remainder of the genetic resources was left behind and mostly treated as weeds. Insights gained from the genome sequencing projects of different crops (Zhou et al., 2015b; Mascher et al., 2017; Appels et al., 2018; Springer et al., 2018; Zhao et al., 2018b) demonstrated the narrow germplasm of our modern crops and emphasized their vulnerabilities to climate change. However, all modern crop plants were domesticated from crop wild relatives (CWRs), which are still found in the wild and provide a rich pool of genetic material that is often excluded from existing breeding programs (Brozynska et al., 2016).

TABLE 1 | Summary of different approaches, which can be used to improve crop diversity and resilience.

Elite cultivated crops, such as wheat, maize, rice, and barley, are often dependent on farmer-supplied resources, including water *via* irrigation, nutrition *via* fertilizers, and resistance to biotic stresses through the use of pesticides. This has led to the elite varieties becoming less resilient compared to their wild counterparts. In addition, strong artificial selection for a handful of crucial traits resulted in reduced diversity and restriction of the gene pool available within breeding programs. CWRs constitute an additional source of genetic diversity, which can be utilized during crop improvement programs, with as much as 30% of the increases in crop yields during the late 20th century being attributed to the use of CWRs in plant breeding programs (Pimentel et al., 1997; Brozynska et al., 2016).

Over 1,500 CWRs of food crops have been identified as a potential source of genetic diversity for 173 globally important crops (Vincent et al., 2013). Advances in sequencing technologies facilitated construction of CWR reference genomes, which in turn can be used in comparative genomics analyses, allowing for the identification of novel genes controlling key traits (Brozynska et al., 2016).

Traditionally, new genetic material was transferred from CWRs to crop plants by introgression of new genes into an elite cultivar background (**Figure 2**; Dempewolf et al., 2017). Genomic resources have been widely used to speed up the process *via* marker assisted selection, including transfer of disease resistance genes in grape vine, apple, and banana (Migicovsky and Myles, 2017). Despite its obvious success, especially in transfer of major genes, the method is time consuming and restricted to sexually compatible species. For example, *Hordeum vulgare* (cultivated barley) has been extensively crossed with cross-compatible *Hordeum spontaneum* (wild progenitor), and there has been limited success crossing cultivated barley with *Hordeum bulbosum*, where chromosome segments from *H. bulbosum* can be transferred to the chromosomes of cultivated barley (Westerbergh et al., 2018). There are, however, 32 species in the genus *Hordeum*, including diploid, tetraploid, and hexaploid varieties (Bothmer et al., 1995), and the vast majority of *Hordeum* species cannot be used due to crossing barriers. Nevertheless, once the candidate genes have been identified, transgenics and genome editing technologies can be used to transfer the desirable genetic material between species regardless of natural crossing barriers. To aid improvement, rich CWR genomic resources for many key crop species have been developed, including soybean, rice, and maize.

*Glycine soja* – a wild relative of cultivated soybean (*Glycine max*) – has been shown to have a much more diverse gene pool compared to *G. max*, due to artificial selection during domestication and further loss as a result of modern breeding practices (Hyten et al., 2006; Kofsky et al., 2018). Wild and cultivated soybean differ in a number of agriculturally important traits, including pod shattering (Dong et al., 2014), determinate growth habit (Tian et al., 2010), and seed size (Zhou et al., 2015a; Kofsky et al., 2018). Additionally, it was shown that half of the annotated resistance-related sequences in *G. soja* were absent in both the landraces and cultivars (Zhou et al., 2015b). Despite the phenotypic differences, *G. soja* and *G. max* are cross-compatible, facilitating the transfer of desirable traits.

In maize (*Zea mays*), lowland teosinte (*Z. mays* ssp. *parviglumis*), highland teosinte (*Z. mays* ssp. *mexicana*), and the genus *Tripsacum* comprising nine species of warm-season, perennial grasses have been characterized as donors of important traits, which could be used for improvement (Mammadov et al., 2018). Genome-wide studies demonstrated that over 10% of the maize genome shows evidence of introgression from the *mexicana* genome, suggesting its contribution to adaptation and improvement (Hufford et al., 2013; Yang et al., 2017).

Rice (*Oryza sativa* L.) belongs to the genus *Oryza*, encompassing over 20 species, two of which are cultivated (*O. sativa* L. and *Oryza glaberrima* S.). The species are subdivided into several groups, and not all are cross-compatible. In recent analyses, *Oryza rufipogon*, a wild species believed to be the immediate progenitor of *O. sativa*, showed higher sequence diversity and harbored sequences and genes completely missing from the population of cultivated rice (Huang et al., 2012; Xu et al., 2012; Zhao et al., 2018b), highlighting the potential of its use for modern rice improvement.

In *Brassica*, a comparison of a CWR *Brassica macrocarpa*, with nine cultivated lines of *Brassica oleracea* showed that the former harbored unique disease resistance genes most likely lost during the domestication and improvement of elite *B. oleracea* germplasm (Golicz et al., 2016b).

The increasing abundance of genomic resources for CWRs will significantly aid future breeding efforts, helping identify the optimal crosses and genome editing targets.

### *De novo* Crop Domestication

Another strategy for utilization of the wild plant resources is new crop (*de novo*) domestication. The domestication syndrome refers to a unique collection of phenotypic traits associated with the genetic change of an organism from a wild progenitor to a domesticated one. Most of the changes linked to the domestication syndrome, such as grain dispersal in wheat, barley, and rice; apical dominance in maize; fruit size in tomato; and grain quality in wheat, result from modification of a single or few genes (Frary et al., 2000; Clark et al., 2004; Konishi et al., 2006; Uauy et al., 2006; Dubcovsky and Dvorak, 2007; Pourkheirandish et al., 2015, 2018). Also, most of them are due to a loss of function mutation in the causal gene (Komatsuda et al., 2007; Ramsay et al., 2011; Ishimaru et al., 2013; Pourkheirandish et al., 2015). For example, wheat, barley, rice, maize, and sorghum were selected for inflorescence that retained the grains, which made it easy to harvest. This characteristic results from the loss of function mutations in the genes controlling shattering (Konishi et al., 2006; Lin et al., 2012; Pourkheirandish et al., 2015). Similarly, a domestication associated NAC gene controlling pod shattering resistance has been identified in soybean (Dong et al., 2014). Advances in genomics provided the necessary platform to facilitate gene discovery and identification such as detection of genes associated with non-brittle rachis in pasta wheat and seed filling in maize using whole-genome sequencing (Sosso et al., 2015; Avni et al., 2017); smooth awn in barley using genotyping by sequencing (Milner et al., 2019); seed quality in soybean; and cutin responsible for water retention in barley using RNA sequencing (Li et al., 2012, 2017a; Gao et al., 2018).

The syntenic and orthologous gene relationships among plant genomes are well demonstrated (Devos, 2005; Tang et al., 2008). Synteny allows identification of homologous genes and has been used to identify genes with similar functions in related species (Pourkheirandish et al., 2007; Chen et al., 2009; Sakuma et al., 2010; Ning et al., 2013). For example, grain retention in both wheat and barley results from a mutation in homologous genes *brittle rachis* 1 (Pourkheirandish et al., 2018). The *brittle rachis* 1 homologues appear to have a similar role in grain dispersal in wild progenitors of wheat and barley. A loss of function mutation in this gene results in spike stiffness in the domesticated lines. As the same gene controls brittleness in both wheat and barley, *brittle rachis* 1 most likely evolved before the divergence of *Triticum* (wheat genus) and *Hordeum* (barley genus) over 5 Mya (Middleton et al., 2014). This suggests that the other non-domesticated species within *Hordeum* and *Triticum* that are not cross fertile with cultivated wheat and barley probably carry the *brittle rachis* 1, which controls their mode of grain dispersal. Recently a study involving crop plant species from multiple families used genome-wide association study (GWAS) to identify a domestication-related gene controlling seed dormancy in soybean and then showed that orthologs of this gene in rice and tomato also display evidence of selection during domestication. Analysis of transgenic plants confirmed the conservation of function in soybean, rice, and *Arabidopsis*, highlighting the power of comparative genomics in new domestication target gene identification (Wang et al., 2018a).

Pre-existing knowledge of target genes makes further crop domestication fast and feasible. Domestication of a new crop species allows access to a novel gene pool with the potential for generating new crops that are productive, resilient, and nutritious. Recent successes in wild tomato domestication by editing loci important for yield and productivity provide a proof of concept (Li et al., 2018; Zsögön et al., 2018). Similarly, targeting the *brittle rachis* 1 gene in any wild species of *Hordeum* or *Triticum* using gene editing would disrupt its function and represent a significant step toward domestication of a new species. It is important to note that the ease of genome editing, and therefore its use for crop *de novo* domestication and other applications, is related to plant ploidy. Gene knockout efficiency is lower in polyploids than in diploids, as multiple alleles must be edited simultaneously to achieve a similar effect (Zhang et al., 2019b).
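The effect of ploidy on knockout efficiency can be illustrated with a back-of-the-envelope calculation. The sketch below assumes, purely for illustration, that every copy of the target gene is edited independently and with the same probability; real editing events are not strictly independent, and the per-copy efficiency of 0.8 is a hypothetical value rather than a figure from the cited studies.

```python
def full_knockout_probability(per_copy_efficiency: float, copies: int) -> float:
    """Probability that every copy of a gene is edited in one plant.

    Assumes independent editing events with identical per-copy efficiency,
    which is a simplification made only for illustration.
    """
    if not 0.0 <= per_copy_efficiency <= 1.0:
        raise ValueError("per_copy_efficiency must be between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return per_copy_efficiency ** copies


# Hypothetical per-copy efficiency. Copy numbers assume a single-copy gene:
# 2 copies in a diploid, 4 in a tetraploid, 6 in a hexaploid.
EFFICIENCY = 0.8
for label, copies in [("diploid", 2), ("tetraploid", 4), ("hexaploid", 6)]:
    p = full_knockout_probability(EFFICIENCY, copies)
    print(f"{label:<11} copies={copies}  P(all copies edited)={p:.3f}")
```

Under these assumptions the chance of a complete knockout falls geometrically with copy number, which is one way to read the lower efficiency reported for polyploids.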

### Engineering Polyploidy

Polyploid plants possess three or more sets of homologous chromosomes stemming either from the duplication of a single genome (autopolyploidy) or hybridization followed by doubling of two diverged genomes (allopolyploidy; Comai, 2005). Many of the agriculturally important crop plants and staple food species are natural polyploids, including: bread wheat (allo-hexaploid; 6× = 42), pasta wheat (allo-tetraploid; 4× = 28), strawberry (allo-octaploid; 8× = 56), potato (auto-tetraploid; 4× = 48), and banana (auto-triploid; 3× = 33). Recent modeling work linked the occurrence of polyploidy to domestication (Salman-Minkov et al., 2016). Higher genome copy number masks deleterious mutations, increases the adaptive potential, and provides the opportunity for genes to gain new function. Thus, polyploidy is considered a major driver of evolution (Sattler et al., 2016). Induced polyploidy has also been used by breeders to develop new crops and flowers, such as triploid watermelon (seedless), hexaploid Triticale (a hybrid of wheat and rye), triploid tulips, roses, and many more ornamental flowers (Sattler et al., 2016). Polyploid plants tend to display hybrid vigor and improved abiotic stress tolerance (Chen, 2010; Tamayo-Ordóñez et al., 2016), with different manifestations of traits observed depending on the level of ploidy. For example, a study in *Arabidopsis*, which performed a rigorous comparison of plants with different somatic ploidy levels (2×, 4×, 6×, and 8×) observed significant differences in phenotypes (Corneillie et al., 2019).
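The notation used above (for example, 6× = 42) gives the ploidy level followed by the total somatic chromosome count. As a small worked example, the sketch below divides one by the other to recover the base chromosome number of each crop listed in the text; the function name is illustrative only.

```python
# (crop, ploidy level, total somatic chromosome count), as listed in the text
CROPS = [
    ("bread wheat", 6, 42),
    ("pasta wheat", 4, 28),
    ("strawberry", 8, 56),
    ("potato", 4, 48),
    ("banana", 3, 33),
]


def base_chromosome_number(ploidy: int, total_chromosomes: int) -> int:
    """Return x, the number of chromosomes in one basic set."""
    if ploidy < 1 or total_chromosomes % ploidy != 0:
        raise ValueError("total chromosome count must be a multiple of ploidy")
    return total_chromosomes // ploidy


for crop, ploidy, total in CROPS:
    x = base_chromosome_number(ploidy, total)
    print(f"{crop:<12} {ploidy}x = {total}  ->  x = {x}")
```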

The engineering of polyploid plants has been proposed as one of the routes for the generation of improved crop varieties (Katche et al., 2019). However, a better understanding of the causes and effects of polyploidy is a necessary prerequisite. Two major routes of polyploid plant formation are *via* unreduced gametes or somatic doubling (Ramsey and Schemske, 1998; Tamayo-Ordóñez et al., 2016). In laboratory conditions, polyploidy can be induced by application of antimicrotubule drugs such as colchicine. The viability of polyploid plants depends on stabilization of mitotic and meiotic divisions (Comai, 2005). Understanding of the molecular mechanisms behind cell cycle control, homologous chromosome pairing, and meiotic crossover formation is therefore paramount. Molecular mechanisms controlling cell cycle progression are deeply conserved and rely on cyclins (CYCs) and cyclin dependent kinases (CDKs). Previous studies in *Arabidopsis thaliana* identified seven classes of CDKs, named CDKA through CDKF, but CDKA and CDKB were identified as major drivers of cell cycle in plants (Menges et al., 2005; Tank and Thaker, 2011; Tamayo-Ordóñez et al., 2016). An extensive literature search compiled a list of over 100 meiosis-related genes in *Arabidopsis* (Gaebelein et al., 2019). Comparative genomics approaches can be used to find orthologs of those genes in other species and perform further characterization. For example, a recent study of synthetic allohexaploid *Brassica* hybrids (2*n* = 6× = AABBCC) identified genomic regions associated with fertility, which harbored orthologs of *A. thaliana* genes involved in meiosis (Gaebelein et al., 2019).

In addition, plant genomes are known to undergo extensive structural rearrangements and methylation changes upon polyploidization. A study of resynthesized *Brassica napus* lines demonstrated extensive restructuring of the merged genomes in the early generations following hybridization (Szadkowski et al., 2010). Many hybrids and recent allopolyploids display genome dominance, resulting in sub-genome biases in gene content and expression (Bird et al., 2018). Genomics can be used to track post-hybridization structural re-arrangements and the establishment of sub-genome dominance to better understand plant genome evolution post-hybridization (Edger et al., 2017). It can also help predict the optimal combination of different wild species to construct new synthetic crops that can diversify our agriculture and bring resilience to climate change.

As an example, bread wheat (*Triticum aestivum*), a major crop accounting for 20% of world daily food consumption, is an allohexaploid plant that originated *via* multiple hybridizations. The most accepted hypothesis of its origin is shown in **Figure 3** (Haider, 2013). Because bread wheat carries genes from three different genomes (A, B, and D), it is robust and has been able to adapt to different climatic zones. Today, bread wheat (AABBDD), which originated in the Fertile Crescent (30–35°N), can grow from Sweden (65°N) to Argentina or New Zealand (45°S), a cultivation zone much broader than that of pasta wheat (AABB; Feuillet et al., 2008). Another example of a widely known polyploid plant is the octoploid strawberry (Edger et al., 2019). The modern strawberry arose from a series of hybridization events between diploid, tetraploid, and hexaploid species spanning Eurasia and North America (**Figure 4**).

2× = 14) and *Aegilops speltoides* (BB; 2× = 14), resulting in tetraploid pasta wheat (AABB; 4× = 28). At the next step, hybridization of the tetraploid wheat with *Aegilops tauschii* (DD; 2× = 14) resulted in the emergence of the hexaploid bread wheat (AABBDD; 6× = 42).

Bertioli (2019).

### Harnessing Plant-Microbe Interactions to Boost Agricultural Output

Microbes that live within plant roots (endosphere) and in the soil surrounding them (rhizosphere) have a significant impact on the host, including its health and fitness, productivity, and responses to climate change (Wei et al., 2019). Plants and microbes interact *via* signaling molecules originating from both organisms (Leach et al., 2017; Chagas et al., 2018). Some bacterial communities have been shown to alter the plant's capacity to use soil resources, promote plant biotic and abiotic stress tolerance, and stimulate growth and nutrient uptake (Mendes et al., 2011; Berendsen et al., 2012; Fellbaum et al., 2012; Fitzpatrick et al., 2018; Paredes et al., 2018). At the other end of the spectrum, pathogenic microbes also exist, which negatively affect plant health (**Figure 5**; Pusztahelyi et al., 2015; Chagas et al., 2018).

Soil microorganisms are in constant competition to accumulate around the root and access the plant secreted carbohydrates (Venturi and Keel, 2016). Plants and microbes evolved together, resulting in beneficial microbes being attracted to a specific root exudate profile and forming a community of microbes in the rhizosphere (Garbeva et al., 2008). Plant species can therefore shape the composition of their rhizosphere's microbiome. Crops grown in the soils with microbial profiles similar to their native environment are expected to have a better chance of forming beneficial plant-microbiome interactions (Pérez-Jaramillo et al., 2018), promoting tolerance to biotic and abiotic stress. The advances in genomic technologies resulted in the sequencing of numerous soil microorganisms and improvement of our understanding of the soil microbial communities (Jansson and Hofmockel, 2018). For example, the availability of genomic sequences for nitrogen-fixing and phosphate-solubilizing bacteria expanded significantly (Zekic et al., 2017; Basenko et al., 2018; Jeong et al., 2018; Ormeño-Orrillo et al., 2018). Combined analysis of microbiome genomic and metabolomic data provides an accurate tool necessary to understand plant-microbe interactions and predict the most favorable crop plant–soil microbiome combinations, allowing for mapping of suitable crops to specific locations.

FIGURE 5 | Plant-microbe interactions in the rhizosphere. Plants can influence the composition of microbiome surrounding plant roots through exudation of compounds that stimulate (green arrows) or inhibit (red blocked arrows) microbes. A wide range of pathogens living in the soil can also affect plant health. Being able to attract the beneficial microbes will limit the success of the pathogenic microbes due to resource competition or by enhancing the plant immune system. The commensal microbes do not affect the plant or the pathogen directly. Microbiome-plant interactions are presented as described by Berendsen et al. (2012).

### The Challenge of Climate Change and Plant Diseases

Crop plant pathogens are considered a major threat to modern agriculture (Oerke and Dehne, 2004; Savary et al., 2012; Nelson et al., 2018). The ongoing battle between plants and pathogens resulted in their co-evolution and shaped the genetic diversity of both (Jones and Dangl, 2006; Tiffin and Moeller, 2006; Dodds and Rathjen, 2010; Hartmann et al., 2017). Diseases generally result from a specific interaction between host and pathogen (Veresoglou and Rillig, 2014; Põlme et al., 2018). For example, the wheat leaf rust pathogen *Puccinia triticina*, which causes one of the most common diseases of wheat globally, does not affect rice, maize, or any other crop. World-wide over-cultivation of a few crops (wheat, maize, rice, soybean, and barley) with low genetic diversity has led to increased pathogen inoculum and accelerated pathogen evolution, promoting the global spread of pathogens (Savary et al., 2019).

Climate change affects the epidemiology of pathogens at specific locations and the geographic distribution of plant diseases (Barford, 2013; Chakraborty, 2013). Increasing the crop plant diversity by cultivation of orphan crops and the domestication of new crops will result in reduced selective pressure on pathogen populations; thus, the life of genetic resistance is expected to be longer (Cook, 2006; Hajjar et al., 2008; Storkey et al., 2019). The life extension of genetic resistance could be an effective and ecologically sustainable way to control diseases. Climate change affects not only the crop but also the pathogen survival and reproduction. One of the expected impacts of climate change on plant disease is the migration of pathogens to latitudes beyond their historical range, examples of which have already been documented (Barford, 2013; Chakraborty, 2013). An increase in temperature would result in pathogen movement and spread of disease further north in the northern hemisphere and further south in the southern hemisphere, to geographical locations in which the pathogens were previously unable to reproduce effectively or infect the plant. Recent genomic advances have resulted in the prediction and isolation of several resistance genes from crops and the identification of the corresponding genes from the pathogen (Fu et al., 2009; Mago et al., 2015; Moore et al., 2015; Sperschneider et al., 2015; Krattinger et al., 2016; Wan et al., 2019). These advances have provided a snapshot of resistance mechanisms that crops have developed during the long co-evolutionary history. The discovery of the genes underlying resistance has led to an improved understanding of their molecular function and established an entry point for studies of the defense pathways.

In addition, genome sequencing provides a rapid method for pathogen identification (Boykin et al., 2019), monitoring of outbreak progression, and tracking of pathogen spread to new locations. In fact, the development of third-generation sequencing technologies, especially Oxford Nanopore, resulted in the introduction of small, affordable, mobile sequencing instruments well suited for in-field diagnostic systems. Oxford Nanopore MinION technology has already been used for real-time diagnostics of human pathogens including Ebola (Quick et al., 2016) and Zika viruses (Faria et al., 2016), with protocols for identification of plant pathogens and pests under active development. For example, a recent proof-of-concept study has shown that a diagnostic test based on portable sequencing technology can deliver results within 48 h and, thus, greatly reduce the risk of community crop failure (Boykin et al., 2019).

### Genome Editing for Nutritionally Enhanced Crop Production

Augmentation of crop nutritional value plays a central role in ensuring global food security. Breeding of crops for enhanced nutrient content has been a long-standing goal of plant research (DellaPenna, 1999; Welch and Graham, 2002). Plants are a key source of macro‐ and micro-nutrients, but many of the staple foods, including cassava, wheat, rice, and maize, are poor sources of some macro-nutrients and many essential micro-nutrients (DellaPenna, 1999). However, the nutrient profile can be altered by manipulation of biochemical pathways involved in macro‐ and micro-nutrient biosynthesis. Advances in genome sequencing and annotation provided the necessary resource to identify the candidate genes involved in plant metabolism. As a result, genome editing technologies could be used to modify nutritional profiles of crops, for example producing soybeans with high oleic acid and low linoleic acid content (Haun et al., 2014; Demorest et al., 2016) and reducing antinutritional phytic acid content in maize (Liang et al., 2014). Nutritional enhancement of crops can also be achieved using transgenic technologies (Hefferon, 2015). In addition, genome editing-facilitated *de novo* domestication of new nutrient-rich crops could lead to a more diversified and healthier diet.

# ACCESSING NEW BREEDING TARGETS USING GENOMIC TECHNOLOGIES

### Third-Generation Sequencing for Improved Reference Genomes

The beginning of the twenty-first century saw rapid development of new sequencing methods. Second-generation sequencing technologies, including Illumina, allowed assembly of over 200 plant genomes (Chen et al., 2018) with much more ambitious plans of generating 10,000 draft genome assemblies by 2025 (Cheng et al., 2018). The main challenge posed by second-generation sequencing technologies was short read length, making them unable to bridge over long stretches of repetitive sequences, resulting in fragmented assemblies. However, the introduction of third-generation sequencing and long reads produced by PacBio and Oxford Nanopore now allows for chromosomal level assemblies of plant genomes (Belser et al., 2018). The long-read sequencing technologies are often combined with optical mapping and conformation capture, achieving draft genomes of unprecedented contiguity (Belser et al., 2018; Shi et al., 2019). Importantly, the sequencing strategy used and the resulting contiguity and completeness of the assembly have been shown to impact downstream evolutionary and functional analyses. For example, comparative analysis of two *Brassica rapa* assemblies, one built using Illumina sequencing data and the other using a combination of PacBio sequencing, optical mapping (BioNano), and conformation capture (Hi-C), revealed that the latter harbored ~3,000 assembly-specific genes as well as over 500 previously unidentified transposable element (TE) families (Zhang et al., 2018). The availability of high-quality, chromosome scale genome assemblies substantially improves the accuracy of the downstream genomic analysis, including gene and regulatory region annotation, GWAS, gene expression quantification, and homologue detection.
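Assembly contiguity is commonly summarized with the N50 statistic, which the text above does not name but which is a standard way to compare fragmented and contiguous assemblies. The sketch below is a minimal implementation; the two lists of contig lengths are invented toy values chosen so that both assemblies have the same total size.

```python
def n50(contig_lengths: list[int]) -> int:
    """Return the N50 of an assembly.

    N50 is the length L such that contigs of length >= L together
    contain at least half of the total assembled bases.
    """
    if not contig_lengths:
        raise ValueError("contig_lengths must not be empty")
    ordered = sorted(contig_lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise AssertionError("unreachable: running total always reaches half")


# Toy assemblies of identical total size (200 units each); values are invented.
short_read_like = [50, 40, 30, 20, 20, 20, 10, 10]
long_read_like = [120, 80]

print("fragmented assembly N50:", n50(short_read_like))
print("contiguous assembly N50:", n50(long_read_like))
```

A higher N50 for the same total length indicates that more of the genome sits in long, unbroken sequences, which is the property that long reads, optical mapping, and conformation capture are meant to improve.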

### Accurate Gene Prediction and Functional Annotation for Precise Candidate Gene Identification

The explosion of plant genome sequencing was accompanied by extensive annotation efforts aiming to generate a comprehensive catalog of gene models for a given species. A gene model is defined as a region of the genome that is believed to be transcribed into protein-coding messenger RNA (mRNA) or one of the classes of non-coding RNAs (ncRNA; Schnable, 2019). Gene models are often built using a combination of *ab initio* gene prediction and homology-based methods that take advantage of sequence similarity to known transcripts or proteins (Campbell et al., 2014; Klasberg et al., 2016). Early on, gene expression evidence was mostly derived from expressed sequence tags (ESTs) and extended by full-length sequencing *via* cloning followed by Sanger sequencing. Later, the information was supplemented by RNASeq data from diverse tissues, and it was shown that gene models and isoforms with highly tissue-specific expression were underrepresented in existing annotations (Cheng et al., 2017; Golicz et al., 2018b; Van Bel et al., 2019). Currently, addition of long reads generated by PacBio or Oxford Nanopore sequencing technologies allows for recovery of full-length transcripts, providing new insights into the extent of alternative splicing and transcriptome diversity (Cook et al., 2019). Annotation of loci harboring non-coding transcripts is also becoming routine, further improving our understanding of the complexity of plant transcriptomes (Van Bel et al., 2019).

Despite the availability of genome annotations, functional characterization of annotated genes, which allows for the direct connection between genome and phenome, poses a key challenge in molecular breeding pipelines (Scheben and Edwards, 2018). In the key experimental model plant species, *A. thaliana*, >90% of genes have been annotated with putative functions and ~50% of genes have annotation supported by experimental evidence (Van Bel et al., 2019). However, for most of the crop plants, gene functional annotations rely on homology-based inference and are performed by transfer of annotation from most similar genes in model plants like *Arabidopsis* and rice, with very little direct experimental support. Annotation transfer is further complicated by plant evolutionary history, where successive rounds of polyploidy and subsequent diploidization lead to gene redundancy, differential loss, and neo‐ and sub-functionalization (Jiao and Paterson, 2014; Salman-Minkov et al., 2016). However, rapid progress in application of CRISPR/Cas9 genome editing will soon allow construction of genome-wide mutant libraries for key crops, significantly contributing to the functional annotation efforts. In fact, such libraries are already available for rice (Lu et al., 2017; Meng et al., 2017). Integrative genomics approaches have also been used to facilitate discovery of top candidates. For example, specialized databases integrating genotypic, phenotypic, and association data have been developed for rice (SNP-Seek), soybean (SoyBase), and wheat (T3; Grant et al., 2010; Blake et al., 2016; Mansueto et al., 2017). Beyond specialized databases, tools like KnetMiner and MCRiceRepGP were developed to rank candidate genes involved in biological processes of interest using multicriteria decision analysis (Hassani-Pak and Rawlings, 2017; Golicz et al., 2018b).

### Non-coding Part of Genome as a Reservoir of New Breeding Targets

Only several percent of most large crop plant genomes encode protein-coding genes and the remainder is made up of non-coding sequences. For a long time, the non-coding stretches of DNA were considered to have little function; however, recent technological and conceptual developments revealed that plant genomes encode thousands of potentially functional ncRNAs and that distant regulatory elements, including enhancers, are prevalent (Weber et al., 2016). The ncRNAs encompass several classes of transcripts, including not only the relatively well characterized ribosomal RNAs (rRNAs), transfer RNAs (tRNAs), small nucleolar RNAs (snoRNAs), and microRNAs (miRNAs) but also the much more poorly understood long non-coding RNAs (lncRNAs). LncRNAs are transcripts over 200 base pairs in length without discernible protein-coding potential, identified from RNASeq data, and have been shown to be involved in a range of biological processes, including flowering time regulation, stress tolerance, and gamete formation (Golicz et al., 2018a). At least some of the lncRNAs are likely to be functional, as evidenced by mutant phenotypes of knock-outs of newly discovered lncRNAs (Huang et al., 2018). Interestingly, lncRNAs have a strong bias toward transcription in reproductive tissues (Zhang et al., 2014; Golicz et al., 2018b; Johnson et al., 2018), suggesting involvement in plant sexual reproduction, a critical process affecting flowering, fruit, and grain formation. Newly characterized lncRNAs that affect important traits can become genome editing targets. For example, a rice lncRNA LDMAR was shown to be involved in control of photoperiod-sensitive male sterility (PSMS), a key trait which contributed to the development of hybrid rice (Ding et al., 2012).
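The working definition of an lncRNA given above (longer than 200 bases, no discernible protein-coding potential) can be turned into a crude first-pass filter. The sketch below is only an illustration: it approximates "coding potential" by the longest ATG-to-stop open reading frame on the forward strand, and the 100-codon cutoff is an assumed threshold, not one taken from the cited studies. Published pipelines typically add dedicated coding-potential and protein-homology checks.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf_codons(seq: str) -> int:
    """Longest ATG-to-stop ORF, in codons, over the three forward frames.

    The start codon is counted and the stop codon is not. ORFs that run
    off the end of the sequence without a stop codon are ignored.
    """
    seq = seq.upper().replace("U", "T")
    best = 0
    for frame in range(3):
        start = None  # position of the currently open ATG, if any
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                best = max(best, (i - start) // 3)
                start = None
    return best


def is_lncrna_candidate(seq: str, min_length: int = 200,
                        max_orf_codons: int = 100) -> bool:
    """Apply the length rule from the text plus an assumed ORF-size cutoff."""
    return len(seq) > min_length and longest_orf_codons(seq) < max_orf_codons


# Invented toy transcripts.
coding_like = "ATG" + "GCT" * 120 + "TAA" + "A" * 50      # long ORF
noncoding_like = "A" * 150 + "ATGGCTTAA" + "C" * 150      # only a tiny ORF

print(is_lncrna_candidate(coding_like))
print(is_lncrna_candidate(noncoding_like))
```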

Another promising category of non-coding DNA sequences are cis-regulatory elements (CREs, promoters and enhancers/silencers), capable of recruiting transcription factors and promoting gene expression. Changes in CREs are considered one of the key evolutionary mechanisms underlying, for example, emergence of novel morphological forms (Stern and Orgogozo, 2008; Weber et al., 2016), and the divergence of cis-regulatory regions associated with domestication underscores their important roles in control of traits targeted by artificial selection (Lemmon et al., 2014; Wang et al., 2017). Several enhancers have been identified that modulate the expression of genes involved in the control of important traits, like anthocyanin content in maize and flowering time in *Arabidopsis* (Chua et al., 2003; Louwers et al., 2009; Adrian et al., 2010). SNPs corresponding to the different rapeseed ecotype groups were also identified in the promoter regions of FLOWERING LOCUS T and FLOWERING LOCUS C orthologs (two key genes controlling flowering time; Wu et al., 2019). An SNP affecting the spatial expression of a homeobox transcription factor was selected for during the selection of non-shattering rice (Konishi et al., 2006). An insertion of a TE in the CRE of the *teosinte branched* 1 gene was identified as the cause of apical dominance in maize, a trait also selected in the course of plant domestication (Studer et al., 2011). In the last few years, significant progress has been made in identification of plant CREs with studies of the model plant species *Arabidopsis* as well as rice, maize, and cotton (Zhang et al., 2012; Pajoro et al., 2014; Zhu et al., 2015; Rodgers-Melnick et al., 2016; Oka et al., 2017; Wang et al., 2017; Bajic et al., 2018; Tannenbaum et al., 2018; Zhao et al., 2018a; Yan et al., 2019). 
The rapid developments are due to adoption of DNase-Seq and ATAC-Seq techniques in plant research, which measure DNA "openness" as a proxy for the accessibility of DNA to transcription factors, RNA polymerase, and other protein complexes involved in gene expression (Pajoro et al., 2014; Wang et al., 2016). Improved understanding of the function of the non-coding elements of the genome will provide a new, yet untapped pool of breeding targets.

### Beyond Single Reference Genomics – The Pan-Genome Approach

Generation of the reference genomes and subsequent large-scale re-sequencing of hundreds to thousands of individuals per species revealed extensive genomic diversity, including large-scale presence/absence variation (Golicz et al., 2016a; Varshney et al., 2017, 2019; Fuentes et al., 2019; Wu et al., 2019). As our knowledge of genomic variation increased, it became apparent that a single reference sequence is insufficient to represent the extent of genomic variation found within species, resulting in the introduction and adoption of the pangenome concept (**Figure 6**; Golicz et al., 2020). A pangenome represents the entirety of the genomic sequence and gene content found within a species rather than a single individual. First introduced in bacteria (Tettelin et al., 2005), the concept is highly relevant to plant research, with more than 50% of genes in some species being variable (accessory), found in some individuals but not others (Golicz et al., 2020). Pangenomes have been constructed for key crop species, such as rice, soybean, bread wheat, and oilseed rape (Li et al., 2014; Golicz et al., 2016b; Contreras-Moreira et al., 2017; Gordon et al., 2017; Montenegro et al., 2017; Zhou et al., 2017; Hurgobin et al., 2018; Ou et al., 2018; Zhao et al., 2018b; Gao et al., 2019; Zhang et al., 2019a). Plant accessory genes have been shown to be over-represented in functions related to signaling and disease resistance as well as abiotic stress response (Golicz et al., 2016b; Montenegro et al., 2017; Hurgobin et al., 2018; Wang et al., 2018b), perhaps contributing to environmental adaptation and phenotypic plasticity and providing promising targets for crop improvement, especially since some of the accessory genes may be completely missing from the elite germplasm.
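The distinction between genes shared by all individuals and variable (accessory) genes can be sketched from a gene presence/absence table. The example below is a minimal illustration with invented accession and gene identifiers; "core" is used here as a conventional label for genes present in every accession, and real pangenome studies derive presence/absence calls from assemblies or read mapping rather than from hand-written sets.

```python
def classify_genes(presence: dict[str, set[str]]) -> tuple[set[str], set[str]]:
    """Split the pangenome into core and accessory genes.

    presence maps an accession name to the set of gene IDs found in it.
    Core genes occur in every accession; accessory genes are missing
    from at least one accession.
    """
    if not presence:
        raise ValueError("presence must contain at least one accession")
    gene_sets = list(presence.values())
    pangenome = set().union(*gene_sets)
    core = set.intersection(*gene_sets)
    return core, pangenome - core


# Hypothetical accessions and gene IDs, for illustration only.
presence = {
    "accession_1": {"g1", "g2", "g3"},
    "accession_2": {"g1", "g2", "g4"},
    "accession_3": {"g1", "g3", "g4", "g5"},
}

core, accessory = classify_genes(presence)
accessory_fraction = len(accessory) / (len(core) + len(accessory))
print("core genes:     ", sorted(core))
print("accessory genes:", sorted(accessory))
print(f"accessory fraction of pangenome: {accessory_fraction:.2f}")
```

Note that a gene such as `g5`, found in a single accession, would be invisible to any analysis that uses one of the other accessions as the sole reference.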

In addition, the pangenome offers a natural replacement for the current paradigm of using a single reference genome, as the choice of the reference affects downstream genomic analyses, including GWAS and gene expression quantification (Gage et al., 2019). Using a pangenome as a reference improves read mapping and variant calling accuracy (Eggertsson et al., 2017; Garrison et al., 2018; Kim et al., 2019; Tian et al., 2019). The adoption of the pangenome reference will also allow the inclusion of variants beyond SNPs in GWAS. Several studies in both plants and humans showed that the inclusion of structural variants in association studies could help identify causal variants (Chiang et al., 2017; Fuentes et al., 2019). For example, the use of sequence presence/absence variation allowed the identification of previously missed quantitative trait loci (QTLs) associated with disease resistance in oilseed rape (Gabur et al., 2018).

## PAIRING GENOMICS WITH OTHER EMERGING TECHNOLOGIES TO MAXIMIZE THEIR POTENTIAL

### Machine Learning and Crop Plant Genomics

Almost all aspects of genomic analyses can now be supported by the development and implementation of machine learning algorithms. Machine learning algorithms find new patterns and "learn" the necessary predictive features from the data, rather than rely on pre-existing criteria. This property makes them suitable for analysis of complex, multilayer datasets, where expert knowledge is incomplete or inaccurate, and when the amount of data is too large to be handled manually (Yip et al., 2013). Several promising applications of machine learning to plant genomics exist. As discussed above, the functional non-coding portions of plant genomes remain poorly understood. In animal research, machine learning and deep learning based methods have been particularly successful in genomic feature annotation, including regulatory regions like promoters, enhancers, and transcription factor binding sites (Xu and Jackson, 2019). The use of machine learning improved the quality of feature annotation, helped uncover the underlying sequence characteristics of the regulatory regions, and even allowed prediction of variant impact (Zhou and Troyanskaya, 2015; Kelley et al., 2016). However, the limited availability of large-scale epigenetic modification and chromatin accessibility datasets may delay similar studies in crop plants. While the lack of suitable datasets may be hampering regulatory region annotation, hundreds of sequenced and assembled genomes are readily available for comparative analyses. Identification of conserved and unique elements is one of the primary aims of comparative genomics. To date, sequence comparisons are mostly based on local or whole-genome alignments and limited by the sensitivity of alignment tools. 
However, machine learning algorithms are being developed that are capable of computing the probability of sequence conservation for any query of interest (Joly-Lopez et al., 2016; Li et al., 2017b), providing new, exciting avenues for plant comparative genomics. Finally, plant phenotyping has lagged significantly behind genotyping, requiring considerable resources and specialized equipment (Scheben and Edwards, 2018). One proposed application of machine learning is the prediction of phenotype from genotype, complementing the more traditional genomic prediction models (Ma et al., 2018). Taken together, machine learning methods have the potential to add significant value to the existing genomic resources and methodologies.

### Speed Breeding to Accelerate the Development of New Crops

Advances in molecular and genomic technologies resulted in isolation and characterization of many agronomically important genes, for example those controlling seed shattering, dormancy, seed number, and seed size (Doebley et al., 2006). Improved understanding of the molecular function of these genes makes new crop domestication and improvement of orphan crops feasible. However, the generation of new crops or improved crop varieties using traditional breeding techniques requires a lengthy process of recurrent selection, which can take many years (Gorjanc et al., 2018). One of the limiting factors in the process is the plant generation time, from seed germination to harvest. The plant generation cycle takes up to 4 months in wheat and barley and even longer in other species (Watson et al., 2018). Domestication of new crops would require numerous generations to stack the edited genes before crop release. Speed breeding is a procedure that shortens crop generation time by changing growth conditions, such as day length and temperature (Hickey et al., 2017). Growing long-day species under extended photoperiod (22 h light/2 h dark) and controlled temperature stimulates rapid flowering and maturation. The technology successfully shortened the plant generation time of some of the world's major agri-food crops, such as bread wheat, pasta wheat, barley, and canola (Watson et al., 2018). Production of up to six generations per year has been documented for wheat and barley using speed breeding, compared to two generations per year with traditional methods. Speed breeding protocols have also been successfully applied to orphan crops, such as chickpea, peanut, grass pea, lentil, and quinoa (O'Connor et al., 2013; Chiurugwi et al., 2019). The successful application of speed breeding to orphan crops indicates its flexibility and possible application for new crop domestication. 
A combination of speed breeding with our current knowledge about the target genes and genomic tools such as precision genome editing by CRISPR would make new crop domestication feasible in a short time. Speed breeding can also be paired with genomic selection (GS), allowing further reductions in plant breeding cycles. GS is a modern breeding technology that uses genome-wide markers to compute estimated breeding values (EBVs) and allows simultaneous selection for multiple traits. A recent study combined multivariate GS and speed breeding for yield prediction in spring wheat (Watson et al., 2019). Even though the current speed breeding protocols are limited to long-day species, new protocols are expected for short-day crops in the near future. Coupling speed breeding with genomics will make GS for breeding and *de novo* domestication feasible.
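The idea of estimating breeding values from genome-wide markers can be sketched with ridge regression on marker genotypes, a simple model in the spirit of ridge-regression BLUP. This is an illustrative assumption, not the multivariate method used in the study cited above. The marker matrix (coded -1, 0, 1), the phenotypes, and the shrinkage parameter `ridge` are all invented toy values, and the code uses only the Python standard library.

```python
def solve_linear_system(a: list[list[float]], b: list[float]) -> list[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_marker_effects(genotypes, phenotypes, ridge=1.0):
    """Estimate marker effects by solving (X'X + ridge * I) b = X'(y - mean)."""
    k = len(genotypes[0])
    mean_y = sum(phenotypes) / len(phenotypes)
    centered = [y - mean_y for y in phenotypes]
    xtx = [[sum(row[i] * row[j] for row in genotypes) + (ridge if i == j else 0.0)
            for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * y for row, y in zip(genotypes, centered)) for i in range(k)]
    return mean_y, solve_linear_system(xtx, xty)


def estimated_breeding_value(genotype, mean_y, effects):
    """Predicted value of one line: training mean plus summed marker effects."""
    return mean_y + sum(g * e for g, e in zip(genotype, effects))


# Toy training population: 6 lines, 3 markers, invented phenotypes.
train_genotypes = [[1, 1, 0], [1, -1, 1], [-1, 1, -1],
                   [-1, -1, 0], [0, 0, 1], [0, 0, -1]]
train_phenotypes = [11.0, 13.0, 7.0, 9.0, 10.0, 10.0]
mean_y, effects = fit_marker_effects(train_genotypes, train_phenotypes)

# Unphenotyped candidates are ranked on their predicted values alone.
candidates = {"line_A": [1, -1, 0], "line_B": [-1, 1, 0], "line_C": [0, 0, 1]}
ranking = sorted(candidates, reverse=True,
                 key=lambda name: estimated_breeding_value(candidates[name], mean_y, effects))
print("selection order:", ranking)
```

Because candidates are ranked from genotypes alone, selection does not have to wait for phenotyping in each cycle, which is why GS combines naturally with the shortened generation times of speed breeding.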

### High-Throughput Phenotyping

Plant phenotyping refers to the measurement of any morphological or physiological characteristics of plants. The phenotype can result from the action of individual genes, gene-by-gene, or gene-by-environment interactions. Many agronomically essential traits, such as yield and its components and drought/salt tolerance, are controlled by multiple genes with small effects and their interactions with the environment (Mickelbart et al., 2015). For practical reasons, many research groups focus on a controlled environment to grow plants and study their response to biotic and abiotic stresses (Velásquez et al., 2018). This includes stress induced by temperature, humidity, light, and other environmental factors. However, in farming, the environment and microclimate change dynamically during the day and affect the plant unevenly, for example due to shading. Moreover, controlled light conditions are hardly equivalent to the irradiance levels and spectral quality typical of natural sun conditions. There is a great need to study plant stresses in dynamic environmental conditions to thoroughly understand the complete picture of plant-stress responses. As the genotypic information is now available for hundreds or thousands of breeding lines in different species, collection and analysis of the corresponding high-throughput phenotyping data is one of the significant tasks ahead (Araus et al., 2018).

High-throughput phenotyping platforms, which employ robotics and spectral-based imaging technologies, are rapid and reliable (Galvez et al., 2019). Their main limitation is the controlled environment, which differs from natural growth conditions in the field. The introduction of hyperspectral imaging technology combined with drones and manned aircraft provides an opportunity for high-throughput in-field phenotyping of traits such as canopy temperature, chlorophyll fluorescence, and other biochemical plant characteristics (Camino et al., 2019). This technology increases the resolution and accuracy of measurements and is becoming cost-effective. The main challenge of using airborne platforms is the analysis of large quantities of data in a short time frame (Singh et al., 2016; Taghavi Namin et al., 2018). However, machine learning-based methods have shown promise in processing high-throughput phenotyping data. In-field high-throughput phenotyping is well suited to the evaluation of complex physiological traits such as abiotic stress tolerance.
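
To make the data-processing step concrete, the sketch below reduces per-pixel reflectance from an aerial image to one value per field plot. It is an illustrative assumption rather than a method taken from the studies cited above: it uses the normalized difference vegetation index (NDVI), a widely used spectral index that the text does not name, and the function names `ndvi` and `plot_means`, the toy reflectance values, and the plot label map are all hypothetical.

```python
import numpy as np


def ndvi(nir, red, eps=1e-9):
    """Normalized difference vegetation index, (NIR - red) / (NIR + red)."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    # eps guards against division by zero on dark pixels.
    return (nir - red) / (nir + red + eps)


def plot_means(index_map, plot_labels):
    """Average an index over the pixels of each plot (label 0 = background)."""
    return {
        int(plot_id): float(index_map[plot_labels == plot_id].mean())
        for plot_id in np.unique(plot_labels)
        if plot_id != 0
    }


# Toy 2 x 4 image (assumption): reflectance in the near-infrared and red bands.
nir_band = np.array([[0.6, 0.6, 0.3, 0.3],
                     [0.6, 0.6, 0.3, 0.3]])
red_band = np.array([[0.2, 0.2, 0.3, 0.3],
                     [0.2, 0.2, 0.3, 0.3]])
# Plot map: pixels belonging to plot 1, plot 2, or background (0).
plots = np.array([[1, 1, 2, 2],
                  [1, 1, 0, 2]])

means = plot_means(ndvi(nir_band, red_band), plots)
```

Summarizing each plot to a few index values shrinks the image data to a table of lines by traits, which is the form that downstream statistical or machine learning models usually expect.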

### CONCLUSION

Recent advances in genome sequencing, assembly, and annotation have allowed unprecedented access to crop plant genomic information. High-throughput phenotyping techniques have advanced significantly through the introduction of hyperspectral cameras and specialized processing software. Integration of genomic and phenomic data provides an opportunity to identify new agronomically relevant genes and characterize their functions. This knowledge has direct practical implications and can be translated into crop plant improvement using genome editing. Although genome editing is currently applied in major crops and model plants, the technique has the potential to accelerate *de novo* domestication and allow rapid improvement of orphan crop plants, targeting current and future climate challenges. The success of genomics in crop improvement is also influenced by the type of trait under investigation. For example, traits strongly affected by the environment and by the interaction between genotype and environment are more challenging to study and modify.

Disease resistance and dwarfing genes were introduced into crops such as wheat and rice during the green revolution (Khush, 2001). Breeders developed the high-yielding varieties using an extra supply of nitrogen fertilizers in the presence of sufficient water under the climate conditions of the 1950s and 1960s. The equation is different today, as climate change causes water shortages and temperature increases. However, the information gained from genomics and phenomics will drive candidate gene identification and enable genome editing (**Figure 7**), initiating the new crop plant breeding revolution.

candidate gene selection. Candidate genes can then be modified using genome editing, resulting in the generation of improved crop types.

### AUTHOR CONTRIBUTIONS

MP and AG contributed equally. All authors contributed to the article and approved the submitted version.

### FUNDING

This work is supported by the School of Agriculture and Food internal fund at the University of Melbourne to MP and the McKenzie Fellowship scheme to AG.

### REFERENCES


and transport in arbuscular mycorrhizal symbiosis. *Proc. Natl. Acad. Sci. U. S. A.* 109, 2666–2671. doi: 10.1073/pnas.1118650109


MADS-domain transcription factors in flower development. *Genome Biol.* 15:R41. doi: 10.1186/gb-2014-15-3-r41


**Conflict of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2020 Pourkheirandish, Golicz, Bhalla and Singh. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*